Arrow Research search

Author name cluster

Ronny Luss

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

16 papers
2 author rows

Possible papers (16)

ICLR Conference 2025 Conference Paper

Shedding Light on Time Series Classification using Interpretability Gated Networks

  • Yunshi Wen
  • Tengfei Ma 0001
  • Ronny Luss
  • Debarun Bhattacharjya
  • Achille Fokoue
  • A. Agung Julius

In time-series classification, interpretable models can bring additional insights but are often outperformed by deep models, since human-understandable features have limited expressivity and flexibility. In this work, we present InterpGN, a framework that integrates an interpretable model and a deep neural network. Within this framework, we introduce a novel gating function design based on the confidence of the interpretable expert, preserving interpretability for samples where interpretable features are significant while also identifying samples that require additional expertise. For the interpretable expert, we incorporate shapelets to effectively model shape-level features for time-series data. We introduce a variant of Shapelet Transforms to build logical predicates using shapelets. Our proposed model achieves comparable performance with state-of-the-art deep learning models while additionally providing interpretable classifiers for various benchmark datasets. We further show that our models improve on quantitative shapelet quality and interpretability metrics over existing shapelet-learning formulations. Finally, we show that our models can integrate additional advanced architectures and be applied to real-world tasks beyond standard benchmarks, such as the MIMIC-III and time series extrinsic regression datasets.
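
The gating idea lends itself to a compact sketch. Below is a minimal, hypothetical version (not the released InterpGN code), assuming both the interpretable expert and the deep network emit class logits; here the gate is simply the expert's own softmax confidence, so the deep model takes over only where the expert is unsure.

```python
# Minimal sketch (our illustration, not the authors' code) of a confidence-
# gated mixture of an interpretable expert and a deep network.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedMixture(nn.Module):
    def __init__(self, interpretable: nn.Module, deep: nn.Module):
        super().__init__()
        self.interpretable = interpretable  # e.g., a shapelet-based linear model
        self.deep = deep                    # e.g., a CNN or transformer head

    def forward(self, x):
        logits_i = self.interpretable(x)
        logits_d = self.deep(x)
        # Gate from the interpretable expert's own confidence: when its
        # prediction is near-uniform, defer to the deep model.
        p_i = F.softmax(logits_i, dim=-1)
        conf = p_i.max(dim=-1, keepdim=True).values   # in [1/C, 1]
        return conf * logits_i + (1.0 - conf) * logits_d, conf
```

Returning the gate value alongside the logits makes it easy to report, per sample, how much of the prediction came from the interpretable expert.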

IJCAI Conference 2024 Conference Paper

ComVas: Contextual Moral Values Alignment System

  • Inkit Padhi
  • Pierre Dognin
  • Jesus Rios
  • Ronny Luss
  • Swapnaja Achintalwar
  • Matthew Riemer
  • Miao Liu
  • Prasanna Sattigeri

In contemporary society, the integration of artificial intelligence (AI) systems into various aspects of daily life raises significant ethical concerns. One critical aspect is to ensure that AI systems align with the moral values of the end users. To that end, we introduce the Contextual Moral Value Alignment System, ComVas. Unlike traditional AI systems, in which moral values are predefined, ComVas empowers users to dynamically select and customize the desired moral values, thereby guiding the system's decision-making process. Through a user-friendly interface, individuals can specify their preferred morals, allowing the system to steer the model's responses and actions accordingly. ComVas utilizes advanced natural language processing techniques to engage users in meaningful dialogue, understand their preferences, and reason about moral dilemmas in diverse contexts. This demo article showcases the functionality of ComVas, illustrating its potential to foster ethical decision-making in AI systems while respecting individual autonomy and promoting user-centric design principles.

TMLR Journal 2024 Journal Article

To Transfer or Not to Transfer: Suppressing Concepts from Source Representations

  • Vijay Sadashivaiah
  • Keerthiram Murugesan
  • Ronny Luss
  • Pin-Yu Chen
  • Chris Sims
  • James Hendler
  • Amit Dhurandhar

With the proliferation of large pre-trained models in various domains, transfer learning has gained prominence, as intermediate representations from these models can be leveraged to train better (target) task-specific models with possibly limited labeled data. Although transfer learning can be beneficial in many applications, it can transfer undesirable information to target tasks that may severely curtail performance in the target domain or raise ethical concerns related to privacy and/or fairness. In this paper, we propose a novel approach for suppressing the transfer of user-determined semantic concepts (viz. color, glasses, etc.) in intermediate source representations to target tasks without retraining the source model, which can otherwise be expensive or even infeasible. Notably, we tackle a bigger challenge than suppression in the input data, as a given intermediate source representation is biased towards the source task, possibly further entangling the desired concepts. We evaluate our approach qualitatively and quantitatively in the visual domain, showcasing its efficacy for classification and generative source models. Finally, we provide a concept selection approach that automatically suppresses the undesirable concepts.
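
For intuition only, here is a classical linear-erasure sketch on frozen features. The paper's method is more general, but this shows the simplest form of suppressing a labeled concept in fixed representations without touching the source model; all names and the class-mean-difference direction are our choices.

```python
# Illustrative sketch: project the concept's linear direction out of frozen
# source representations. Not the paper's algorithm.
import numpy as np

def suppress_concept(Z: np.ndarray, concept: np.ndarray) -> np.ndarray:
    """Z: (n, d) source representations; concept: (n,) binary concept labels."""
    mu1 = Z[concept == 1].mean(axis=0)
    mu0 = Z[concept == 0].mean(axis=0)
    v = mu1 - mu0
    v = v / np.linalg.norm(v)          # concept direction (class-mean difference)
    return Z - np.outer(Z @ v, v)      # remove the component along v

rng = np.random.default_rng(0)
Z = rng.normal(size=(100, 16))
c = (rng.random(100) < 0.5).astype(int)
Z_clean = suppress_concept(Z, c)       # feed Z_clean to the target-task model
```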

TMLR Journal 2024 Journal Article

When Stability meets Sufficiency: Informative Explanations that do not Overwhelm

  • Ronny Luss
  • Amit Dhurandhar

Recent studies evaluating various criteria for explainable artificial intelligence (XAI) suggest that fidelity, stability, and comprehensibility are among the most important metrics considered by users of AI across a diverse collection of usage contexts. We consider these criteria as applied to feature-based attribution methods, which are amongst the most prevalent in XAI literature. Going beyond standard correlation, methods have been proposed that highlight what should be minimally sufficient to justify the classification of an input (viz. pertinent positives). While minimal sufficiency is an attractive property akin to comprehensibility, the resulting explanations are often too sparse for a human to understand and evaluate the local behavior of the model. To overcome these limitations, we incorporate the criteria of stability and fidelity and propose a novel method called Path-Sufficient Explanations Method (PSEM) that outputs a sequence of stable and sufficient explanations for a given input of strictly decreasing size (or value) -- from original input to a minimally sufficient explanation -- which can be thought of as tracing the local boundary of the model in a stable manner, thus providing better intuition about the local model behavior for the specific input. We validate these claims, both qualitatively and quantitatively, with experiments that show the benefit of PSEM across three modalities (image, tabular and text) as well as against other path explanations. A user study illustrates the strength of the method in communicating the local behavior, where (many) users are able to correctly determine the prediction made by a model.
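
The path construction can be illustrated with a toy greedy loop. This is a hedged sketch, not PSEM itself: it shrinks a feature mask one element at a time while the model's prediction stays fixed, yielding a strictly decreasing sequence of still-sufficient explanations.

```python
# Toy "sufficiency path": greedily zero out features as long as the predicted
# label is preserved, recording each intermediate explanation.
import numpy as np

def sufficiency_path(model, x, steps=5):
    """model: callable mapping a 1-D feature vector to a class label; x: (d,)."""
    target = model(x)
    mask = np.ones_like(x, dtype=bool)
    path = [x.copy()]
    for _ in range(steps):
        removable = None
        for j in np.flatnonzero(mask):      # try removing each active feature
            trial = np.where(mask, x, 0.0)
            trial[j] = 0.0
            if model(trial) == target:      # removal keeps the prediction
                removable = j
                break
        if removable is None:
            break                           # no single removal stays sufficient
        mask[removable] = False
        path.append(np.where(mask, x, 0.0))
    return path                             # strictly shrinking explanations
```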

AAAI Conference 2023 Conference Paper

Local Explanations for Reinforcement Learning

  • Ronny Luss
  • Amit Dhurandhar
  • Miao Liu

Many works in explainable AI have focused on explaining black-box classification models. Explaining deep reinforcement learning (RL) policies in a manner that could be understood by domain users has received much less attention. In this paper, we propose a novel perspective to understanding RL policies based on identifying important states from automatically learned meta-states. The key conceptual difference between our approach and many previous ones is that we form meta-states based on locality governed by the expert policy dynamics rather than based on similarity of actions, and that we do not assume any particular knowledge of the underlying topology of the state space. Theoretically, we show that our algorithm to find meta-states converges and the objective that selects important states from each meta-state is submodular leading to efficient high quality greedy selection. Experiments on four domains (four rooms, door-key, minipacman, and pong) and a carefully conducted user study illustrate that our perspective leads to better understanding of the policy. We conjecture that this is a result of our meta-states being more intuitive in that the corresponding important states are strong indicators of tractable intermediate goals that are easier for humans to interpret and follow.
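
The selection step rests on the standard greedy algorithm for monotone submodular maximization, which carries a (1 - 1/e) approximation guarantee. A generic sketch follows, with an invented coverage objective standing in for the paper's state-importance score.

```python
# Generic greedy selection for a monotone submodular objective; the meta-state
# data below is a toy example, not from the paper.
def greedy_select(candidates, score, k):
    """score(S) must be monotone submodular; greedy is (1 - 1/e)-optimal."""
    chosen = []
    for _ in range(k):
        best = max((c for c in candidates if c not in chosen),
                   key=lambda c: score(chosen + [c]) - score(chosen))
        chosen.append(best)
    return chosen

# Toy coverage objective over meta-state members:
meta_state = {0: {1, 2}, 1: {2, 3}, 2: {4}, 3: {1, 4, 5}}
covered = lambda S: len(set().union(*(meta_state[s] for s in S)) if S else set())
print(greedy_select(list(meta_state), covered, k=2))   # -> [3, 1]
```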

IJCAI Conference 2023 Conference Paper

Probabilistic Rule Induction from Event Sequences with Logical Summary Markov Models

  • Debarun Bhattacharjya
  • Oktie Hassanzadeh
  • Ronny Luss
  • Keerthiram Murugesan

Event sequences are widely available across application domains and there is a long history of models for representing and analyzing such datasets. Summary Markov models are a recent addition to the literature that help identify the subset of event types that influence event types of interest to a user. In this paper, we introduce logical summary Markov models, which are a family of models for event sequences that enable interpretable predictions through logical rules that relate historical predicates to the probability of observing an event type at an arbitrary position in the sequence. We illustrate their connection to prior parametric summary Markov models as well as probabilistic logic programs, and propose new models from this family along with efficient greedy search algorithms for learning them from data. The proposed models outperform relevant baselines on most datasets in an empirical investigation on a probabilistic prediction task. We also compare the number of influencers that various logical summary Markov models learn on real-world datasets, and conduct a brief exploratory qualitative study to gauge the promise of such symbolic models for guiding large language models in predicting societal events.
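
A toy rendering of the modeling idea, under our own simplifications: the probability of the next event being a target type depends on history only through logical predicates such as "did type u occur in the last w steps?". The fitting routine below is a smoothed frequency table, not the paper's learning algorithm.

```python
# Hedged toy: condition P(next == target) on a vector of historical predicates.
from itertools import product

def predicates(history, influencers, w):
    recent = set(history[-w:])
    return tuple(u in recent for u in influencers)

def fit_table(sequences, target, influencers, w, alpha=1.0):
    """MLE with Laplace smoothing of P(next == target | predicate vector)."""
    counts = {cfg: [alpha, alpha] for cfg in product([False, True],
                                                     repeat=len(influencers))}
    for seq in sequences:
        for t in range(1, len(seq)):
            cfg = predicates(seq[:t], influencers, w)
            counts[cfg][seq[t] == target] += 1   # True indexes the "hit" slot
    return {cfg: hit / (miss + hit) for cfg, (miss, hit) in counts.items()}

table = fit_table([list("ababcabcb")], target="b", influencers=("a", "c"), w=2)
```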

ICLR Conference 2023 Conference Paper

Weighted Clock Logic Point Process

  • Ruixuan Yan
  • Yunshi Wen
  • Debarun Bhattacharjya
  • Ronny Luss
  • Tengfei Ma 0001
  • Achille Fokoue
  • A. Agung Julius

Datasets involving multivariate event streams are prevalent in numerous applications. We present a novel framework for modeling temporal point processes called clock logic neural networks (CLNN) which learn weighted clock logic (wCL) formulas as interpretable temporal rules by which some events promote or inhibit other events. Specifically, CLNN models temporal relations between events using conditional intensity rates informed by a set of wCL formulas, which are more expressive than related prior work. Unlike conventional approaches of searching for generative rules through expensive combinatorial optimization, we design smooth activation functions for components of wCL formulas that enable a continuous relaxation of the discrete search space and efficient learning of wCL formulas using gradient-based methods. Experiments on synthetic datasets demonstrate our model's ability to recover the ground-truth rules and improve computational efficiency. In addition, experiments on real-world datasets show that our models perform competitively when compared with state-of-the-art models.
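
The continuous-relaxation trick can be shown in a few lines. This is our toy, not the CLNN code: a hard clock predicate 1[0 <= t - t_A <= delta] is replaced by a product of sigmoids so the clock bound delta becomes learnable by gradient descent.

```python
# Smooth stand-in for a hard temporal window predicate.
import torch

def soft_within(t, t_event, delta, temperature=0.1):
    """Smooth version of 1[0 <= t - t_event <= delta]."""
    age = t - t_event
    return torch.sigmoid(age / temperature) * \
           torch.sigmoid((delta - age) / temperature)

delta = torch.tensor(2.0, requires_grad=True)
val = soft_within(torch.tensor(5.0), torch.tensor(4.0), delta)
val.backward()                       # gradient flows into the clock bound delta
print(val.item(), delta.grad.item())
```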

ICLR Conference 2022 Conference Paper

Auto-Transfer: Learning to Route Transferable Representations

  • Keerthiram Murugesan
  • Vijay Sadashivaiah
  • Ronny Luss
  • Karthikeyan Shanmugam 0001
  • Pin-Yu Chen
  • Amit Dhurandhar

Knowledge transfer between heterogeneous source and target networks and tasks has received a lot of attention in recent times, as large amounts of quality labeled data can be difficult to obtain in many applications. Existing approaches typically constrain the target deep neural network (DNN) feature representations to be close to the source DNN's feature representations, which can be limiting. In this paper, we propose a novel adversarial multi-armed bandit approach that automatically learns to route source representations to appropriate target representations, following which they are combined in meaningful ways to produce accurate target models. We see upwards of 5% accuracy improvements compared with the state-of-the-art knowledge transfer methods on four benchmark (target) image datasets, CUB200, Stanford Dogs, MIT67, and Stanford40, where the source dataset is ImageNet. We qualitatively analyze the goodness of our transfer scheme by showing individual examples of the important features focused on by our target network at different layers compared with the (closest) competitors. We also observe that our improvement over other methods is higher for smaller target datasets, making it an effective tool for small data applications that may benefit from transfer learning.
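
A hedged sketch of the routing machinery: treat each candidate source-to-target connection as a bandit arm and update with EXP3 against a task-driven reward. The reward function below is a stand-in, and the paper's combination step is richer than a single arm choice.

```python
# EXP3 over candidate routings (source layer -> target layer), reward = any
# task-driven signal such as the change in target validation accuracy.
import numpy as np

def exp3(n_arms, reward_fn, rounds=200, gamma=0.1, seed=0):
    rng = np.random.default_rng(seed)
    w = np.ones(n_arms)
    for _ in range(rounds):
        p = (1 - gamma) * w / w.sum() + gamma / n_arms
        arm = rng.choice(n_arms, p=p)
        r = reward_fn(arm)                 # reward in [0, 1]
        w[arm] *= np.exp(gamma * r / (n_arms * p[arm]))  # importance-weighted
    return w / w.sum()

# Toy reward: routing choice 2 is best on average.
probs = exp3(4, lambda a: float(np.random.default_rng().random()
                                < [.2, .4, .8, .3][a]))
print(probs)
```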

JMLR Journal 2020 Journal Article

AI Explainability 360: An Extensible Toolkit for Understanding Data and Machine Learning Models

  • Vijay Arya
  • Rachel K. E. Bellamy
  • Pin-Yu Chen
  • Amit Dhurandhar
  • Michael Hind
  • Samuel C. Hoffman
  • Stephanie Houde
  • Q. Vera Liao

As artificial intelligence algorithms make further inroads in high-stakes societal applications, there are increasing calls from multiple stakeholders for these algorithms to explain their outputs. To make matters more challenging, different personas of consumers of explanations have different requirements for explanations. Toward addressing these needs, we introduce AI Explainability 360, an open-source Python toolkit featuring ten diverse and state-of-the-art explainability methods and two evaluation metrics. Equally important, we provide a taxonomy to help entities requiring explanations to navigate the space of interpretation and explanation methods, not only those in the toolkit but also in the broader literature on explainability. For data scientists and other users of the toolkit, we have implemented an extensible software architecture that organizes methods according to their place in the AI modeling pipeline. The toolkit is not only the software, but also guidance material, tutorials, and an interactive web demo to introduce AI explainability to different audiences. Together, our toolkit and taxonomy can help identify gaps where more explainability methods are needed and provide a platform to incorporate them as they are developed.

ICML Conference 2020 Conference Paper

Enhancing Simple Models by Exploiting What They Already Know

  • Amit Dhurandhar
  • Karthikeyan Shanmugam 0001
  • Ronny Luss

There has been recent interest in improving performance of simple models for multiple reasons, such as interpretability, robust learning from small data, deployment in memory-constrained settings, as well as environmental considerations. In this paper, we propose a novel method SRatio that can utilize information from high-performing complex models (viz. deep neural networks, boosted trees, random forests) to reweight a training dataset for a potentially low-performing simple model of much lower complexity, such as a decision tree or a shallow network, enhancing its performance. Our method also leverages the simple model's per-sample hardness estimates, which prior works do not: they primarily consider the complex model's confidences/predictions. Our approach is thus conceptually novel. Moreover, we generalize and formalize the concept of attaching probes to intermediate layers of a neural network to other commonly used classifiers and incorporate this into our method. The benefit of these contributions is witnessed in the experiments, where on 6 UCI datasets and CIFAR-10 we outperform competitors in a majority (16 out of 27) of the cases and tie for best performance in the remaining cases. In fact, in a couple of cases, we even approach the complex model's performance. We also conduct further experiments to validate assertions and intuitively understand why our method works. Theoretically, we motivate our approach by showing that the weighted loss minimized by simple models using our weighting upper bounds the loss of the complex model.
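
The name suggests a ratio-style weight; the sketch below is our reading, not the authors' exact formula: samples the complex model finds easy but the simple model finds hard get up-weighted in the simple model's training loss.

```python
# Hypothetical ratio weighting: complex-model confidence over simple-model
# confidence in the true label, clipped for stability.
import numpy as np

def sratio_weights(p_complex, p_simple, y, eps=1e-6, w_max=10.0):
    """p_*: (n, C) predicted class probabilities; y: (n,) true labels 0..C-1."""
    idx = np.arange(len(y))
    ratio = p_complex[idx, y] / np.clip(p_simple[idx, y], eps, None)
    return np.clip(ratio, 0.0, w_max)

# The simple model is then retrained with a weighted loss, e.g.
#   sum_i w_i * cross_entropy(simple(x_i), y_i)
```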

ICML Conference 2019 Conference Paper

Beyond Backprop: Online Alternating Minimization with Auxiliary Variables

  • Anna Choromanska
  • Benjamin Cowen
  • Sadhana Kumaravel
  • Ronny Luss
  • Mattia Rigotti
  • Irina Rish
  • Paolo Diachille
  • Viatcheslav Gurev

Despite significant recent advances in deep neural networks, training them remains a challenge due to the highly non-convex nature of the objective function. State-of-the-art methods rely on error backpropagation, which suffers from several well-known issues, such as vanishing and exploding gradients, inability to handle non-differentiable nonlinearities and to parallelize weight-updates across layers, and biological implausibility. These limitations continue to motivate exploration of alternative training algorithms, including several recently proposed auxiliary-variable methods which break the complex nested objective function into local subproblems. However, those techniques are mainly offline (batch), which limits their applicability to extremely large datasets, as well as to online, continual or reinforcement learning. The main contribution of our work is a novel online (stochastic/mini-batch) alternating minimization (AM) approach for training deep neural networks, together with the first theoretical convergence guarantees for AM in stochastic settings and promising empirical results on a variety of architectures and datasets.
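
A toy online AM step under a quadratic-penalty formulation (an illustration, not the paper's exact algorithm): an auxiliary pre-activation a, tied to W1 x by a penalty, breaks the nested objective so each block is updated by a purely local problem, with no end-to-end backpropagation.

```python
# One-hidden-layer regression trained by online alternating minimization over
# the auxiliary activations and the two weight matrices (sketch only).
import torch

torch.manual_seed(0)
d, h, n, lam = 8, 16, 32, 1.0
W1 = torch.randn(h, d, requires_grad=True)
W2 = torch.randn(1, h, requires_grad=True)

for step in range(100):                    # each iteration sees a fresh mini-batch
    x = torch.randn(n, d)
    y = x.sum(dim=1, keepdim=True)
    a = (x @ W1.T).detach().requires_grad_(True)   # init aux variable at W1 x

    # Block 1: update the auxiliary activations (a local, layer-wise problem).
    for _ in range(5):
        loss_a = ((torch.relu(a) @ W2.T - y) ** 2).mean() \
                 + lam * ((a - x @ W1.T.detach()) ** 2).mean()
        g, = torch.autograd.grad(loss_a, a)
        a = (a - 0.1 * g).detach().requires_grad_(True)

    # Block 2: update each weight matrix against its own local target.
    loss_w2 = ((torch.relu(a.detach()) @ W2.T - y) ** 2).mean()
    loss_w1 = ((x @ W1.T - a.detach()) ** 2).mean()
    for W, loss in ((W2, loss_w2), (W1, loss_w1)):
        g, = torch.autograd.grad(loss, W)
        with torch.no_grad():
            W -= 0.05 * g
```

Each block only ever differentiates its own local objective, which is what makes the scheme parallelizable across layers and usable in streaming settings.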

NeurIPS Conference 2018 Conference Paper

Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives

  • Amit Dhurandhar
  • Pin-Yu Chen
  • Ronny Luss
  • Chun-Chen Tu
  • Paishun Ting
  • Karthikeyan Shanmugam
  • Payel Das

In this paper we propose a novel method that provides contrastive explanations justifying the classification of an input by a black box classifier such as a deep neural network. Given an input, we find what should be minimally and sufficiently present (viz. important object pixels in an image) to justify its classification and analogously what should be minimally and necessarily absent (viz. certain background pixels). We argue that such explanations are natural for humans and are used commonly in domains such as health care and criminology. What is minimally but critically absent is an important part of an explanation, which to the best of our knowledge, has not been explicitly identified by current explanation methods that explain predictions of neural networks. We validate our approach on three real datasets obtained from diverse domains; namely, the handwritten digits dataset MNIST, a large procurement fraud dataset and a brain activity strength dataset. In all three cases, we witness the power of our approach in generating precise explanations that are also easy for human experts to understand and evaluate.
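
The pertinent-negative search reduces to a small optimization; the sketch below keeps only a core hinge-plus-sparsity objective and omits terms from the paper's full formulation (e.g., the autoencoder regularizer), so it is a condensed illustration rather than the method itself.

```python
# Find a small additive delta whose presence flips the prediction away from
# the original class y (a pertinent-negative-style search).
import torch

def pertinent_negative(model, x, y, kappa=0.1, beta=0.05, steps=300, lr=0.01):
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        logits = model(x + delta)
        other = logits.clone()
        other[..., y] = -float("inf")        # mask out the original class
        # Hinge: the best non-y class should beat class y by margin kappa.
        attack = torch.clamp(logits[..., y] - other.max(dim=-1).values + kappa,
                             min=0)
        # Elastic-net-style penalty keeps the perturbation small and sparse.
        loss = attack.sum() + beta * delta.abs().sum() + delta.pow(2).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return delta.detach()
```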

NeurIPS Conference 2018 Conference Paper

Improving Simple Models with Confidence Profiles

  • Amit Dhurandhar
  • Karthikeyan Shanmugam
  • Ronny Luss
  • Peder Olsen

In this paper, we propose a new method called ProfWeight for transferring information from a pre-trained deep neural network that has a high test accuracy to a simpler interpretable model or a very shallow network of low complexity and a priori low test accuracy. We are motivated by applications in interpretability and model deployment in severely memory constrained environments (like sensors). Our method uses linear probes to generate confidence scores through flattened intermediate representations. Our transfer method involves a theoretically justified weighting of samples during the training of the simple model using confidence scores of these intermediate layers. The value of our method is first demonstrated on CIFAR-10, where our weighting method significantly improves (3-4%) networks with only a fraction of the number of Resnet blocks of a complex Resnet model. We further demonstrate operationally significant results on a real manufacturing problem, where we dramatically increase the test accuracy of a CART model (the domain standard) by roughly 13%.
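
The probe-and-weight loop is easy to sketch. Hedged: the actual ProfWeight procedure involves margins and layer selection; in this simplified version each frozen intermediate representation gets a linear probe, and a sample's weight is the probes' average confidence in its true label (labels assumed to be 0..C-1).

```python
# Simplified probe-confidence weighting inspired by the description above.
import numpy as np
from sklearn.linear_model import LogisticRegression

def probe_weights(reps_per_layer, y):
    """reps_per_layer: list of (n, d_l) arrays from the complex net's layers."""
    confs = []
    for Z in reps_per_layer:
        probe = LogisticRegression(max_iter=1000).fit(Z, y)   # linear probe
        p = probe.predict_proba(Z)
        confs.append(p[np.arange(len(y)), y])   # confidence in the true label
    return np.mean(confs, axis=0)               # per-sample training weights

# The simple model is then retrained with these weights, e.g.
#   DecisionTreeClassifier().fit(X, y, sample_weight=w)
```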

UAI Conference 2016 Conference Paper

Interpretable Policies for Dynamic Product Recommendations

  • Marek Petrik
  • Ronny Luss

In many applications, it may be better to compute a good interpretable policy instead of a complex optimal one. For example, a recommendation engine might perform better when accounting for user profiles, but in the absence of such loyalty data, assumptions would have to be made that increase the complexity of the recommendation policy. A simple greedy recommendation could be implemented based on aggregated user data, but another simple policy can improve on this by accounting for the fact that users come from different segments of a population. In this paper, we study the problem of computing an optimal policy that is interpretable. In particular, we consider a policy to be interpretable if the decisions (e.g., recommendations) depend only on a small number of simple state attributes (e.g., the currently viewed product). This novel model is a general Markov decision problem with action constraints over states. We show that this problem is NP-hard and develop a Mixed Integer Linear Programming formulation that gives an exact solution when policies are restricted to being deterministic. We demonstrate the effectiveness of the approach on a real-world business case for a European tour operator's recommendation engine.
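
The interpretability constraint (actions may depend only on a small observed attribute o(s), not on the full state) is easy to see in a brute-force toy; the paper solves this at scale with a MILP, while exhaustive search over rules only works for tiny instances.

```python
# Brute-force search for the best deterministic policy that factors through an
# observed attribute, evaluated by exact policy evaluation (toy sizes only).
import numpy as np
from itertools import product

def value(P, R, pi, gamma=0.9):
    """P: (S, A, S) transitions, R: (S, A) rewards, pi: (S,) deterministic policy."""
    S = R.shape[0]
    Ppi = P[np.arange(S), pi]                      # (S, S) under the policy
    Rpi = R[np.arange(S), pi]
    # Uniform start-state distribution: average the state values.
    return np.linalg.solve(np.eye(S) - gamma * Ppi, Rpi).mean()

def best_interpretable(P, R, obs, n_actions):
    """obs[s] maps each state to a small attribute; policy must factor through it."""
    n_obs = max(obs) + 1
    best = None
    for rule in product(range(n_actions), repeat=n_obs):   # one action per attribute
        pi = np.array([rule[obs[s]] for s in range(len(obs))])
        v = value(P, R, pi)
        if best is None or v > best[0]:
            best = (v, rule)
    return best
```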

NeurIPS Conference 2010 Conference Paper

Decomposing Isotonic Regression for Efficiently Solving Large Problems

  • Ronny Luss
  • Saharon Rosset
  • Moni Shahar

A new algorithm for isotonic regression is presented based on recursively partitioning the solution space. We develop efficient methods for each partitioning subproblem through an equivalent representation as a network flow problem, and prove that this sequence of partitions converges to the global solution. These network flow problems can further be decomposed in order to solve very large problems. Success of isotonic regression in prediction and our algorithm's favorable computational properties are demonstrated through simulated examples as large as 2x10^5 variables and 10^7 constraints.
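
For intuition, the classic pool-adjacent-violators algorithm solves the chain-ordered special case in linear time; the paper's contribution is the recursive partitioning scheme that scales to much larger, general partial orders. A self-contained PAVA:

```python
# Pool-adjacent-violators for isotonic regression on a chain order.
import numpy as np

def pava(y, w=None):
    """Weighted least-squares fit under y_1 <= y_2 <= ... <= y_n."""
    w = np.ones_like(y, dtype=float) if w is None else w.astype(float)
    vals, wts, sizes = [], [], []
    for yi, wi in zip(y, w):
        vals.append(yi); wts.append(wi); sizes.append(1)
        # Pool adjacent blocks while the monotonicity constraint is violated.
        while len(vals) > 1 and vals[-2] > vals[-1]:
            v = (wts[-2] * vals[-2] + wts[-1] * vals[-1]) / (wts[-2] + wts[-1])
            wts[-2] += wts[-1]; sizes[-2] += sizes[-1]
            vals[-2] = v
            vals.pop(); wts.pop(); sizes.pop()
    return np.repeat(vals, sizes)

print(pava(np.array([1.0, 3.0, 2.0, 4.0])))   # -> [1.  2.5 2.5 4. ]
```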

NeurIPS Conference 2007 Conference Paper

Support Vector Machine Classification with Indefinite Kernels

  • Ronny Luss
  • Alexandre d'Aspremont

In this paper, we propose a method for support vector machine classification using indefinite kernels. Instead of directly minimizing or stabilizing a nonconvex loss function, our method simultaneously finds the support vectors and a proxy kernel matrix used in computing the loss. This can be interpreted as a robust classification problem where the indefinite kernel matrix is treated as a noisy observation of the true positive semidefinite kernel. Our formulation keeps the problem convex and relatively large problems can be solved efficiently using the analytic center cutting plane method. We compare the performance of our technique with other methods on several data sets.
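
A simple baseline makes the "proxy kernel" idea concrete, though it is not the paper's joint method (which finds the support vectors and the proxy kernel together): clip the indefinite kernel's negative eigenvalues to obtain the nearest PSD matrix, then train a standard SVM on it.

```python
# Spectrum-clip baseline for indefinite kernels, then a precomputed-kernel SVM.
import numpy as np
from sklearn.svm import SVC

def clip_to_psd(K):
    K = (K + K.T) / 2                          # symmetrize
    vals, vecs = np.linalg.eigh(K)
    return vecs @ np.diag(np.clip(vals, 0, None)) @ vecs.T

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
y = (X[:, 0] > 0).astype(int)
K = np.tanh(X @ X.T + 0.5)                     # sigmoid kernel: often indefinite
clf = SVC(kernel="precomputed").fit(clip_to_psd(K), y)
print(clf.score(clip_to_psd(K), y))
```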