Arrow Research search

Author name cluster

Ira Assent

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers
2 author rows

Possible papers (7)

ICLR 2025 Conference Paper

FairDen: Fair Density-Based Clustering

  • Lena Krieger 0001
  • Anna Beer 0001
  • Pernille Matthews
  • Anneka Myrup Thiesson
  • Ira Assent

Fairness in data mining tasks like clustering has recently become increasingly important. However, few clustering algorithms exist that focus on fair groupings of data with sensitive attributes. Including fairness in the clustering objective is especially hard for density-based clustering, as it does not directly optimize a closed-form objective the way centroid-based or spectral methods do. This paper introduces FairDen, the first fair, density-based clustering algorithm. We capture the dataset's density-connectivity structure in a similarity matrix that we manipulate to encourage a balanced clustering. In contrast to the state of the art, FairDen inherently handles categorical attributes, noise, and data with several sensitive attributes or groups. We show in extensive experiments that FairDen finds meaningful and fair clusters.
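The "balanced clustering" the abstract targets can be made concrete with the standard balance measure from the fair-clustering literature: a cluster is balanced when no sensitive group dominates it. Below is a minimal sketch that scores a density-based clustering by this measure; it is not FairDen itself (which encodes fairness in the density-connectivity similarity matrix), and the DBSCAN baseline and binary sensitive attribute are assumptions for illustration.

```python
# Minimal sketch: the "balance" fairness measure (Chierichetti et al.)
# applied to a density-based clustering. NOT the FairDen algorithm --
# FairDen manipulates the density-connectivity similarity matrix -- this
# only illustrates the balance objective the paper targets.
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_balance(labels, sensitive):
    """Balance for a binary sensitive attribute: min over clusters of
    min(#group0/#group1, #group1/#group0). Noise (-1) is ignored,
    mirroring how density-based methods treat noise points."""
    balances = []
    for c in np.unique(labels):
        if c == -1:
            continue
        groups = sensitive[labels == c]
        n0, n1 = np.sum(groups == 0), np.sum(groups == 1)
        if n0 == 0 or n1 == 0:
            return 0.0  # a single-group cluster is maximally unfair
        balances.append(min(n0 / n1, n1 / n0))
    return min(balances) if balances else 0.0

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
sensitive = rng.integers(0, 2, size=200)   # hypothetical binary attribute
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)
print("balance:", cluster_balance(labels, sensitive))
```

A perfectly balanced clustering scores 1.0; any cluster containing only one group drives the score to 0.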

AAAI 2025 Conference Paper

InteDisUX: Interpretation-Guided Discriminative User-Centric Explanation for Time Series

  • Viet-Hung Tran
  • Zichi Zhang
  • Tuan Dung Pham
  • Ngoc Phu Doan
  • Anh-Tuan Hoang
  • Peixin Li
  • Hans Vandierendonck
  • Ira Assent

Explaining deep learning models on time series classification (TSC) tasks is an important and challenging problem. Most existing approaches use attribution maps to explain outcomes, but these have limitations in generating explanations that align well with human perception. Recently, LIME-based approaches have provided more meaningful explanations by segmenting the data; however, they still suffer from how segments are generated and evaluated. In this paper, we propose a novel time series explanation approach called InteDisUX to overcome these problems. Our technique uses the segment-level integrated gradient (SIG) to compute importance scores for an initial set of small, equal-sized segments and then iteratively merges consecutive segments to create better explanations under a greedy strategy guided by two newly proposed metrics, discrimination gain and faithfulness gain. In this way, our method does not depend on predefined segments and remains robust to the instability, poor local fidelity, and data imbalance that affect LIME-based methods. Furthermore, InteDisUX is the first work to use the model's information to improve the set of segments for time series explanation. Extensive experiments show that our method outperforms LIME-based ones on 12 datasets in terms of faithfulness and on 8 of 12 datasets in terms of robustness.
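The greedy merging loop is easy to picture. A minimal sketch follows, with assumptions labeled: per-timestep attributions stand in for the paper's segment-level integrated gradients, and the merge criterion (merge the adjacent pair with the most similar mean attribution) is a simple stand-in for the paper's discrimination and faithfulness gains.

```python
# Minimal sketch of InteDisUX-style greedy segment merging.
# Placeholder criterion: adjacent segments with the most similar mean
# attribution are merged first (the paper instead scores merges with
# discrimination/faithfulness gains from segment-level integrated gradients).
import numpy as np

def greedy_merge(attr, n_init=16, n_final=4):
    T = len(attr)
    bounds = list(np.linspace(0, T, n_init + 1, dtype=int))  # segment edges
    while len(bounds) - 1 > n_final:
        means = [attr[bounds[i]:bounds[i + 1]].mean()
                 for i in range(len(bounds) - 1)]
        gaps = [abs(means[i] - means[i + 1]) for i in range(len(means) - 1)]
        i = int(np.argmin(gaps))      # most similar adjacent pair
        del bounds[i + 1]             # merge by deleting the shared edge
    return list(zip(bounds[:-1], bounds[1:]))

attr = np.abs(np.random.default_rng(1).normal(size=128))  # fake attributions
print(greedy_merge(attr))  # [(start, end), ...] of the merged segments
```

Starting from equal segments and merging greedily avoids any dependence on predefined segmentation, which is the structural point the abstract makes.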

NeurIPS 2025 Conference Paper

LeapFactual: Reliable Visual Counterfactual Explanation Using Conditional Flow Matching

  • Zhuo Cao
  • Xuan Zhao
  • Lena Krieger
  • Hanno Scharr
  • Ira Assent

The growing integration of machine learning (ML) and artificial intelligence (AI) models into high-stakes domains such as healthcare and scientific research calls for models that are not only accurate but also interpretable. Among existing explainability methods, counterfactual explanations offer interpretability by identifying minimal changes to inputs that would alter a model's prediction, thus providing deeper insights. However, current counterfactual generation methods suffer from critical limitations, including gradient vanishing, discontinuous latent spaces, and an overreliance on the alignment between learned and true decision boundaries. To overcome these limitations, we propose LeapFactual, a novel counterfactual explanation algorithm based on conditional flow matching. LeapFactual generates reliable and informative counterfactuals, even when true and learned decision boundaries diverge. LeapFactual is not limited to models with differentiable loss functions. It can even handle human-in-the-loop systems, expanding the scope of counterfactual explanations to domains that require the participation of human annotators, such as citizen science. We provide extensive experiments on benchmark and real-world datasets highlighting that LeapFactual generates accurate and in-distribution counterfactual explanations that offer actionable insights. We observe, for instance, that our reliable counterfactual samples, whose labels align with the ground truth, can be beneficially used as new training data to enhance the model. The proposed method is broadly applicable and enhances both scientific knowledge discovery and non-expert interpretability.
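Conditional flow matching, the base technique named in the abstract, reduces to a simple regression: sample a time t, interpolate between a noise sample and a data sample on a straight path, and train a velocity network to predict the constant displacement. A minimal PyTorch sketch of one such training step is below; the class-conditioning via an embedding is an assumption, and LeapFactual's counterfactual-specific machinery is omitted entirely.

```python
# Minimal sketch of a conditional flow matching (CFM) training step.
# The velocity net v(x_t, t, y) regresses x1 - x0 along the straight path
# x_t = (1 - t) * x0 + t * x1, with class label y as the condition.
# This illustrates the objective LeapFactual builds on, not LeapFactual.
import torch
import torch.nn as nn

class VelocityNet(nn.Module):
    def __init__(self, dim=2, n_classes=10, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(n_classes, hidden)
        self.net = nn.Sequential(
            nn.Linear(dim + 1 + hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, dim))

    def forward(self, x, t, y):
        return self.net(torch.cat([x, t, self.embed(y)], dim=-1))

model = VelocityNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x1 = torch.randn(64, 2)                  # stand-in "data" batch
y = torch.randint(0, 10, (64,))          # condition: class labels
x0 = torch.randn_like(x1)                # noise source distribution
t = torch.rand(64, 1)                    # random time in [0, 1]
xt = (1 - t) * x0 + t * x1               # point on the straight path
loss = ((model(xt, t, y) - (x1 - x0)) ** 2).mean()
opt.zero_grad(); loss.backward(); opt.step()
print("CFM loss:", loss.item())
```

Because training is pure regression on sampled paths, no gradients ever flow through the classifier being explained, which is consistent with the abstract's claim that the method is not limited to differentiable losses.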

NeurIPS 2025 Conference Paper

MIX: A Multi-view Time-Frequency Interactive Explanation Framework for Time Series Classification

  • Viet-Hung Tran
  • Ngoc Phu Doan
  • Zichi Zhang
  • Tuan Pham
  • Phi Hung Nguyen
  • Xuan Nguyen
  • Hans Vandierendonck
  • Ira Assent

Deep learning models for time series classification (TSC) have achieved impressive performance, but explaining their decisions remains a significant challenge. Existing post-hoc explanation methods typically operate solely in the time domain and from a single-view perspective, limiting both faithfulness and robustness. In this work, we propose MIX (Multi-view Time-Frequency Interactive EXplanation Framework), a novel framework that explains deep learning models in a multi-view setting by leveraging multi-resolution, time-frequency views constructed using the Haar Discrete Wavelet Transform (DWT). MIX introduces an interactive cross-view refinement scheme, where explanation information from one view is propagated across views to enhance overall interpretability. To align with user-preferred perspectives, we propose a greedy selection strategy that traverses the multi-view space to identify the most informative features. Additionally, we present OSIGV, a user-aligned segment-level attribution mechanism based on overlapping windows for each view, and introduce keystone-first IG, a method that refines explanations in each view using additional information from another view. Extensive experiments across multiple TSC benchmarks and model architectures demonstrate that MIX significantly outperforms state-of-the-art (SOTA) methods in terms of explanation faithfulness and robustness.
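The multi-resolution views come from the standard Haar DWT: each level splits the signal into pairwise (scaled) averages, the coarser time view, and pairwise differences, the frequency detail view. Below is a minimal sketch of constructing such views in plain NumPy, assuming a power-of-two length; MIX's cross-view refinement and OSIGV attribution operate on top of views like these.

```python
# Minimal sketch: multi-resolution time-frequency views via the Haar DWT.
# Each level halves the approximation with scaled pairwise sums and keeps
# scaled pairwise differences as the detail (frequency) view.
import numpy as np

def haar_views(x, levels=3):
    """Return [(approximation, detail), ...] for each decomposition level."""
    views, approx = [], np.asarray(x, dtype=float)
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        approx = (even + odd) / np.sqrt(2)   # low-pass: coarse time view
        detail = (even - odd) / np.sqrt(2)   # high-pass: frequency view
        views.append((approx, detail))
    return views

t = np.linspace(0, 1, 64, endpoint=False)
x = np.sin(2 * np.pi * 4 * t) + 0.5 * np.sin(2 * np.pi * 16 * t)
for level, (a, d) in enumerate(haar_views(x), start=1):
    print(f"level {level}: approx len {len(a)}, detail len {len(d)}")
```

Each level is half the length of the previous one, which is exactly what makes the views "multi-resolution": coarse trends and fine oscillations end up in separate, individually attributable signals.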

TMLR 2025 Journal Article

Random Erasing vs. Model Inversion: A Promising Defense or a False Hope?

  • Viet-Hung Tran
  • Ngoc-Bao Nguyen
  • Son T. Mai
  • Hans Vandierendonck
  • Ira Assent
  • Alex Kot
  • Ngai-Man Cheung

Model Inversion (MI) attacks pose a significant privacy threat by reconstructing private training data from machine learning models. While existing defenses primarily concentrate on model-centric approaches, the impact of data on MI robustness remains largely unexplored. In this work, we explore Random Erasing (RE), a technique traditionally used to improve model generalization under occlusion, and uncover its surprising effectiveness as a defense against MI attacks. Specifically, our novel feature-space analysis shows that models trained with RE images introduce a significant discrepancy between the features of MI-reconstructed images and those of the private data. At the same time, features of private images remain distinct from other classes and well separated from different classification regions. These effects collectively degrade MI reconstruction quality and attack accuracy while maintaining reasonable natural accuracy. Furthermore, we explore two critical properties of RE: Partial Erasure and Random Location. First, Partial Erasure prevents the model from observing entire objects during training, and we find that this significantly impairs MI, which aims to reconstruct entire objects. Second, the Random Location of the erasure plays a crucial role in achieving a strong privacy-utility trade-off. Our findings highlight RE as a simple yet effective defense mechanism that can be easily integrated with existing privacy-preserving techniques. Extensive experiments across 37 setups demonstrate that our method achieves SOTA performance in the privacy-utility trade-off. The results consistently demonstrate the superiority of our defense over existing defenses across different MI attacks, network architectures, and attack configurations. For the first time, we achieve a significant degradation in attack accuracy without a decrease in utility for some configurations. Our code and additional results are available at: https://ngoc-nguyen-0.github.io/MIDRE/
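Random Erasing itself is only a few lines: pick a randomly located rectangle (Random Location) and overwrite it with noise so the model never sees the whole object (Partial Erasure). A minimal NumPy sketch of the augmentation the paper studies is below; the scale and aspect-ratio ranges follow the common defaults of the original RE augmentation, and torchvision ships an equivalent transforms.RandomErasing.

```python
# Minimal sketch of Random Erasing (Zhong et al.): occlude a randomly
# located rectangle with random values. Partial Erasure hides part of the
# object; Random Location varies what is hidden on every application.
import numpy as np

def random_erase(img, scale=(0.02, 0.33), rng=np.random.default_rng()):
    h, w = img.shape[:2]
    area = rng.uniform(*scale) * h * w        # fraction of image to erase
    aspect = rng.uniform(0.3, 3.3)            # aspect ratio of the patch
    eh = min(int(round(np.sqrt(area * aspect))), h)
    ew = min(int(round(np.sqrt(area / aspect))), w)
    top = rng.integers(0, h - eh + 1)         # Random Location
    left = rng.integers(0, w - ew + 1)
    out = img.copy()
    out[top:top + eh, left:left + ew] = rng.uniform(0, 1, (eh, ew) + img.shape[2:])
    return out

img = np.zeros((32, 32, 3))
print("erased pixels:", int((random_erase(img) != 0).any(axis=-1).sum()))
```

As a data-side defense this composes trivially with model-centric ones: it changes only the training batches, not the architecture or the loss.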

NeurIPS 2025 Conference Paper

Ultrametric Cluster Hierarchies: I Want ’em All!

  • Andrew Draganov
  • Pascal Weber
  • Rasmus Jørgensen
  • Anna Beer
  • Claudia Plant
  • Ira Assent

Hierarchical clustering is a powerful tool for exploratory data analysis, organizing data into a tree of clusterings from which a partition can be chosen. This paper generalizes these ideas by proving that, for any reasonable hierarchy, one can optimally solve any center-based clustering objective over it (such as $k$-means). Moreover, these solutions can be found exceedingly quickly and are *themselves* necessarily hierarchical. Thus, given a cluster tree, we show that one can quickly access a plethora of new, equally meaningful hierarchies. Just as in standard hierarchical clustering, one can then choose any desired partition from these new hierarchies. We conclude by verifying the utility of our proposed techniques across datasets, hierarchies, and partitioning schemes.
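The core task, choosing a center-based partition from a given cluster tree, can be prototyped naively: cut a SciPy dendrogram at every candidate k and score each cut with the k-means cost. The sketch below does exactly that and is only an illustration of the problem setting; the paper's contribution is solving it optimally and exceedingly quickly over the tree, not this brute-force cut-and-score loop.

```python
# Minimal sketch: scoring partitions drawn from a cluster hierarchy with a
# center-based (k-means) objective. The paper solves this optimally and
# fast over the tree; this brute-force version only frames the problem.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def kmeans_cost(X, labels):
    """Sum of squared distances of points to their cluster's mean."""
    return sum(((X[labels == c] - X[labels == c].mean(axis=0)) ** 2).sum()
               for c in np.unique(labels))

X = np.random.default_rng(2).normal(size=(100, 2))
Z = linkage(X, method="ward")          # one "reasonable" hierarchy
for k in (2, 3, 5):
    labels = fcluster(Z, t=k, criterion="maxclust")
    print(f"k={k}: k-means cost of the tree cut = {kmeans_cost(X, labels):.1f}")
```

The result promised by the paper is stronger than what this loop explores: the optimal solutions across all k are themselves nested, so they again form a hierarchy one can browse.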

IJCAI 2023 Conference Paper

ActUp: Analyzing and Consolidating tSNE and UMAP

  • Andrew Draganov
  • Jakob Jørgensen
  • Katrine Scheel
  • Davide Mottin
  • Ira Assent
  • Tyrus Berry
  • Cigdem Aslay

TSNE and UMAP are popular dimensionality reduction algorithms due to their speed and interpretable low-dimensional embeddings. Despite their popularity, however, little work has been done to study their full span of differences. We theoretically and experimentally evaluate the space of parameters in the TSNE and UMAP algorithms and observe that a single one -- the normalization -- is responsible for switching between them. This, in turn, implies that a majority of the algorithmic differences can be toggled without affecting the embeddings. We discuss the implications this has on several theoretic claims behind UMAP, as well as how to reconcile them with existing TSNE interpretations. Based on our analysis, we provide a method (GDR) that combines previously incompatible techniques from TSNE and UMAP and can replicate the results of either algorithm. This allows our method to incorporate further improvements, such as an acceleration that obtains either method's outputs faster than UMAP. We release improved versions of TSNE, UMAP, and GDR that are fully plug-and-play with the traditional libraries.
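The single switch the abstract identifies, normalization, is visible directly in the two losses: tSNE normalizes the pairwise similarities into probability distributions and minimizes a KL divergence, while UMAP scores each pair independently with an unnormalized binary cross-entropy. A minimal sketch of the two objectives over the same low-dimensional similarities is below; the attraction/repulsion sampling machinery of both algorithms is omitted, and the function names are ours.

```python
# Minimal sketch of the normalization switch between tSNE and UMAP.
# Both use low-dimensional similarities q = 1 / (1 + d^2); tSNE normalizes
# P and Q into distributions and takes KL(P || Q), while UMAP scores each
# pair independently with binary cross-entropy.
import numpy as np

def pairwise_q(Y):
    d2 = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    q = 1.0 / (1.0 + d2)
    np.fill_diagonal(q, 0.0)
    return q

def tsne_loss(P, Y):                  # normalized: KL(P || Q)
    Q = pairwise_q(Y)
    P, Q = P / P.sum(), Q / Q.sum()   # the normalization in question
    mask = P > 0
    return (P[mask] * np.log(P[mask] / Q[mask])).sum()

def umap_loss(P, Y, eps=1e-9):        # unnormalized: per-edge cross-entropy
    q = np.clip(pairwise_q(Y), eps, 1 - eps)
    return -(P * np.log(q) + (1 - P) * np.log(1 - q)).sum()

rng = np.random.default_rng(3)
P = rng.uniform(size=(20, 20)); P = (P + P.T) / 2; np.fill_diagonal(P, 0)
Y = rng.normal(size=(20, 2))
print("tSNE-style:", tsne_loss(P, Y), " UMAP-style:", umap_loss(P, Y))
```

Everything else held fixed, toggling between these two loss forms is the kind of single-parameter switch the paper argues separates the two embeddings.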