Arrow Research search

Author name cluster

Jilles Vreeken

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

29 papers
2 author rows

Possible papers (29)

AAAI Conference 2026 Conference Paper

Causal Discovery from Interval-Based Event Sequences

  • Lénaïg Cornanguer
  • Joscha Cüppers
  • Jilles Vreeken

In this paper we address the problem of discovering causal relationships from observational event sequence data. Existing methods typically assume that events are instantaneous point events; in many real-world settings, however, events have duration. For example, in healthcare, a patient's symptoms may persist over a time interval and influence clinical actions while ongoing. To address this, we introduce a causal model for interval-based event sequences that captures rich causal structures, including interactions between events and causal mechanisms that depend on whether other events are ongoing. We prove that our model is identifiable in the limit and present a practical causal discovery algorithm, Niagara, grounded in the algorithmic Markov condition. To select among candidate models, we employ a minimum description length (MDL) criterion, enabling robust inference even with limited data. We validate our approach on synthetic and real data and demonstrate its utility on a real-world medical case study, where it uncovers meaningful causal relationships from noisy, interval-based event data.

AAAI Conference 2026 Conference Paper

SEQRET: Mining Rule Sets from Event Sequences

  • Aleena Siji
  • Joscha Cüppers
  • Osman Mian
  • Jilles Vreeken

Summarizing event sequences is a key aspect of data mining. Most existing methods neglect conditional dependencies and focus on discovering sequential patterns only. In this paper, we study the problem of discovering both conditional and unconditional dependencies from event sequences. We do so by discovering rules of the form X --> Y where X and Y are sequential patterns. Rules like these are simple to understand and provide a clear description of the relation between the antecedent and the consequent. To discover succinct and non-redundant sets of rules we formalize the problem in terms of the Minimum Description Length principle. As the search space is enormous and does not exhibit helpful structure, we propose the SEQRET method to discover high-quality rule sets in practice. Through extensive empirical evaluation we show that unlike the state of the art, SEQRET ably recovers the ground truth on synthetic datasets and finds useful rules from real datasets.
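
To make the MDL formulation concrete: a model is scored by the bits needed to describe the rule set plus the bits needed to describe the event sequence given those rules, and the model with the smallest total wins. The sketch below is a minimal, hypothetical two-part score for comparing candidate models; the per-symbol costs and the coverage bookkeeping are illustrative assumptions, not SEQRET's actual encoding.

```python
import math

def code_length_data(sequence, covered_fraction, alphabet_size):
    """Toy data cost: events explained by rules are assumed cheaper to encode
    than events emitted from scratch with a uniform code over the alphabet."""
    n = len(sequence)
    n_covered = int(covered_fraction * n)
    n_rest = n - n_covered
    return n_covered * 1.0 + n_rest * math.log2(alphabet_size)

def code_length_model(num_rules, avg_rule_length, alphabet_size):
    """Toy model cost: every symbol in every rule costs log2(alphabet) bits."""
    return num_rules * avg_rule_length * math.log2(alphabet_size)

def total_description_length(sequence, model):
    return (code_length_model(model["rules"], model["avg_len"], model["alphabet"])
            + code_length_data(sequence, model["coverage"], model["alphabet"]))

# Hypothetical comparison: a small rule set that explains most of the sequence
# versus the empty model that encodes every event from scratch.
seq = list("abcabcabcxyabcabc")
rule_model = {"rules": 1, "avg_len": 3, "alphabet": 26, "coverage": 0.8}
empty_model = {"rules": 0, "avg_len": 0, "alphabet": 26, "coverage": 0.0}
print("rule model: ", round(total_description_length(seq, rule_model), 1), "bits")
print("empty model:", round(total_description_length(seq, empty_model), 1), "bits")
```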

NeurIPS Conference 2025 Conference Paper

Causal Mixture Models: Characterization and Discovery

  • Sarah Mameche
  • Janis Kalofolias
  • Jilles Vreeken

Real-world datasets are often a combination of unobserved subpopulations that follow distinct causal generating processes. In an observational study, for example, participants may fall into unknown groups that either (a) respond effectively to a drug, or (b) show no response due to drug resistance. Not accounting for such heterogeneity then risks biased estimates of drug effectiveness. In this work, we formulate this setting through a causal mixture model, in which the data-generating process of each variable depends on latent group membership (a or b). Specifically, we model each variable as a mixture of structural causal equation models, where latent categorical (mixing) variables index assignment to subpopulations. Unlike prior work, the approach allows for multiple independent mixing variables, each affecting distinct sets of observed variables. To jointly infer the graph, the mixing variables, and the assignments, we integrate mixture modeling into score-based causal discovery; show theoretically that the resulting scoring criterion is consistent; and demonstrate empirically that the approach discovers the causal model in synthetic and real-world evaluations.

AAAI Conference 2025 Conference Paper

Federated Binary Matrix Factorization Using Proximal Optimization

  • Sebastian Dalleiger
  • Jilles Vreeken
  • Michael Kamp

Identifying informative components in binary data is an essential task in many application areas, including life sciences, social sciences, and recommendation systems. Boolean matrix factorization (BMF) is a family of methods that performs this task by factorizing the data into dense factor matrices. In real-world settings, the data is often distributed across stakeholders and required to stay private, prohibiting the straightforward application of BMF. To adapt BMF to this context, we approach the problem from a federated-learning perspective, building on a state-of-the-art continuous binary matrix factorization relaxation to BMF that enables efficient gradient-based optimization. Our approach only needs to share the relaxed component matrices, which are aggregated centrally using a proximal operator that regularizes for binary outcomes. We show the convergence of our federated proximal gradient descent algorithm and provide differential privacy guarantees. Our extensive empirical evaluation shows that our algorithm outperforms, in quality and efficacy, federation schemes of state-of-the-art BMF methods on a diverse set of real-world and synthetic data.
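
A minimal sketch of one federated round under these assumptions: each client takes a gradient step on its relaxed (real-valued) factor matrices, only the shared component matrix is sent to the server, and the server averages it and applies a proximal operator that pushes entries toward {0, 1}. The learning rate, the prox (a pull toward the nearer of 0 and 1 followed by clipping), and all shapes are illustrative, not the paper's exact elastic-binary operator or privacy mechanism.

```python
import numpy as np

def local_gradient_step(X, A, B, lr=0.05):
    """One local gradient step on the relaxed objective ||X - A @ B||_F^2."""
    R = X - A @ B                      # residual on this client's private data
    return A + lr * R @ B.T, B + lr * A.T @ R

def prox_toward_binary(M, strength=0.1):
    """Illustrative proximal operator: pull each entry toward the nearer of
    {0, 1} and clip to the unit interval (a stand-in for a binary-inducing
    regularizer, not the paper's exact operator)."""
    return np.clip(M + strength * ((M > 0.5) - M), 0.0, 1.0)

rng = np.random.default_rng(0)
k, d, n_clients = 3, 12, 4
clients = [(rng.random((20, d)) < 0.3).astype(float) for _ in range(n_clients)]
B_global = rng.random((k, d))                    # shared component matrix
A_locals = [rng.random((X.shape[0], k)) for X in clients]

for _ in range(50):                              # federated rounds
    B_updates = []
    for i, X in enumerate(clients):
        A_locals[i], B_i = local_gradient_step(X, A_locals[i], B_global)
        B_updates.append(B_i)                    # only the shared factor leaves the client
    B_global = prox_toward_binary(np.mean(B_updates, axis=0))

print(np.round(B_global, 2))                     # entries drift toward 0 and 1
```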

AAAI Conference 2025 Conference Paper

From Your Block to Our Block: How to Find Shared Structure Between Stochastic Block Models over Multiple Graphs

  • Iiro Kumpulainen
  • Sebastian Dalleiger
  • Jilles Vreeken
  • Nikolaj Tatti

Stochastic Block Models (SBMs) are a popular approach to modeling single real-world graphs. The key idea of SBMs is to partition the vertices of the graph into blocks with similar edge densities within, as well as between different blocks. However, what if we are given not one but multiple graphs that are unaligned and of different sizes? How can we find out if these graphs share blocks with similar connectivity structures? In this paper, we propose the shared stochastic block modeling (SSBM) problem, in which we model n graphs using SBMs that share parameters of s blocks. We show that fitting an SSBM is NP-hard, and consider two approaches to fit good models in practice. In the first, we directly maximize the likelihood of the shared model using a Markov chain Monte Carlo algorithm. In the second, we first fit an SBM for each graph and then select which blocks to share. We propose an integer linear program to find the optimal shared blocks, and, to scale to large numbers of blocks, a fast greedy algorithm. Through extensive empirical evaluation on synthetic and real-world data, we show that our methods work well in practice.
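
The quantity any of these fitting strategies has to evaluate is the SBM likelihood of a candidate block assignment. Below is a minimal single-graph version under the plain Bernoulli SBM, with per-block-pair maximum-likelihood edge densities; the shared-parameter (SSBM) score that ties blocks across several graphs is not implemented here.

```python
import numpy as np

def sbm_log_likelihood(adj, blocks, eps=1e-12):
    """Bernoulli SBM log-likelihood of an undirected graph given a block
    assignment: each block pair (r, s) gets its own MLE edge density."""
    adj = np.asarray(adj, dtype=float)
    blocks = np.asarray(blocks)
    labels = np.unique(blocks)
    ll = 0.0
    for i, r in enumerate(labels):
        for s in labels[i:]:
            rows, cols = blocks == r, blocks == s
            sub = adj[np.ix_(rows, cols)]
            if r == s:                       # count each unordered pair once
                pairs = rows.sum() * (rows.sum() - 1) / 2
                edges = np.triu(sub, 1).sum()
            else:
                pairs = rows.sum() * cols.sum()
                edges = sub.sum()
            if pairs == 0:
                continue
            p = edges / pairs                # MLE density for this block pair
            ll += edges * np.log(p + eps) + (pairs - edges) * np.log(1 - p + eps)
    return ll

# Toy graph with two dense blocks and no edges between them.
A = np.zeros((6, 6))
A[:3, :3] = 1
A[3:, 3:] = 1
np.fill_diagonal(A, 0)
print(sbm_log_likelihood(A, [0, 0, 0, 1, 1, 1]))   # close to 0 (perfect fit)
print(sbm_log_likelihood(A, [0, 1, 0, 1, 0, 1]))   # much lower
```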

NeurIPS Conference 2025 Conference Paper

Neural Rule Lists: Learning Discretizations, Rules, and Order in One Go

  • Sascha Xu
  • Nils Philipp Walter
  • Jilles Vreeken

Interpretable machine learning is essential in high-stakes domains like healthcare. Rule lists are a popular choice due to their transparency and accuracy, but learning them effectively remains a challenge. Existing methods require feature pre-discretization, constrain rule complexity or ordering, or struggle to scale. We present NeuRules, a novel end-to-end framework that overcomes these limitations. At its core, NeuRules transforms the inherently combinatorial task of rule list learning into a differentiable optimization problem, enabling gradient-based learning. It simultaneously discovers feature conditions, assembles them into conjunctive rules, and determines their order—without pre-processing or manual constraints. A key contribution here is a gradient shaping technique that steers learning toward sparse rules with strong predictive performance. To produce ordered lists, we introduce a differentiable relaxation that, through simulated annealing, converges to a strict rule list. Extensive experiments show that NeuRules consistently outperforms combinatorial and neural baselines on binary as well as multi-class classification tasks across a wide range of datasets.
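
The central relaxation can be illustrated in a few lines: a hard rule such as `x1 > t1 AND x2 > t2` becomes differentiable by replacing each threshold test with a sigmoid and the conjunction with a product. The sharpness constant and thresholds below are made-up placeholders; NeuRules additionally learns the discretizations, enforces sparsity via gradient shaping, and orders the rules, none of which this sketch attempts.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_rule(x, thresholds, sharpness=10.0):
    """Differentiable surrogate of the conjunction AND_j (x_j > t_j):
    each condition is a sigmoid, the AND is their product."""
    conditions = sigmoid(sharpness * (x - thresholds))   # shape (n, d)
    return conditions.prod(axis=1)                       # soft activation in (0, 1)

rng = np.random.default_rng(1)
X = rng.random((5, 2))
t = np.array([0.3, 0.6])      # hypothetical learned thresholds

soft = soft_rule(X, t)
hard = np.all(X > t, axis=1).astype(float)
print(np.round(soft, 3))      # differentiable activations ...
print(hard)                   # ... approximate the hard rule as sharpness grows
```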

AAAI Conference 2025 Conference Paper

SPACETIME: Causal Discovery from Non-Stationary Time Series

  • Sarah Mameche
  • Lénaïg Cornanguer
  • Urmi Ninad
  • Jilles Vreeken

Understanding causality is challenging and often complicated by changing causal relationships over time and across environments. Climate patterns, for example, shift over time with recurring seasonal trends, while also depending on geographical characteristics such as ecosystem variability. Existing methods for discovering causal graphs from time series either assume stationarity, do not permit both temporal and spatial distribution changes, or are unaware of locations with the same causal relationships. In this work, we therefore unify the three tasks of causal graph discovery in the non-stationary multi-context setting, of reconstructing temporal regimes, and of partitioning datasets and time intervals into those where invariant causal relationships hold. To construct a consistent score that forms the basis of our method, we employ the Minimum Description Length principle. Our resulting algorithm SPACETIME simultaneously accounts for heterogeneity across space and non-stationarity over time. Given multiple time series, it discovers regime changepoints and a temporal causal graph using non-parametric functional modeling and kernelized discrepancy testing. We also show that our method provides insights into real-world phenomena such as river-runoff measured at different catchments and biosphere-atmosphere interactions across ecosystems.

NeurIPS Conference 2024 Conference Paper

Causal Discovery from Event Sequences by Local Cause-Effect Attribution

  • Joscha Cüppers
  • Sascha Xu
  • Ahmed Musa
  • Jilles Vreeken

Sequences of events, such as crashes in the stock market or outages in a network, contain strong temporal dependencies, whose understanding is crucial to react to and influence future events. In this paper, we study the problem of discovering the underlying causal structure from event sequences. To this end, we introduce a new causal model, where individual events of the cause trigger events of the effect with dynamic delays. We show that in contrast to existing methods based on Granger causality, our model is identifiable for both instant and delayed effects. We base our approach on the Algorithmic Markov Condition, by which we identify the true causal network as the one that minimizes the Kolmogorov complexity. As the Kolmogorov complexity is not computable, we instantiate our model using Minimum Description Length and show that the resulting score identifies the causal direction. To discover causal graphs, we introduce the Cascade algorithm, which adds edges in topological order. Extensive evaluation shows that Cascade outperforms existing methods in settings with instantaneous effects, noise, and multiple colliders, and discovers insightful causal graphs on real-world data.
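
For intuition, the kind of data this causal model describes can be generated in a few lines: cause events arrive over time and each triggers an effect event after a random delay. The Poisson rate and gamma delay distribution below are arbitrary choices for illustration, not the model assumed by Cascade.

```python
import numpy as np

rng = np.random.default_rng(10)
horizon = 100.0

# cause events: a homogeneous Poisson process with an arbitrary rate
cause_times = np.cumsum(rng.exponential(scale=2.0, size=200))
cause_times = cause_times[cause_times < horizon]

# each cause event triggers one effect event after a random (dynamic) delay
delays = rng.gamma(shape=2.0, scale=1.5, size=len(cause_times))
effect_times = np.sort(cause_times + delays)
effect_times = effect_times[effect_times < horizon]

print(len(cause_times), "cause events,", len(effect_times), "effect events")
print("mean triggering delay:", round(float(delays.mean()), 2), "time units")
```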

AAAI Conference 2024 Conference Paper

Discovering Sequential Patterns with Predictable Inter-event Delays

  • Joscha Cüppers
  • Paul Krieger
  • Jilles Vreeken

Summarizing sequential data with serial episodes allows non-trivial insight into the data generating process. Existing methods penalize gaps in pattern occurrences equally, regardless of where in the pattern these occur. This results in a strong bias against patterns with long inter-event delays, and means that regularity in terms of delays is neither rewarded nor discovered, even though both aspects provide key insight. In this paper we tackle both these problems by explicitly modeling inter-event delay distributions. That is, we are not only interested in discovering the patterns, but also in describing how many time steps typically occur between their individual events. We formalize the problem in terms of the Minimum Description Length principle, by which we say the best set of patterns is the one that compresses the data best. The resulting optimization problem does not lend itself to exact optimization, and hence we propose Hopper to heuristically mine high quality patterns. Extensive experiments show that Hopper efficiently recovers the ground truth, discovers meaningful patterns from real-world data, and outperforms existing methods in discovering long-delay patterns.

AAAI Conference 2024 Conference Paper

Finding Interpretable Class-Specific Patterns through Efficient Neural Search

  • Nils Philipp Walter
  • Jonas Fischer
  • Jilles Vreeken

Discovering patterns in data that best describe the differences between classes allows us to hypothesize and reason about class-specific mechanisms. In molecular biology, for example, these bear the promise of advancing the understanding of cellular processes differing between tissues or diseases, which could lead to novel treatments. To be useful in practice, methods that tackle the problem of finding such differential patterns have to be readily interpretable by domain experts, and scalable to extremely high-dimensional data. In this work, we propose a novel, inherently interpretable binary neural network architecture Diffnaps that extracts differential patterns from data. Diffnaps is scalable to hundreds of thousands of features and robust to noise, thus overcoming the limitations of current state-of-the-art methods in large-scale applications such as in biology. We show on synthetic and real-world data, including three biological applications, that unlike its competitors, Diffnaps consistently yields accurate, succinct, and interpretable class descriptions.

ICML Conference 2024 Conference Paper

Learning Exceptional Subgroups by End-to-End Maximizing KL-Divergence

  • Sascha Xu
  • Nils Philipp Walter
  • Janis Kalofolias
  • Jilles Vreeken

Finding and describing sub-populations that are exceptional in terms of a target property has important applications in many scientific disciplines, from identifying disadvantaged demographic groups in census data to finding conductive molecules within gold nanoparticles. Current approaches to finding such subgroups require pre-discretized predictive variables, do not permit non-trivial target distributions, do not scale to large datasets, and struggle to find diverse results. To address these limitations, we propose SYFLOW, an end-to-end optimizable approach in which we leverage normalizing flows to model arbitrary target distributions and introduce a novel neural layer that results in easily interpretable subgroup descriptions. We demonstrate on synthetic data, real-world data, and via a case study, that SYFLOW reliably finds highly exceptional subgroups accompanied by insightful descriptions.
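
The notion of an exceptional subgroup can be grounded with a small example: compare the distribution of the target inside a candidate subgroup against the overall target distribution, for instance with a KL divergence. The histogram-based plug-in estimate and the hand-picked subgroup conditions below are for intuition only; SYFLOW models the densities with normalizing flows and learns the subgroup description end to end.

```python
import numpy as np

def kl_divergence_hist(target_in_subgroup, target_overall, bins=20, eps=1e-9):
    """Plug-in KL(subgroup || overall) from histograms over shared bins."""
    lo = min(target_in_subgroup.min(), target_overall.min())
    hi = max(target_in_subgroup.max(), target_overall.max())
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(target_in_subgroup, bins=edges)
    q, _ = np.histogram(target_overall, bins=edges)
    p = p / p.sum() + eps
    q = q / q.sum() + eps
    return float(np.sum(p * np.log(p / q)))

rng = np.random.default_rng(2)
x = rng.random(5000)                       # a single predictive variable
y = rng.normal(loc=np.where(x > 0.8, 3.0, 0.0), scale=1.0)   # target shifts in one subgroup

print(kl_divergence_hist(y[x > 0.8], y))   # exceptional subgroup: large KL
print(kl_divergence_hist(y[x < 0.5], y))   # unremarkable subgroup: small KL
```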

AAAI Conference 2024 Conference Paper

What Are the Rules? Discovering Constraints from Data

  • Boris Wiegand
  • Dietrich Klakow
  • Jilles Vreeken

Constraint programming and AI planning are powerful tools for solving assignment, optimization, and scheduling problems. They require, however, the rarely available combination of domain knowledge and mathematical modeling expertise. Learning constraints from exemplary solutions can close this gap and alleviate the effort of modeling. Existing approaches either require extensive user interaction, need exemplary invalid solutions that must be generated by experts at great expense, or show high noise-sensitivity. We aim to find constraints from potentially noisy solutions, without the need of user interaction. To this end, we formalize the problem in terms of the Minimum Description Length (MDL) principle, by which we select the model with the best lossless compression of the data. Solving the problem involves model counting, which is #P-hard to approximate. We therefore propose the greedy URPILS algorithm to find high-quality constraints in practice. Extensive experiments on constraint programming and AI planning benchmark data show URPILS not only finds more accurate and succinct constraints, but also is more robust to noise, and has lower sample complexity than the state of the art.

UAI Conference 2023 Conference Paper

Causal Discovery with Hidden Confounders using the Algorithmic Markov Condition

  • David Kaltenpoth
  • Jilles Vreeken

Causal sufficiency is a cornerstone assumption in causal discovery. It is, however, both unlikely to hold in practice and unverifiable. When it does not hold, existing methods struggle to return meaningful results. In this paper, we show how to discover the causal network over both observed and unobserved variables. Moreover, we show that the causal model is identifiable in the sparse linear Gaussian case. More generally, we extend the algorithmic Markov condition to include latent confounders. We propose a consistent score based on the Minimum Description Length principle to discover the full causal network, including latent confounders. Based on this score, we develop an effective algorithm that finds those sets of nodes for which the addition of a confounding factor $Z$ is most beneficial, then fits a new causal network over both observed as well as inferred latent variables.

ICLR Conference 2023 Conference Paper

Federated Learning from Small Datasets

  • Michael Kamp
  • Jonas Fischer
  • Jilles Vreeken

Federated learning allows multiple parties to collaboratively train a joint model without having to share any local data. It enables applications of machine learning in settings where data is inherently distributed and undisclosable, such as in the medical domain. Joint training is usually achieved by aggregating local models. When local datasets are small, locally trained models can vary greatly from a globally good model. Bad local models can arbitrarily deteriorate the aggregate model quality, causing federated learning to fail in these settings. We propose a novel approach that avoids this problem by interleaving model aggregation and permutation steps. During a permutation step we redistribute local models across clients through the server, while preserving data privacy, to allow each local model to train on a daisy chain of local datasets. This enables successful training in data-sparse domains. Combined with model aggregation, this approach enables effective learning even if the local datasets are extremely small, while retaining the privacy benefits of federated learning.
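
A minimal sketch of the interleaving: every client runs a few local steps, most rounds end with a permutation step that hands each client's model to another client (so each model effectively trains on a daisy chain of small datasets), and every few rounds the models are averaged. The logistic-regression task, the schedule, and all sizes are illustrative assumptions.

```python
import numpy as np

def local_steps(w, X, y, lr=0.1, steps=5):
    """A few steps of logistic-regression gradient descent on one client's small dataset."""
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w = w - lr * X.T @ (p - y) / len(y)
    return w

rng = np.random.default_rng(3)
d, n_clients = 5, 8
w_true = rng.normal(size=d)
data = []                                    # every client only holds a handful of samples
for _ in range(n_clients):
    X = rng.normal(size=(10, d))
    y = (X @ w_true + 0.1 * rng.normal(size=10) > 0).astype(float)
    data.append((X, y))

models = [np.zeros(d) for _ in range(n_clients)]
for rnd in range(60):
    models = [local_steps(w, X, y) for w, (X, y) in zip(models, data)]
    if rnd % 5 == 4:
        models = [np.mean(models, axis=0)] * n_clients            # aggregation step
    else:
        models = [models[i] for i in rng.permutation(n_clients)]  # daisy-chain step

agreement = np.mean(np.sign(models[0]) == np.sign(w_true))
print(f"sign agreement with generating weights: {agreement:.2f}")
```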

AAAI Conference 2023 Conference Paper

Identifying Selection Bias from Observational Data

  • David Kaltenpoth
  • Jilles Vreeken

Access to a representative sample from the population is an assumption that underpins all of machine learning. Selection effects can cause observations to instead come from a subpopulation, by which our inferences may be subject to bias. It is therefore important to know whether or not a sample is affected by selection effects. We study under which conditions we can identify selection bias and give results for both parametric and non-parametric families of distributions. Based on these results we develop two practical methods to determine whether or not an observed sample comes from a distribution subject to selection bias. Through extensive evaluation on synthetic and real world data we verify that our methods beat the state of the art both in detecting as well as characterizing selection bias.

AAAI Conference 2023 Conference Paper

Information-Theoretic Causal Discovery and Intervention Detection over Multiple Environments

  • Osman Mian
  • Michael Kamp
  • Jilles Vreeken

Given multiple datasets over a fixed set of random variables, each collected from a different environment, we are interested in discovering the shared underlying causal network and the local interventions per environment, without assuming prior knowledge on which datasets are observational or interventional, and without assuming the shape of the causal dependencies. We formalize this problem using the Algorithmic Model of Causation, instantiate a consistent score via the Minimum Description Length principle, and show under which conditions the network and interventions are identifiable. To efficiently discover causal networks and intervention targets in practice, we introduce the ORION algorithm, which through extensive experiments we show outperforms the state of the art in causal inference over multiple environments.

NeurIPS Conference 2023 Conference Paper

Learning Causal Models under Independent Changes

  • Sarah Mameche
  • David Kaltenpoth
  • Jilles Vreeken

In many scientific applications, we observe a system in different conditions in which its components may change, rather than in isolation. In our work, we are interested in explaining the generating process of such a multi-context system using a finite mixture of causal mechanisms. Recent work shows that this causal model is identifiable from data, but is limited to settings where the sparse mechanism shift hypothesis holds and only a subset of the causal conditionals change. As this assumption is not easily verifiable in practice, we study the more general principle that mechanism shifts are independent, which we formalize using the algorithmic notion of independence. We introduce an approach for causal discovery beyond partially directed graphs using Gaussian Process models, and give conditions under which we provably identify the correct causal model. In our experiments, we show that our method performs well in a range of synthetic settings, on realistic gene expression simulations, as well as on real-world cell signaling data.

ICML Conference 2023 Conference Paper

Nonlinear Causal Discovery with Latent Confounders

  • David Kaltenpoth
  • Jilles Vreeken

Causal discovery, the task of discovering the causal graph over a set of observed variables $X_1, \ldots, X_m$, is a challenging problem. One of the cornerstone assumptions is that of causal sufficiency: that all common causes of all measured variables have been observed. When it does not hold, causal discovery algorithms making this assumption return networks with many spurious edges. In this paper, we propose a nonlinear causal model involving hidden confounders. We show that it is identifiable from only the observed data and propose an efficient method for recovering this causal model. At the heart of our approach is a variational autoencoder which parametrizes both the causal interactions between observed variables as well as the influence of the unobserved confounders. Empirically we show that it outperforms other state-of-the-art methods for causal discovery under latent confounding on synthetic and real-world data.

AAAI Conference 2022 Conference Paper

Differentially Describing Groups of Graphs

  • Corinna Coupette
  • Sebastian Dalleiger
  • Jilles Vreeken

How does neural connectivity in autistic children differ from neural connectivity in healthy children or autistic youths? What patterns in global trade networks are shared across classes of goods, and how do these patterns change over time? Answering questions like these requires us to differentially describe groups of graphs: Given a set of graphs and a partition of these graphs into groups, discover what graphs in one group have in common, how they systematically differ from graphs in other groups, and how multiple groups of graphs are related. We refer to this task as graph group analysis, which seeks to describe similarities and differences between graph groups by means of statistically significant subgraphs. To perform graph group analysis, we introduce GRAGRA, which uses maximum entropy modeling to identify a non-redundant set of subgraphs with statistically significant associations to one or more graph groups. Through an extensive set of experiments on a wide range of synthetic and real-world graph groups, we confirm that GRAGRA works well in practice.

AAAI Conference 2022 Conference Paper

Discovering Interpretable Data-to-Sequence Generators

  • Boris Wiegand
  • Dietrich Klakow
  • Jilles Vreeken

We study the problem of predicting an event sequence given some meta data. In particular, we are interested in learning easily interpretable models that can accurately generate a sequence based on an attribute vector. To this end, we propose to learn a sparse event-flow graph over the training sequences, and statistically robust rules that use meta data to determine which paths to follow. We formalize the problem in terms of the Minimum Description Length (MDL) principle, by which we identify the best model as the one that compresses the data best. As the resulting optimization problem is NP-hard, we propose the efficient CONSEQUENCE algorithm to discover good event-flow graphs from data. Through an extensive set of experiments including a case study, we show that it ably discovers compact, interpretable and accurate models for the generation and prediction of event sequences from data, has a low sample complexity, and is particularly robust against noise.

NeurIPS Conference 2022 Conference Paper

Efficiently Factorizing Boolean Matrices using Proximal Gradient Descent

  • Sebastian Dalleiger
  • Jilles Vreeken

Addressing the interpretability problem of NMF on Boolean data, Boolean Matrix Factorization (BMF) uses Boolean algebra to decompose the input into low-rank Boolean factor matrices. These matrices are highly interpretable and very useful in practice, but they come at the high computational cost of solving an NP-hard combinatorial optimization problem. To reduce the computational burden, we propose to relax BMF continuously using a novel elastic-binary regularizer, from which we derive a proximal gradient algorithm. Through an extensive set of experiments, we demonstrate that our method works well in practice: On synthetic data, we show that it converges quickly, recovers the ground truth precisely, and estimates the simulated rank exactly. On real-world data, we improve upon the state of the art in recall, loss, and runtime, and a case study from the medical domain confirms that our results are easily interpretable and semantically meaningful.
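
A compact illustration of the relax-and-round idea under simplified assumptions: keep the factors continuous in [0, 1], alternate gradient steps on the squared loss with a proximal step that nudges entries toward {0, 1}, and finally threshold and evaluate the Boolean (OR-of-ANDs) reconstruction. The prox used here is a simple stand-in, not the paper's elastic-binary regularizer.

```python
import numpy as np

def boolean_product(A, B):
    """Boolean matrix product: entry (i, j) is 1 iff some k has A[i,k] = B[k,j] = 1."""
    return ((A > 0.5).astype(int) @ (B > 0.5).astype(int) > 0).astype(float)

def prox_binary(M, pull=0.05):
    """Illustrative prox step: nudge entries toward the nearer of {0, 1}, clip to [0, 1]."""
    return np.clip(M + pull * ((M > 0.5) - M), 0.0, 1.0)

rng = np.random.default_rng(4)
k, n, d = 2, 30, 15
A_true = (rng.random((n, k)) < 0.4).astype(float)
B_true = (rng.random((k, d)) < 0.4).astype(float)
X = boolean_product(A_true, B_true)              # synthetic Boolean data

A, B = rng.random((n, k)), rng.random((k, d))    # continuous relaxation in [0, 1]
lr = 0.02
for _ in range(300):
    R = X - A @ B                                # residual of the relaxed product
    A = prox_binary(A + lr * R @ B.T)            # gradient step, then proximal step
    B = prox_binary(B + lr * A.T @ R)

A_bin, B_bin = (A > 0.5).astype(float), (B > 0.5).astype(float)
err = np.mean(boolean_product(A_bin, B_bin) != X)
print(f"Boolean reconstruction error after rounding: {err:.3f}")
```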

ICML Conference 2022 Conference Paper

Inferring Cause and Effect in the Presence of Heteroscedastic Noise

  • Sascha Xu
  • Osman Mian
  • Alexander Marx 0001
  • Jilles Vreeken

We study the problem of identifying cause and effect over two univariate continuous variables $X$ and $Y$ from a sample of their joint distribution. Our focus lies on the setting where the variance of the noise may depend on the cause. We propose to partition the domain of the cause into multiple segments where the noise is indeed dependent. To this end, we minimize a scale-invariant, penalized regression score, finding the optimal partitioning using dynamic programming. We show under which conditions this allows us to identify the causal direction for the linear setting with heteroscedastic noise, for the non-linear setting with homoscedastic noise, as well as empirically confirm that these results generalize to the non-linear and heteroscedastic case. Altogether, the ability to model heteroscedasticity translates into an improved performance in telling cause from effect on a wide range of synthetic and real-world datasets.
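
The partition-and-score idea can be sketched with a toy instantiation: sort the data by the candidate cause, give every segment a scale-invariant cost (here the Gaussian code length of linear-fit residuals), charge a fixed penalty per segment, and find the optimal partition by dynamic programming; the direction with the cheaper total is preferred. The segment cost, penalty, and linear fits below are simplifications of the paper's penalized regression score.

```python
import numpy as np

def segment_cost(x, y):
    """Toy scale-invariant cost of one segment: Gaussian code length of the
    residuals of a linear fit, n/2 * log(RSS / n), constants dropped."""
    n = len(x)
    if n < 3:
        return np.inf                       # too few points to fit a line
    coef = np.polyfit(x, y, 1)
    rss = float(np.sum((y - np.polyval(coef, x)) ** 2)) + 1e-12
    return 0.5 * n * np.log(rss / n)

def best_partition(x, y, penalty=5.0):
    """Dynamic program over split points: dp[j] is the cheapest cost of the
    first j points, with each segment paying `penalty` extra bits."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    n = len(x)
    dp = np.full(n + 1, np.inf)
    dp[0] = 0.0
    back = np.zeros(n + 1, dtype=int)
    for j in range(1, n + 1):
        for i in range(j):
            c = dp[i] + segment_cost(x[i:j], y[i:j]) + penalty
            if c < dp[j]:
                dp[j], back[j] = c, i
    cuts, j = [], n
    while j > 0:
        cuts.append((back[j], j))
        j = back[j]
    return dp[n], cuts[::-1]

rng = np.random.default_rng(5)
x = rng.uniform(-2, 2, 120)
y = np.sin(2 * x) + np.abs(x) * 0.3 * rng.normal(size=120)    # noise scale grows with |x|
score_xy, segments = best_partition(x, y)
print(f"score X->Y: {score_xy:.1f}, segments: {len(segments)}")
print(f"score Y->X: {best_partition(y, x)[0]:.1f}")            # typically worse
```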

ICML Conference 2022 Conference Paper

Label-Descriptive Patterns and Their Application to Characterizing Classification Errors

  • Michael A. Hedderich
  • Jonas Fischer
  • Dietrich Klakow
  • Jilles Vreeken

State-of-the-art deep learning methods achieve human-like performance on many tasks, but make errors nevertheless. Characterizing these errors in easily interpretable terms gives insight into whether a classifier is prone to making systematic errors, but also gives a way to act and improve the classifier. We propose to discover those feature-value combinations (i.e., patterns) that strongly correlate with correct resp. erroneous predictions to obtain a global and interpretable description for arbitrary classifiers. We show this is an instance of the more general label description problem, which we formulate in terms of the Minimum Description Length principle. To discover a good pattern set, we develop the efficient Premise algorithm. Through an extensive set of experiments we show it performs very well in practice on both synthetic and real-world data. Unlike existing solutions, it ably recovers ground truth patterns, even on highly imbalanced data over many features. Through two case studies on Visual Question Answering and Named Entity Recognition, we confirm that Premise gives clear and actionable insight into the systematic errors made by modern NLP classifiers.

AAAI Conference 2022 Conference Paper

Naming the Most Anomalous Cluster in Hilbert Space for Structures with Attribute Information

  • Janis Kalofolias
  • Jilles Vreeken

We consider datasets consisting of arbitrarily structured entities (e.g., molecules, sequences, graphs, etc.) whose similarity can be assessed with a reproducing kernel (or a family thereof). These entities are assumed to additionally have a set of named attributes (e.g., number_of_atoms, stock_price, etc.). These attributes can be used to classify the structured entities into discrete sets (e.g., ‘number_of_atoms < 3’, ‘stock_price ≤ 100’, etc.) and can effectively serve as Boolean predicates. Our goal is to use this side-information to provide named kernel-based anomaly detection. To this end, we propose a method which is able to find among all possible entity subsets that can be described as a conjunction of the available predicates either a) the optimal cluster within the Reproducing Kernel Hilbert Space, or b) the most anomalous subset within the same space. Our method employs combinatorial optimisation of an adaptation of the Maximum-Mean-Discrepancy measure that captures the above intuition. Additionally, we propose a criterion to select the optimal one out of a family of kernels in a way that preserves the available side-information. Finally, we provide several real world datasets that demonstrate the usefulness of our proposed method.
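
The underlying two-sample quantity is easy to illustrate: the Maximum Mean Discrepancy between the entities selected by a predicate and the full dataset, computed from an RBF kernel. The sketch below uses the plain biased MMD^2 estimator on made-up vector data; the adapted measure and the combinatorial search over predicate conjunctions described above are not reproduced.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian RBF kernel matrix between the row sets A and B."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def mmd2(A, B, gamma=0.5):
    """Biased estimate of MMD^2 between the samples in A and in B."""
    return (rbf_kernel(A, A, gamma).mean()
            + rbf_kernel(B, B, gamma).mean()
            - 2 * rbf_kernel(A, B, gamma).mean())

rng = np.random.default_rng(6)
X = rng.normal(size=(300, 4))                    # vector stand-ins for structured entities
attr = rng.random(300)                           # a named attribute, e.g. stock_price
X[attr > 0.9] += 2.0                             # entities matching one predicate differ

print(mmd2(X[attr > 0.9], X))                    # large: anomalous, describable subset
print(mmd2(X[attr < 0.5], X))                    # small: unremarkable subset
```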

AAAI Conference 2021 Conference Paper

Discovering Fully Oriented Causal Networks

  • Osman A Mian
  • Alexander Marx
  • Jilles Vreeken

We study the problem of inferring causal graphs from observational data. We are particularly interested in discovering graphs where all edges are oriented, as opposed to the partially directed graphs that the state of the art discovers. To this end we base our approach on the algorithmic Markov condition. Unlike the statistical Markov condition, it uniquely identifies the true causal network as the one that provides the simplest factorization of the joint distribution, as measured by Kolmogorov complexity. Although Kolmogorov complexity is not computable, we can approximate it from above via the Minimum Description Length principle, which allows us to define a consistent and computable score based on non-parametric multivariate regression. To efficiently discover causal networks in practice, we introduce the GLOBE algorithm, which greedily adds, removes, and orients edges such that it minimizes the overall cost. Through an extensive set of experiments we show GLOBE performs very well in practice, beating the state of the art by a margin.
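
The two-variable core of this scoring idea can be sketched as follows: compare the total description length of "cause marginal plus residuals of regressing effect on cause" for both directions and prefer the cheaper one. The Gaussian residual code and degree-3 polynomial fits below are toy stand-ins for GLOBE's non-parametric multivariate regression score.

```python
import numpy as np

def direction_score(cause, effect, degree=3):
    """Toy MDL-style score for 'cause -> effect': code length of the cause's
    marginal (Gaussian, variance-based) plus the code length of the residuals
    of a polynomial fit of effect on cause. Constants are dropped."""
    n = len(cause)
    marginal = 0.5 * n * np.log(np.var(cause) + 1e-12)
    coef = np.polyfit(cause, effect, degree)
    resid = effect - np.polyval(coef, cause)
    conditional = 0.5 * n * np.log(np.var(resid) + 1e-12)
    return marginal + conditional

rng = np.random.default_rng(7)
x = rng.uniform(-1, 1, 500)
y = x ** 3 + rng.uniform(-0.05, 0.05, 500)          # nonlinear additive-noise pair

print("X -> Y:", round(direction_score(x, y), 1))   # smaller (preferred) score
print("Y -> X:", round(direction_score(y, x), 1))   # larger: wrong direction costs more
```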

ICML Conference 2021 Conference Paper

What's in the Box? Exploring the Inner Life of Neural Networks with Robust Rules

  • Jonas Fischer
  • Anna Oláh
  • Jilles Vreeken

We propose a novel method for exploring how neurons within neural networks interact. In particular, we consider activation values of a network for given data, and propose to mine noise-robust rules of the form X → Y, where X and Y are sets of neurons in different layers. We identify the best set of rules by the Minimum Description Length Principle as the rules that together are most descriptive of the activation data. To learn good rule sets in practice, we propose the unsupervised ExplaiNN algorithm. Extensive evaluation shows that the patterns it discovers give clear insight into how networks perceive the world: they identify shared and class-specific traits, compositionality within the network, as well as locality in convolutional layers. Moreover, these patterns are not only easily interpretable, but also supercharge prototyping as they identify which groups of neurons to consider in unison.

AAAI Conference 2020 Conference Paper

Explainable Data Decompositions

  • Sebastian Dalleiger
  • Jilles Vreeken

Our goal is to discover the components of a dataset, characterize why we deem these to be components, explain how these components are different from each other, as well as identify what properties they share among each other. As is usual, we consider regions in the data to be components if they show significantly different distributions. What is not usual, however, is that we parameterize these distributions with patterns that are informative for one or more components. We do so because these patterns allow us to characterize what is going on in our data as well as explain our decomposition. We define the problem in terms of a regularized maximum likelihood, in which we use the Maximum Entropy principle to model each data component with a set of patterns. As the search space is large and unstructured, we propose the deterministic DISC algorithm to efficiently discover high-quality decompositions via an alternating optimization approach. Empirical evaluation on synthetic and real-world data shows that DISC efficiently discovers meaningful components and accurately characterizes these in easily understandable terms.

IJCAI Conference 2019 Conference Paper

Discovering Reliable Dependencies from Data: Hardness and Improved Algorithms

  • Panagiotis Mandros
  • Mario Boley
  • Jilles Vreeken

The reliable fraction of information is an attractive score for quantifying (functional) dependencies in high-dimensional data. In this paper, we systematically explore the algorithmic implications of using this measure for optimization. We show that the problem is NP-hard, justifying worst-case exponential-time as well as heuristic search methods. We then substantially improve the practical performance for both optimization styles by deriving a novel admissible bounding function that has an unbounded potential for additional pruning over the previously proposed one. Finally, we empirically investigate the approximation ratio of the greedy algorithm and show that it produces highly competitive results in a fraction of time needed for complete branch-and-bound style search.
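
For intuition, the uncorrected (plug-in) fraction of information I(X;Y)/H(Y) overstates dependence on small samples, especially for high-cardinality X; a reliable variant subtracts the dependence one would expect under independence. The permutation-based baseline below is one simple way to estimate that correction, shown for illustration rather than as the paper's estimator.

```python
import numpy as np
from collections import Counter

def entropy(labels):
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def mutual_information(x, y):
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))

def reliable_fraction_of_information(x, y, permutations=200, seed=0):
    """Plug-in fraction of information I(X;Y)/H(Y), corrected by the mean MI
    obtained after permuting Y (an estimate of chance-level dependence)."""
    rng = np.random.default_rng(seed)
    hy = entropy(y)
    mi = mutual_information(x, y)
    baseline = np.mean([mutual_information(x, list(rng.permutation(y)))
                        for _ in range(permutations)])
    return (mi - baseline) / hy

rng = np.random.default_rng(8)
y = list(rng.integers(0, 2, 200))
x_informative = [yi if rng.random() < 0.9 else 1 - yi for yi in y]   # noisy copy of y
x_random = list(rng.integers(0, 5, 200))                             # unrelated, many values

print(round(reliable_fraction_of_information(x_informative, y), 3))  # clearly positive
print(round(reliable_fraction_of_information(x_random, y), 3))       # near zero
```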

ICML Conference 2014 Conference Paper

Multivariate Maximal Correlation Analysis

  • Hoang Vu Nguyen
  • Emmanuel Müller
  • Jilles Vreeken
  • Pavel Efros
  • Klemens Böhm

Correlation analysis is one of the key elements of statistics, and has various applications in data analysis. Whereas most existing measures can only detect pairwise correlations between two dimensions, modern analysis aims at detecting correlations in multi-dimensional spaces. We propose MAC, a novel multivariate correlation measure designed for discovering multi-dimensional patterns. It belongs to the powerful class of maximal correlation analysis, for which we propose a generalization to multivariate domains. We highlight the limitations of current methods in this class, and address these with MAC. Our experiments show that MAC outperforms existing solutions, is robust to noise, and discovers interesting and useful patterns.
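
For two discrete variables, the pairwise maximal correlation has a convenient closed form: it equals the second-largest singular value of the matrix Q with entries p(x,y) / sqrt(p(x) p(y)). The sketch below computes exactly that; MAC's contribution, the generalization to multivariate and continuous domains, is not attempted here.

```python
import numpy as np

def maximal_correlation_discrete(x, y):
    """Hirschfeld-Gebelein-Renyi maximal correlation of two discrete variables:
    the second-largest singular value of Q[x, y] = p(x, y) / sqrt(p(x) p(y))."""
    xs, x_idx = np.unique(x, return_inverse=True)
    ys, y_idx = np.unique(y, return_inverse=True)
    joint = np.zeros((len(xs), len(ys)))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= joint.sum()
    px, py = joint.sum(1), joint.sum(0)
    Q = joint / np.sqrt(np.outer(px, py))
    s = np.linalg.svd(Q, compute_uv=False)
    return float(s[1])                      # s[0] == 1 corresponds to constant functions

rng = np.random.default_rng(9)
x = rng.integers(0, 4, 10_000)
y_dependent = (x + rng.integers(0, 2, 10_000)) % 4    # nonlinear, non-monotone dependence
y_independent = rng.integers(0, 4, 10_000)

print(round(maximal_correlation_discrete(x, y_dependent), 3))    # strong dependence
print(round(maximal_correlation_discrete(x, y_independent), 3))  # close to 0
```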