Arrow Research search

Author name cluster

Sara Magliacane

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

21 papers
2 author rows

Possible papers

21

TMLR Journal 2026 Journal Article

Improving Local Explainability By Learning Causal Graphs From Data

  • Daan Roos
  • Sebastian Gerwinn
  • Jan-Willem van de Meent
  • Sara Magliacane

Causal Shapley values take into account causal relations among dependent features to adjust the contributions of each feature to a prediction. A limitation of this approach is that it can only leverage known causal relations. In this work we combine the computation of causal Shapley values with causal discovery, i.e., learning causal graphs from data. In particular, we compute causal explanations across the Markov Equivalence Class (MEC), a set of candidate causal graphs learned from observational data, providing a list of causal Shapley values that explain the prediction. We propose two methods for estimating this list efficiently, drawing on the equivalences of the interventional distributions for a subset of the causal graphs. We evaluate our methods on synthetic and real-world data, showing that they provide explanations that are more consistent with the true causal effects compared to traditional Shapley value approaches that disregard causal relations. Our results show that even when the Markov Equivalence Class is learned incorrectly, in most settings the explanations of our framework are on average closer to true causal Shapley values than marginal and conditional Shapley values.
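A minimal sketch of the core loop, assuming hypothetical helpers `enumerate_dags_in_mec` (a causal-discovery backend returning the candidate DAGs of a CPDAG) and `causal_shapley_values` (a causal Shapley estimator for a fixed graph); the paper's own estimators are more efficient, reusing computations across graphs whose interventional distributions coincide:

```python
# Sketch: explanations across a Markov Equivalence Class (MEC).
# `enumerate_dags_in_mec` and `causal_shapley_values` are hypothetical
# placeholders passed in as arguments, not real library calls.
import numpy as np

def mec_shapley_explanations(model, x, background, cpdag,
                             enumerate_dags_in_mec, causal_shapley_values):
    """Return one causal Shapley vector per candidate DAG in the MEC."""
    explanations = []
    for dag in enumerate_dags_in_mec(cpdag):   # candidate causal graphs
        phi = causal_shapley_values(model, x, background, dag)
        explanations.append(phi)
    return np.stack(explanations)              # shape: (n_dags, n_features)

# A single summary attribution could then be the average over the MEC:
# phi_mec = mec_shapley_explanations(...).mean(axis=0)
```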

NeurIPS Conference 2025 Conference Paper

Learning Interactive World Model for Object-Centric Reinforcement Learning

  • Fan Feng
  • Phillip Lippe
  • Sara Magliacane

Agents that understand objects and their interactions can learn policies that are more robust and transferable. However, most object-centric RL methods factor state by individual objects while leaving interactions implicit. We introduce the Factored Interactive Object-Centric World Model (FIOC-WM), a unified framework that learns structured representations of both objects and their interactions within a world model. FIOC-WM captures environment dynamics with disentangled and modular representations of object interactions, improving sample efficiency and generalization for policy learning. Concretely, FIOC-WM first learns object-centric latents and an interaction structure directly from pixels, leveraging pre-trained vision encoders. The learned world model then decomposes tasks into composable interaction primitives, and a hierarchical policy is trained on top: a high level selects the type and order of interactions, while a low level executes them. On simulated robotic and embodied-AI benchmarks, FIOC-WM improves policy-learning sample efficiency and generalization over world-model baselines, indicating that explicit, modular interaction learning is crucial for robust control.

NeurIPS Conference 2025 Conference Paper

Sample-efficient Learning of Concepts with Theoretical Guarantees: from Data to Concepts without Interventions

  • Hidde Fokkema
  • Tim van Erven
  • Sara Magliacane

Machine learning is a vital part of many real-world systems, but several concerns remain about the lack of interpretability, explainability and robustness of black-box AI systems. Concept Bottleneck Models (CBM) address some of these challenges by learning interpretable concepts from high-dimensional data, e.g., images, which are used to predict labels. An important issue in CBMs is spurious correlations between concepts, which effectively lead to learning “wrong” concepts. Current mitigating strategies rely on strong assumptions, e.g., that the concepts are statistically independent of each other, or require substantial interaction in terms of both interventions and labels provided by annotators. In this paper, we describe a framework that provides theoretical guarantees on the correctness of the learned concepts and on the number of required labels, without requiring any interventions. Our framework leverages causal representation learning (CRL) methods to learn latent causal variables from high-dimensional observations in an unsupervised way, and then aligns these variables with interpretable concepts using only a few concept labels. We propose a linear and a non-parametric estimator for this mapping, providing a finite-sample high-probability result in the linear case and an asymptotic consistency result for the non-parametric estimator. We evaluate our framework on synthetic and image benchmarks, showing that the learned concepts have fewer impurities and are often more accurate than other CBMs, even in settings with strong correlations between concepts.
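In the linear case, the alignment step admits a compact sketch: given latents recovered by a CRL method and a handful of concept labels, the mapping is a regularized least-squares problem. This is a simplified reading of the setup, not the paper's exact estimator:

```python
# Sketch of the linear alignment step: map CRL latents to concepts
# with a small number of labelled examples via ridge regression.
import numpy as np

def fit_linear_alignment(Z_labelled, C_labelled, reg=1e-3):
    """Z_labelled: (n, d) latents; C_labelled: (n, k) concept labels."""
    d = Z_labelled.shape[1]
    # Ridge solution W = (Z'Z + reg*I)^{-1} Z'C
    W = np.linalg.solve(Z_labelled.T @ Z_labelled + reg * np.eye(d),
                        Z_labelled.T @ C_labelled)
    return W

def predict_concepts(Z, W):
    return Z @ W  # aligned, interpretable concept estimates

# With identifiable latents, a few labels can suffice to estimate W;
# the paper gives a finite-sample high-probability bound for this case.
```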

ICML Conference 2024 Conference Paper

A Sparsity Principle for Partially Observable Causal Representation Learning

  • Danru Xu
  • Dingling Yao
  • Sébastien Lachapelle
  • Perouz Taslakian
  • Julius von Kügelgen
  • Francesco Locatello
  • Sara Magliacane

Causal representation learning aims at identifying high-level causal variables from perceptual data. Most methods assume that all latent causal variables are captured in the high-dimensional observations. We instead consider a partially observed setting, in which each measurement only provides information about a subset of the underlying causal state. Prior work has studied this setting with multiple domains or views, each depending on a fixed subset of latents. Here, we focus on learning from unpaired observations from a dataset with an instance-dependent partial observability pattern. Our main contribution is to establish two identifiability results for this setting: one for linear mixing functions without parametric assumptions on the underlying causal model, and one for piecewise linear mixing functions with Gaussian latent causal variables. Based on these insights, we propose two methods for estimating the underlying causal variables by enforcing sparsity in the inferred representation. Experiments on different simulated datasets and established benchmarks highlight the effectiveness of our approach in recovering the ground-truth latents.
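A toy sketch of the sparsity principle, assuming PyTorch: an encoder is trained with an L1 penalty on the inferred latents, so that each sample activates only the latents its measurement actually covers. The paper's estimators and identifiability conditions are more specific than this:

```python
# Sketch: enforce sparsity on inferred latents under partial observability.
import torch
import torch.nn as nn

class SparseEncoder(nn.Module):
    def __init__(self, obs_dim, latent_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, latent_dim))

    def forward(self, x):
        return self.net(x)

def sparse_recon_loss(decoder, encoder, x, l1_weight=0.1):
    z = encoder(x)
    recon = ((decoder(z) - x) ** 2).mean()  # fit the observations
    sparsity = z.abs().mean()               # prefer few active latents
    return recon + l1_weight * sparsity

# Usage: pair with any decoder network and minimize
# sparse_recon_loss(decoder, encoder, batch) by gradient descent.
```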

ICML Conference 2024 Conference Paper

Amortized Equation Discovery in Hybrid Dynamical Systems

  • Yongtuo Liu
  • Sara Magliacane
  • Miltiadis Kofinas
  • Stratis Gavves

Hybrid dynamical systems are prevalent in science and engineering as a way to express complex systems with continuous and discrete states. To learn the laws governing such systems, all previous methods for equation discovery in hybrid systems follow a two-stage paradigm: they first group time series into small cluster fragments and then discover equations in each fragment separately using methods for non-hybrid systems. Although effective, their performance is limited because they ignore the commonalities in the shared dynamics of fragments that are driven by the same equations. Moreover, the two-stage paradigm breaks the interdependence between categorizing and representing dynamics that jointly form hybrid systems. In this paper, we reformulate the problem and propose an end-to-end learning framework, Amortized Equation Discovery (AMORE), to jointly categorize modes and discover equations characterizing the motion dynamics of each mode from all segments of that mode. Experiments on four hybrid and six non-hybrid systems demonstrate the superior performance of our method against previous methods on equation discovery, segmentation, and forecasting.
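For a single mode, equation discovery often reduces to sparse regression over a library of candidate terms, in the style of SINDy; a minimal sketch of that per-mode step follows. AMORE's contribution, not shown here, is to amortize this across all segments of a mode jointly:

```python
# Sketch: SINDy-style sparse regression for one mode of a hybrid system.
import numpy as np

def discover_equation(X, X_dot, threshold=0.1, n_iter=10):
    """X: (n, d) states; X_dot: (n, d) time derivatives.
    Library: [1, x_i, x_i * x_j] -- polynomial terms up to degree 2."""
    n, d = X.shape
    quad = np.stack([X[:, i] * X[:, j] for i in range(d)
                     for j in range(i, d)], axis=1)
    Theta = np.hstack([np.ones((n, 1)), X, quad])
    Xi = np.linalg.lstsq(Theta, X_dot, rcond=None)[0]
    for _ in range(n_iter):                     # sequential thresholding
        Xi[np.abs(Xi) < threshold] = 0.0
        for k in range(d):
            big = np.abs(Xi[:, k]) >= threshold
            if big.any():                       # re-fit surviving terms
                Xi[big, k] = np.linalg.lstsq(Theta[:, big], X_dot[:, k],
                                             rcond=None)[0]
    return Xi  # sparse coefficients: one column per state dimension
```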

UAI Conference 2024 Conference Paper

Learning Causal Abstractions of Linear Structural Causal Models

  • Riccardo Massidda
  • Sara Magliacane
  • Davide Bacciu

The need for modelling causal knowledge at different levels of granularity arises in several settings. Causal Abstraction provides a framework for formalizing this problem by relating two Structural Causal Models at different levels of detail. Despite increasing interest in applying causal abstraction, e.g., in the interpretability of large machine learning models, the graphical and parametric conditions under which one causal model can abstract another are not known. Furthermore, learning causal abstractions from data is still an open problem. In this work, we tackle both issues for linear causal models with linear abstraction functions. First, we characterize how the low-level coefficients and the abstraction function determine the high-level coefficients, and how the high-level model constrains the causal ordering of the low-level variables. Then, we apply our theoretical results to learn high-level and low-level causal models and their abstraction function from observational data. In particular, we introduce Abs-LiNGAM, a method that leverages the constraints induced by the learned high-level model and the abstraction function to speed up the recovery of the larger low-level model, under the assumption of non-Gaussian noise terms. In simulated settings, we show the effectiveness of learning causal abstractions from data and the potential of our method in improving the scalability of causal discovery.
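Abs-LiNGAM builds on LiNGAM-style discovery under non-Gaussian noise. Below is a minimal sketch of the plain low-level recovery step, assuming the open-source `lingam` package; the abstraction-induced constraints that give Abs-LiNGAM its speedup are omitted:

```python
# Sketch: plain LiNGAM recovery of a linear non-Gaussian causal model.
# Assumes the `lingam` package (pip install lingam); Abs-LiNGAM would
# additionally constrain this search with the learned high-level model
# and abstraction function.
import numpy as np
import lingam

rng = np.random.default_rng(0)
n = 2000
x0 = rng.uniform(-1, 1, n)             # non-Gaussian exogenous noise
x1 = 0.8 * x0 + rng.uniform(-1, 1, n)
x2 = -0.5 * x1 + rng.uniform(-1, 1, n)
X = np.column_stack([x0, x1, x2])

model = lingam.DirectLiNGAM()
model.fit(X)
print(model.causal_order_)       # estimated causal ordering
print(model.adjacency_matrix_)   # estimated low-level coefficients
```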

ICLR Conference 2024 Conference Paper

Multi-View Causal Representation Learning with Partial Observability

  • Dingling Yao
  • Danru Xu
  • Sébastien Lachapelle
  • Sara Magliacane
  • Perouz Taslakian
  • Georg Martius
  • Julius von Kügelgen
  • Francesco Locatello

We present a unified framework for studying the identifiability of representations learned from simultaneously observed views, such as different data modalities. We allow for a partially observed setting in which each view constitutes a nonlinear mixture of a subset of underlying latent variables, which can be causally related. We prove that the information shared across all subsets of any number of views can be learned up to a smooth bijection using contrastive learning and a single encoder per view. We also provide graphical criteria indicating which latent variables can be identified through a simple set of rules, which we refer to as identifiability algebra. Our general framework and theoretical results unify and extend several previous works on multi-view nonlinear ICA, disentanglement, and causal representation learning. We experimentally validate our claims on numerical, image, and multi-modal data sets. Further, we demonstrate that the performance of prior methods is recovered in different special cases of our setup. Overall, we find that access to multiple partial views offers unique opportunities for identifiable representation learning, enabling the discovery of latent structures from purely observational data.
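The core learning signal is contrastive: encodings of paired views are pulled together. A minimal InfoNCE-style sketch with one encoder per view, assuming PyTorch (the paper's results concern what such an objective can identify, not this particular implementation):

```python
# Sketch: contrastive alignment of two simultaneously observed views.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """z1, z2: (batch, dim) encodings of paired views."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.T / temperature      # similarity of all cross pairs
    labels = torch.arange(z1.size(0))     # positives on the diagonal
    return F.cross_entropy(logits, labels)

# Training: enc1 and enc2 are separate encoders, one per view;
# loss = info_nce(enc1(view1_batch), enc2(view2_batch)).
# Under the paper's assumptions, this recovers the content shared
# between the views up to a smooth bijection.
```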

UAI Conference 2023 Conference Paper

BISCUIT: Causal Representation Learning from Binary Interactions

  • Phillip Lippe
  • Sara Magliacane
  • Sindy Löwe
  • Yuki M. Asano
  • Taco Cohen
  • Efstratios Gavves

Identifying the causal variables of an environment and how to intervene on them is of core value in applications such as robotics and embodied AI. While an agent can commonly interact with the environment and may implicitly perturb the behavior of some of these causal variables, often the targets it affects remain unknown. In this paper, we show that causal variables can still be identified for many common setups, e.g., additive Gaussian noise models, if the agent’s interactions with a causal variable can be described by an unknown binary variable. This happens when each causal variable has two different mechanisms, e.g., an observational and an interventional one. Using this identifiability result, we propose BISCUIT, a method for simultaneously learning causal variables and their corresponding binary interaction variables. On three robotic-inspired datasets, BISCUIT accurately identifies causal variables and can even be scaled to complex, realistic environments for embodied AI.

ICLR Conference 2023 Conference Paper

Causal Representation Learning for Instantaneous and Temporal Effects in Interactive Systems

  • Phillip Lippe
  • Sara Magliacane
  • Sindy Löwe
  • Yuki M. Asano
  • Taco Cohen
  • Efstratios Gavves

Causal representation learning is the task of identifying the underlying causal variables and their relations from high-dimensional observations, such as images. Recent work has shown that one can reconstruct the causal variables from temporal sequences of observations under the assumption that there are no instantaneous causal relations between them. In practical applications, however, our measurement or frame rate might be slower than many of the causal effects. This effectively creates “instantaneous” effects and invalidates previous identifiability results. To address this issue, we propose iCITRIS, a causal representation learning method that allows for instantaneous effects in intervened temporal sequences when intervention targets can be observed, e.g., as actions of an agent. iCITRIS identifies the potentially multidimensional causal variables from temporal observations, while simultaneously using a differentiable causal discovery method to learn their causal graph. In experiments on three datasets of interactive systems, iCITRIS accurately identifies the causal variables and their causal graph.

ICML Conference 2023 Conference Paper

Graph Switching Dynamical Systems

  • Yongtuo Liu
  • Sara Magliacane
  • Miltiadis Kofinas
  • Efstratios Gavves

Dynamical systems with complex behaviours, e.g., immune system cells interacting with a pathogen, are commonly modelled by splitting the behaviour into different regimes, or modes, each with simpler dynamics, and then learning the switching behaviour from one mode to another. To achieve this, Switching Dynamical Systems (SDS) are a powerful tool that automatically discovers these modes and the mode-switching behaviour from time series data. While effective, these methods focus on independent objects, where the modes of one object are independent of the modes of the other objects. In this paper, we focus on the more general interacting-object setting for switching dynamical systems, where the per-object dynamics also depend on an unknown and dynamically changing subset of other objects and their modes. To this end, we propose a novel graph-based approach for switching dynamical systems, GRAph Switching dynamical Systems (GRASS), in which we use a dynamic graph to characterize interactions between objects and learn both intra-object and inter-object mode-switching behaviour. For benchmarking, we create two new datasets: a synthetic ODE-driven particle dataset and a real-world Salsa-couple dancing dataset. Experiments show that GRASS consistently outperforms previous state-of-the-art methods. We will release code and data after acceptance.

NeurIPS Conference 2023 Conference Paper

Learning Dynamic Attribute-factored World Models for Efficient Multi-object Reinforcement Learning

  • Fan Feng
  • Sara Magliacane

In many reinforcement learning tasks, the agent has to learn to interact with many objects of different types and generalize to unseen combinations and numbers of objects. Often a task is a composition of previously learned tasks (e.g., block stacking). These are examples of compositional generalization, in which we compose object-centric representations to solve complex tasks. Recent works have shown the benefits of object-factored representations and hierarchical abstractions for improving sample efficiency in these settings. On the other hand, these methods do not fully exploit the benefits of factorization in terms of object attributes. In this paper, we address this opportunity and introduce the Dynamic Attribute FacTored RL (DAFT-RL) framework. In DAFT-RL, we leverage object-centric representation learning to extract objects from visual inputs. We learn to classify them into classes and infer their latent parameters. For each class of object, we learn a class template graph that describes how the dynamics and reward of an object of this class factorize according to its attributes. We also learn an interaction pattern graph that describes how objects of different classes interact with each other at the attribute level. Through these graphs and a dynamic interaction graph that models the interactions between objects, we can learn a policy that can then be directly applied in a new environment by estimating the interactions and latent parameters. We evaluate DAFT-RL on three benchmark datasets and show that our framework outperforms the state-of-the-art in generalizing across unseen objects with varying attributes and latent parameters, as well as in the composition of previously learned tasks.

NeurIPS Conference 2023 Conference Paper

Modulated Neural ODEs

  • Ilze Amanda Auzina
  • Çağatay Yıldız
  • Sara Magliacane
  • Matthias Bethge
  • Efstratios Gavves

Neural ordinary differential equations (NODEs) have proven useful for learning non-linear dynamics of arbitrary trajectories. However, current NODE methods capture variations across trajectories only via the initial state value or by auto-regressive encoder updates. In this work, we introduce Modulated Neural ODEs (MoNODEs), a novel framework that separates dynamic states from underlying static factors of variation and improves the existing NODE methods. In particular, we introduce *time-invariant modulator variables* that are learned from the data. We incorporate our proposed framework into four existing NODE variants. We test MoNODE on oscillating systems, videos and human walking trajectories, where each trajectory has trajectory-specific modulation. Our framework consistently improves the ability of existing models to generalize to new dynamic parameterizations and to perform far-horizon forecasting. In addition, we verify that the proposed modulator variables are informative of the true unknown factors of variation, as measured by $R^2$ scores.
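A minimal sketch of the modulator idea, assuming the `torchdiffeq` package: a per-trajectory, time-invariant vector is concatenated to the state inside the ODE function, so the vector field is modulated by static factors without varying over time:

```python
# Sketch: a Neural ODE whose vector field is conditioned on a
# time-invariant modulator m (assumes the `torchdiffeq` package).
import torch
import torch.nn as nn
from torchdiffeq import odeint

class ModulatedODEFunc(nn.Module):
    def __init__(self, state_dim, mod_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim + mod_dim, hidden),
                                 nn.Tanh(), nn.Linear(hidden, state_dim))
        self.m = None  # set per trajectory before integrating

    def forward(self, t, y):
        # The modulator enters the dynamics but never changes with t.
        return self.net(torch.cat([y, self.m], dim=-1))

func = ModulatedODEFunc(state_dim=2, mod_dim=4)
y0 = torch.zeros(1, 2)
func.m = torch.randn(1, 4)        # static factor of variation
t = torch.linspace(0., 5., 50)
traj = odeint(func, y0, t)        # (50, 1, 2) modulated trajectory
```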

ICLR Conference 2022 Conference Paper

AdaRL: What, Where, and How to Adapt in Transfer Reinforcement Learning

  • Biwei Huang
  • Fan Feng
  • Chaochao Lu
  • Sara Magliacane
  • Kun Zhang 0001

One practical challenge in reinforcement learning (RL) is how to make quick adaptations when faced with new environments. In this paper, we propose a principled framework for adaptive RL, called AdaRL, that adapts reliably and efficiently to changes across domains with a few samples from the target domain, even in partially observable environments. Specifically, we leverage a parsimonious graphical representation that characterizes structural relationships over variables in the RL system. Such graphical representations provide a compact way to encode what and where the changes across domains are, and furthermore identify a minimal set of changes that one has to consider for the purpose of policy adaptation. We show that by explicitly leveraging this compact representation to encode changes, we can efficiently adapt the policy to the target domain, in which only a few samples are needed and further policy optimization is avoided. We illustrate the efficacy of AdaRL through a series of experiments that vary factors in the observation, transition and reward functions for Cartpole and Atari games.

ICML Conference 2022 Conference Paper

CITRIS: Causal Identifiability from Temporal Intervened Sequences

  • Phillip Lippe
  • Sara Magliacane
  • Sindy Löwe
  • Yuki M. Asano
  • Taco Cohen
  • Stratis Gavves

Understanding the latent causal factors of a dynamical system from visual observations is considered a crucial step towards agents reasoning in complex environments. In this paper, we propose CITRIS, a variational autoencoder framework that learns causal representations from temporal sequences of images in which underlying causal factors have possibly been intervened upon. In contrast to the recent literature, CITRIS exploits temporality and observing intervention targets to identify scalar and multidimensional causal factors, such as 3D rotation angles. Furthermore, by introducing a normalizing flow, CITRIS can be easily extended to leverage and disentangle representations obtained by already pretrained autoencoders. Extending previous results on scalar causal factors, we prove identifiability in a more general setting, in which only some components of a causal factor are affected by interventions. In experiments on 3D rendered image sequences, CITRIS outperforms previous methods on recovering the underlying causal variables. Moreover, using pretrained autoencoders, CITRIS can even generalize to unseen instantiations of causal factors, opening future research areas in sim-to-real generalization for causal representation learning.

NeurIPS Conference 2022 Conference Paper

Factored Adaptation for Non-Stationary Reinforcement Learning

  • Fan Feng
  • Biwei Huang
  • Kun Zhang
  • Sara Magliacane

Dealing with non-stationarity in environments (e.g., in the transition dynamics) and objectives (e.g., in the reward functions) is a challenging problem that is crucial in real-world applications of reinforcement learning (RL). While most current approaches model the changes as a single shared embedding vector, we leverage insights from the recent causality literature to model non-stationarity in terms of individual latent change factors and causal graphs across different environments. In particular, we propose Factored Adaptation for Non-Stationary RL (FANS-RL), a factored adaptation approach that jointly learns both the causal structure, in terms of a factored MDP, and a factored representation of the individual time-varying change factors. We prove that under standard assumptions, we can completely recover the causal graph representing the factored transition and reward function, as well as a partial structure between the individual change factors and the state components. Through our general framework, we can consider general non-stationary scenarios with different function types and changing frequencies, including changes across episodes and within episodes. Experimental results demonstrate that FANS-RL outperforms existing approaches in terms of return, compactness of the latent state representation, and robustness to varying degrees of non-stationarity.

NeurIPS Conference 2020 Conference Paper

Active Structure Learning of Causal DAGs via Directed Clique Trees

  • Chandler Squires
  • Sara Magliacane
  • Kristjan Greenewald
  • Dmitriy Katz
  • Murat Kocaoglu
  • Karthikeyan Shanmugam

A growing body of work has begun to study intervention design for efficient structure learning of causal directed acyclic graphs (DAGs). A typical setting is a causally sufficient one, i.e., a system with no latent confounders, selection bias, or feedback, where the essential graph of the observational equivalence class (EC) is given as input and interventions are assumed to be noiseless. Most existing works focus on worst-case or average-case lower bounds for the number of interventions required to orient a DAG. These worst-case lower bounds only establish that the largest clique in the essential graph could make it difficult to learn the true DAG. In this work, we develop a universal lower bound for single-node interventions that establishes that the largest clique is always a fundamental impediment to structure learning. Specifically, we present a decomposition of a DAG into independently orientable components through directed clique trees and use it to prove that the number of single-node interventions necessary to orient any DAG in an EC is at least the sum of half the size of the largest cliques in each chain component of the essential graph. Moreover, we present a two-phase intervention design algorithm that, under certain conditions on the chordal skeleton, matches the optimal number of interventions up to a multiplicative logarithmic factor in the number of maximal cliques. We show via synthetic experiments that our algorithm can scale to much larger graphs than most of the related work and achieves better worst-case performance than other scalable approaches. A code base to recreate these results can be found at https://github.com/csquires/dct-policy.
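The universal lower bound itself is straightforward to compute from the essential graph: sum, over its chain components, half the size of each component's largest clique. A sketch with networkx, assuming the chain components are provided as chordal undirected graphs (integer division stands in for the exact rounding in the paper's statement):

```python
# Sketch: compute the universal lower bound on the number of single-node
# interventions, given the chain components of an essential graph as
# chordal undirected networkx graphs.
import networkx as nx

def intervention_lower_bound(chain_components):
    total = 0
    for comp in chain_components:
        max_clique = max(len(c) for c in nx.find_cliques(comp))
        total += max_clique // 2  # half the largest clique per component
    return total

# Example: one chain component that is a 4-clique gives a bound of 2.
K4 = nx.complete_graph(4)
print(intervention_lower_bound([K4]))  # 2
```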

JMLR Journal 2020 Journal Article

Joint Causal Inference from Multiple Contexts

  • Joris M. Mooij
  • Sara Magliacane
  • Tom Claassen

The gold standard for discovering causal relations is by means of experimentation. Over the last decades, alternative methods have been proposed that can infer causal relations between variables from certain statistical patterns in purely observational data. We introduce Joint Causal Inference (JCI), a novel approach to causal discovery from multiple data sets from different contexts that elegantly unifies both approaches. JCI is a causal modeling framework rather than a specific algorithm, and it can be implemented using any causal discovery algorithm that can take into account certain background knowledge. JCI can deal with different types of interventions (e.g., perfect, imperfect, stochastic, etc.) in a unified fashion, and does not require knowledge of intervention targets or types in case of interventional data. We explain how several well-known causal discovery algorithms can be seen as addressing special cases of the JCI framework, and we also propose novel implementations that extend existing causal discovery methods for purely observational data to the JCI setting. We evaluate different JCI implementations on synthetic data and on flow cytometry protein expression data and conclude that JCI implementations can considerably outperform state-of-the-art causal discovery algorithms.
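In its simplest reading, JCI pools all data sets and appends context variables indicating which regime each sample came from, then hands the pooled data to any discovery algorithm that accepts background knowledge. A sketch of the pooling step, with the discovery call left as a placeholder:

```python
# Sketch: Joint Causal Inference-style pooling of multiple contexts.
# `run_causal_discovery` below is a placeholder for any algorithm that
# accepts background knowledge (e.g., that context variables are
# exogenous), not a real library call.
import numpy as np

def pool_with_context(datasets):
    """datasets: list of (n_i, d) arrays, one per context/regime."""
    d, K = datasets[0].shape[1], len(datasets)
    rows = []
    for i, D in enumerate(datasets):
        ctx = np.zeros((D.shape[0], K))
        ctx[:, i] = 1.0                   # context indicator variables
        rows.append(np.hstack([D, ctx]))
    X = np.vstack(rows)
    context_cols = list(range(d, d + K))  # columns holding the contexts
    return X, context_cols

# X, ctx = pool_with_context([data_obs, data_intervention_A, ...])
# graph = run_causal_discovery(X, exogenous=ctx)   # hypothetical call
```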

NeurIPS Conference 2019 Conference Paper

Sample Efficient Active Learning of Causal Trees

  • Kristjan Greenewald
  • Dmitriy Katz
  • Karthikeyan Shanmugam
  • Sara Magliacane
  • Murat Kocaoglu
  • Enric Boix Adsera
  • Guy Bresler

We consider the problem of experimental design for learning causal graphs that have a tree structure. We propose an adaptive framework that determines the next intervention based on a Bayesian prior updated with the outcomes of previous experiments, focusing on the setting where observational data is cheap (assumed infinite) and interventional data is expensive. While information-greedy approaches are popular in active learning, we show that in this setting they can be exponentially suboptimal in the number of interventions required, and instead propose an algorithm that exploits graph structure in the form of a centrality measure. If infinite interventional data is available, we show that the algorithm requires at most twice the minimum achievable number of interventions. We show that the algorithm and the associated theory can be adapted to the setting where each performed intervention yields finitely many samples. Several extensions are also presented: to the case where a specified set of nodes cannot be intervened on, to the case where $K$ interventions are scheduled at once, and to the fully adaptive case where each experiment yields only one sample. In the case of finite interventional data, we show through simulated experiments that our algorithms outperform different adaptive baseline algorithms.
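The centrality idea is concrete on trees: intervening on a centroid-like node, whose removal leaves no large component, splits the remaining uncertainty most evenly. A sketch of that selection rule alone; the paper's algorithm additionally maintains a Bayesian posterior from past experiments:

```python
# Sketch: pick the next intervention on a tree skeleton by centrality --
# the node whose removal minimizes the largest remaining component.
import networkx as nx

def centroid_intervention(tree):
    best_node, best_worst = None, float("inf")
    for v in tree.nodes:
        H = tree.copy()
        H.remove_node(v)
        worst = max((len(c) for c in nx.connected_components(H)), default=0)
        if worst < best_worst:
            best_node, best_worst = v, worst
    return best_node

path = nx.path_graph(7)             # a 7-node path-shaped skeleton
print(centroid_intervention(path))  # 3, the middle node
```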

UAI Conference 2018 Conference Paper

Causal Discovery in the Presence of Measurement Error

  • Tineke Blom
  • Anna Klimovskaia
  • Sara Magliacane
  • Joris M. Mooij

Causal discovery algorithms infer causal relations from data based on several assumptions, including notably the absence of measurement error. However, this assumption is most likely violated in practical applications, which may result in erroneous, irreproducible results. In this work we show how to obtain an upper bound for the variance of random measurement error from the covariance matrix of measured variables and how to use this upper bound as a correction for constraint-based causal discovery. We demonstrate a practical application of our approach on both simulated data and real-world protein signaling data.
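One way to read the bound: measurement error on a variable is independent of every other measured variable, so the residual variance from regressing that variable on all the others can never dip below the error variance. The sketch below follows that reading and is a gloss, not necessarily the paper's exact estimator:

```python
# Sketch: upper-bound each variable's measurement-error variance from
# the covariance matrix of the measured variables. Rationale: the error
# is independent of all other variables, so it cannot be explained away
# by regressing on them.
import numpy as np

def error_variance_upper_bounds(Sigma):
    """Sigma: (d, d) covariance matrix of the measured variables."""
    d = Sigma.shape[0]
    bounds = np.empty(d)
    for j in range(d):
        rest = [k for k in range(d) if k != j]
        s_jr = Sigma[j, rest]
        S_rr = Sigma[np.ix_(rest, rest)]
        # Residual variance of the best linear predictor of X_j:
        bounds[j] = Sigma[j, j] - s_jr @ np.linalg.solve(S_rr, s_jr)
    return bounds
```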

NeurIPS Conference 2018 Conference Paper

Domain Adaptation by Using Causal Inference to Predict Invariant Conditional Distributions

  • Sara Magliacane
  • Thijs van Ommen
  • Tom Claassen
  • Stephan Bongers
  • Philip Versteeg
  • Joris Mooij

An important goal common to domain adaptation and causal inference is to make accurate predictions when the distributions for the source (or training) domain(s) and target (or test) domain(s) differ. In many cases, these different distributions can be modeled as different contexts of a single underlying system, in which each distribution corresponds to a different perturbation of the system, or in causal terms, an intervention. We focus on a class of such causal domain adaptation problems, where data for one or more source domains are given, and the task is to predict the distribution of a certain target variable from measurements of other variables in one or more target domains. We propose an approach for solving these problems that exploits causal inference and does not rely on prior knowledge of the causal graph, the type of interventions or the intervention targets. We demonstrate our approach by evaluating a possible implementation on simulated and real-world data.
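The invariance idea behind the abstract can be illustrated with a naive subset search: keep only feature sets whose conditional relationship with the target looks identical across source domains, and transfer those predictors. This brute-force sketch is purely illustrative; the paper derives such sets via causal inference rather than exhaustive testing:

```python
# Sketch: brute-force search for feature subsets whose conditional
# relationship with the target is invariant across source domains.
from itertools import combinations
import numpy as np

def invariant_subsets(domains, target_col, tol=0.1):
    """domains: list of (n_i, d) arrays; target is column `target_col`."""
    d = domains[0].shape[1]
    features = [j for j in range(d) if j != target_col]
    invariant = []
    for r in range(1, len(features) + 1):
        for S in combinations(features, r):
            coefs = []
            for D in domains:                  # fit OLS per domain
                X = np.column_stack([D[:, list(S)], np.ones(len(D))])
                beta, *_ = np.linalg.lstsq(X, D[:, target_col], rcond=None)
                coefs.append(beta)
            spread = np.max(np.ptp(np.array(coefs), axis=0))
            if spread < tol:                   # coefficients agree
                invariant.append(S)
    return invariant
```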

NeurIPS Conference 2016 Conference Paper

Ancestral Causal Inference

  • Sara Magliacane
  • Tom Claassen
  • Joris Mooij

Constraint-based causal discovery from limited data is a notoriously difficult challenge due to the many borderline independence test decisions. Several approaches to improve the reliability of the predictions by exploiting redundancy in the independence information have been proposed recently. Though promising, existing approaches can still be greatly improved in terms of accuracy and scalability. We present a novel method that reduces the combinatorial explosion of the search space by using a more coarse-grained representation of causal information, drastically reducing computation time. Additionally, we propose a method to score causal predictions based on their confidence. Crucially, our implementation also allows one to easily combine observational and interventional data and to incorporate various types of available background knowledge. We prove soundness and asymptotic consistency of our method and demonstrate that it can outperform the state-of-the-art on synthetic data, achieving a speedup of several orders of magnitude. We illustrate its practical feasibility by applying it on a challenging protein data set.