Arrow Research search

Author name cluster

Eric Gaussier

Papers possibly associated with this exact author name in Arrow. This page groups case-insensitive exact-name matches; it is not a full identity-disambiguation profile.

14 papers
1 author row

Possible papers

14

TMLR Journal 2026 Journal Article

GIFT: A Framework Towards Global Interpretable Faithful Textual Explanations of Vision Classifiers

  • Eloi Zablocki
  • Valentin Gerard
  • Amaia Cardiel
  • Eric Gaussier
  • Matthieu Cord
  • Eduardo Valle

Understanding the decision processes of deep vision models is essential for their safe and trustworthy deployment in real-world settings. Existing explainability approaches, such as saliency maps or concept-based analyses, often suffer from limited faithfulness, local scope, or ambiguous semantics. We introduce GIFT, a post-hoc framework that aims to derive Global, Interpretable, Faithful, and Textual explanations for vision classifiers. GIFT begins by generating a large set of faithful, local visual counterfactuals, then employs vision–language models to translate these counterfactuals into natural-language descriptions of visual changes. These local explanations are aggregated by a large language model into concise, human-readable hypotheses about the model’s global decision rules. Crucially, GIFT includes a verification stage that quantitatively assesses the causal effect of each proposed explanation by performing image-based interventions, ensuring that the final textual explanations remain faithful to the model’s true reasoning process. Across diverse datasets, including the synthetic CLEVR benchmark, the real-world CelebA faces, and the complex BDD driving scenes, GIFT reveals not only meaningful classification rules but also unexpected biases and latent concepts driving model behavior. Altogether, GIFT bridges the gap between local counterfactual reasoning and global interpretability, offering a principled and extensible approach to causally grounded textual explanations for vision models.

NeurIPS Conference 2025 Conference Paper

Relaxing partition admissibility in Cluster-DAGs: a causal calculus with arbitrary variable clustering

  • Clément Yvernes
  • Emilie Devijver
  • Adèle Ribeiro
  • Marianne Clausel
  • Eric Gaussier

Cluster DAGs (C-DAGs) provide an abstraction of causal graphs in which nodes represent clusters of variables, and edges encode both cluster-level causal relationships and dependencies arising from unobserved confounding. C-DAGs define an equivalence class of acyclic causal graphs that agree on cluster-level relationships, enabling causal reasoning at a higher level of abstraction. However, when the chosen clustering induces cycles in the resulting C-DAG, the partition is deemed inadmissible under conventional C-DAG semantics. In this work, we extend the C-DAG framework to support arbitrary variable clusterings by relaxing the partition admissibility constraint, thereby allowing cyclic C-DAG representations. We extend the notions of d-separation and causal calculus to this setting, significantly broadening the scope of causal reasoning across clusters and enabling the application of C-DAGs in previously intractable scenarios. Our calculus is both sound and atomically complete with respect to the do-calculus: all valid interventional queries at the cluster level can be derived using our rules, each corresponding to a primitive do-calculus step.

TMLR Journal 2024 Journal Article

Causal Discovery from Time Series with Hybrids of Constraint-Based and Noise-Based Algorithms

  • Daria Bystrova
  • Charles K. Assaad
  • Julyan Arbel
  • Emilie Devijver
  • Eric Gaussier
  • Wilfried Thuiller

Constraint-based methods and noise-based methods are two distinct families of methods proposed for uncovering causal graphs from observational data. However, both operate under strong assumptions that may be challenging to validate or could be violated in real-world scenarios. In response to these challenges, there is a growing interest in hybrid methods that combine principles from both families and show robustness to assumption violations. This paper introduces a novel comprehensive framework for hybridizing constraint-based and noise-based methods designed to uncover causal graphs from observational time series. The framework is structured into two classes. The first class employs a noise-based strategy to identify a super graph, containing the true graph, followed by a constraint-based strategy to eliminate unnecessary edges. In the second class, a constraint-based strategy is applied to identify a skeleton, which is then oriented using a noise-based strategy. The paper provides theoretical guarantees for each class under the condition that all assumptions are satisfied, and it outlines some properties when assumptions are violated. To validate the efficacy of the framework, two algorithms from each class are experimentally tested on simulated data, realistic ecological data, and real datasets sourced from diverse applications. Notably, two novel datasets related to Information Technology monitoring are introduced within the set of considered real datasets. The experimental results underscore the robustness and effectiveness of the hybrid approaches across a broad spectrum of datasets.

IJCAI Conference 2023 Conference Paper

Survey and Evaluation of Causal Discovery Methods for Time Series (Extended Abstract)

  • Charles K. Assaad
  • Emilie Devijver
  • Eric Gaussier

We introduce in this survey the major concepts, models, and algorithms proposed so far to infer causal relations from observational time series, a task usually referred to as causal discovery in time series. To do so, after a description of the underlying concepts and modelling assumptions, we present different methods according to the family of approaches they belong to: Granger causality, constraint-based approaches, noise-based approaches, score-based approaches, logic-based approaches, topology-based approaches, and difference-based approaches. We then evaluate several representative methods to illustrate the behaviour of different families of approaches. This illustration is conducted on both artificial and real datasets with different characteristics. The main conclusions one can draw from this survey are that causal discovery in time series is an active research field in which new methods (in every family of approaches) are regularly proposed, and that no family or method stands out in all situations. Indeed, they all rely on assumptions that may or may not be appropriate for a particular dataset.

AAAI Conference 2022 Conference Paper

Listwise Learning to Rank Based on Approximate Rank Indicators

  • Thibaut Thonet
  • Yagmur Gizem Cinar
  • Eric Gaussier
  • Minghan Li
  • Jean-Michel Renders

We study here a way to approximate information retrieval metrics through a softmax-based approximation of the rank indicator function. Indeed, this latter function is a key component in the design of information retrieval metrics, as well as in the design of ranking and sorting functions. Obtaining a good approximation for it thus opens the door to differentiable approximations of many evaluation measures that can in turn be used in neural end-to-end approaches. We first prove theoretically that the proposed approximations are of good quality, before validating them experimentally on both learning-to-rank and text-based information retrieval tasks.
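
The paper's exact softmax-based construction is not spelled out in the abstract; as a hedged illustration of the underlying idea (differentiable ranks computed from scores), here is a common sigmoid-based pairwise approximation, not the authors' formulation:

```python
import numpy as np

def approx_ranks(scores, tau=0.1):
    """Smooth approximation of ranks: rank_i ≈ 1 + Σ_{j≠i} σ((s_j − s_i) / τ).

    As the temperature tau → 0 this recovers the exact (hard) ranks;
    for tau > 0 it is differentiable in the scores, so metrics such as
    DCG built on it can be optimized end-to-end with gradient descent.
    """
    diff = (scores[None, :] - scores[:, None]) / tau
    soft_greater = 1.0 / (1.0 + np.exp(-diff))   # ≈ 1 when s_j > s_i
    np.fill_diagonal(soft_greater, 0.0)
    return 1.0 + soft_greater.sum(axis=1)
```

A smooth DCG then follows directly, e.g. `np.sum(gains / np.log2(1.0 + approx_ranks(scores)))`, which is the kind of differentiable evaluation measure the abstract refers to.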

JAIR Journal 2022 Journal Article

Survey and Evaluation of Causal Discovery Methods for Time Series

  • Charles K. Assaad
  • Emilie Devijver
  • Eric Gaussier

We introduce in this survey the major concepts, models, and algorithms proposed so far to infer causal relations from observational time series, a task usually referred to as causal discovery in time series. To do so, after a description of the underlying concepts and modelling assumptions, we present different methods according to the family of approaches they belong to: Granger causality, constraint-based approaches, noise-based approaches, score-based approaches, logic-based approaches, topology-based approaches, and difference-based approaches. We then evaluate several representative methods to illustrate the behaviour of different families of approaches. This illustration is conducted on both artificial and real datasets with different characteristics. The main conclusions one can draw from this survey are that causal discovery in time series is an active research field in which new methods (in every family of approaches) are regularly proposed, and that no family or method stands out in all situations. Indeed, they all rely on assumptions that may or may not be appropriate for a particular dataset.
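
Of the families listed, Granger causality is the easiest to sketch. A minimal, hedged illustration (one lag, ordinary least squares, in-sample only; not any specific method from the survey): x "Granger-causes" y if adding past x to an autoregression of y reduces the residual variance.

```python
import numpy as np

def granger_gain(x, y):
    """Fraction of residual variance of a one-lag AR model of y that is
    removed by also conditioning on past x (illustration, not a test)."""
    Y = y[1:]
    ones = np.ones(len(Y))
    A_restricted = np.column_stack([y[:-1], ones])            # past y only
    A_full = np.column_stack([y[:-1], x[:-1], ones])          # past y and past x
    res_r = Y - A_restricted @ np.linalg.lstsq(A_restricted, Y, rcond=None)[0]
    res_f = Y - A_full @ np.linalg.lstsq(A_full, Y, rcond=None)[0]
    return 1.0 - np.var(res_f) / np.var(res_r)
```

A real Granger test would compare the two models with an F-statistic over several lags; the gain here is only meant to show the nested-model comparison at the heart of the approach.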

NeurIPS Conference 2020 Conference Paper

Heavy-tailed Representations, Text Polarity Classification & Data Augmentation

  • Hamid Jalalzai
  • Pierre Colombo
  • Chloé Clavel
  • Eric Gaussier
  • Giovanna Varni
  • Emmanuel Vignon
  • Anne Sabourin

The dominant approaches to text representation in natural language rely on learning embeddings on massive corpora which have convenient properties such as compositionality and distance preservation. In this paper, we develop a novel method to learn a heavy-tailed embedding with desirable regularity properties regarding the distributional tails, which allows us to analyze the points far away from the distribution bulk using the framework of multivariate extreme value theory. In particular, a classifier dedicated to the tails of the proposed embedding is obtained which exhibits a scale invariance property exploited in a novel text generation method for label-preserving dataset augmentation. Experiments on synthetic and real text data show the relevance of the proposed framework and confirm that this method generates meaningful sentences with controllable attributes, e.g., positive or negative sentiment.
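
A classifier "dedicated to the tails" typically follows a standard recipe in multivariate extreme value theory: keep only the largest-norm points and work with their angular part, which is scale invariant. A minimal sketch of that preprocessing step (an assumed recipe for illustration, not the paper's exact pipeline):

```python
import numpy as np

def tail_angular_part(Z, k):
    """Select the k embeddings with the largest norm (the empirical tail)
    and project them onto the unit sphere. A classifier trained on these
    angles is invariant to rescaling a point along its ray, the kind of
    scale-invariance property the abstract exploits."""
    norms = np.linalg.norm(Z, axis=1)
    idx = np.argsort(norms)[-k:]
    return Z[idx] / norms[idx, None]
```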

NeurIPS Conference 2020 Conference Paper

Smooth And Consistent Probabilistic Regression Trees

  • Sami Alkhoury
  • Emilie Devijver
  • Marianne Clausel
  • Myriam Tami
  • Eric Gaussier
  • Georges Oppenheim

We propose here a generalization of regression trees, referred to as Probabilistic Regression (PR) trees, that adapt to the smoothness of the prediction function relating input and output variables while preserving the interpretability of the prediction and being robust to noise. In PR trees, an observation is associated to all regions of a tree through a probability distribution that reflects how close the observation is to each region. We show that such trees are consistent, meaning that their error tends to 0 when the sample size tends to infinity, a property that has not been established for similar previous proposals such as Soft trees and Smooth Transition Regression trees. We further explain how PR trees can be used in different ensemble methods, namely Random Forests and Gradient Boosted Trees. Lastly, we assess their performance through extensive experiments that illustrate their benefits in terms of performance, interpretability and robustness to noise.
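
The core mechanism, assigning an observation to every region with a probability that decays with distance to the split, can be sketched with a single soft split. The sigmoid form below is an assumption for illustration (closer to Soft trees); PR trees define their own region distribution:

```python
import math

def soft_stump(x, threshold, sigma, y_left, y_right):
    """One-split probabilistic 'tree': instead of the hard indicator
    x <= threshold, the observation belongs to both regions with weights
    depending on its distance to the split, so the prediction varies
    smoothly across the threshold instead of jumping."""
    p_left = 1.0 / (1.0 + math.exp((x - threshold) / sigma))
    return p_left * y_left + (1.0 - p_left) * y_right
```

Far from the split the prediction matches a hard tree; at the split it blends the two leaf values, which is what yields smoothness and robustness to noise.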

JAIR Journal 2019 Journal Article

On Inductive Abilities of Latent Factor Models for Relational Learning

  • Théo Trouillon
  • Eric Gaussier
  • Christopher R. Dance
  • Guillaume Bouchard

Latent factor models are increasingly popular for modeling multi-relational knowledge graphs. Because of their vectorial nature, it is hard not only to interpret why this class of models works so well, but also to understand where they fail and how they might be improved. We conduct an experimental survey of state-of-the-art models, not towards a purely comparative end, but as a means to get insight into their inductive abilities. To assess the strengths and weaknesses of each model, we create simple tasks that exhibit first, atomic properties of binary relations, and then, common inter-relational inference through synthetic genealogies. Based on these experimental results, we propose new research directions to improve on existing models.
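
Latent factor models of this kind score a (subject, relation, object) triple from embedding vectors. Two representative scoring functions from this line of work, DistMult and ComplEx, can be written in a few lines (a sketch of the scoring step only, not of training):

```python
import numpy as np

def distmult_score(e_s, w_r, e_o):
    """Trilinear product ⟨e_s, w_r, e_o⟩; symmetric in subject and object,
    so it cannot distinguish r(s, o) from r(o, s)."""
    return float(np.sum(e_s * w_r * e_o))

def complex_score(e_s, w_r, e_o):
    """Hermitian product Re⟨e_s, w_r, conj(e_o)⟩ over complex embeddings;
    the conjugate breaks the symmetry, so antisymmetric relations
    (e.g. parent-of in the synthetic genealogies) become expressible."""
    return float(np.real(np.sum(e_s * w_r * np.conj(e_o))))
```

Exactly which atomic relation properties (symmetry, antisymmetry, transitivity) a scoring function can capture is the kind of inductive ability the paper's synthetic tasks probe.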

JMLR Journal 2016 Journal Article

Learning Taxonomy Adaptation in Large-scale Classification

  • Rohit Babbar
  • Ioannis Partalas
  • Eric Gaussier
  • Massih-Reza Amini
  • Cécile Amblard

In this paper, we study flat and hierarchical classification strategies in the context of large-scale taxonomies. Addressing the problem from a learning-theoretic point of view, we first propose a multi-class, hierarchical data dependent bound on the generalization error of classifiers deployed in large-scale taxonomies. This bound provides an explanation to several empirical results reported in the literature, related to the performance of flat and hierarchical classifiers. Based on this bound, we also propose a technique for modifying a given taxonomy through pruning, that leads to a lower value of the upper bound as compared to the original taxonomy. We then present another method for hierarchy pruning by studying approximation error of a family of classifiers, and derive from it features used in a meta-classifier to decide which nodes to prune. We finally illustrate the theoretical developments through several experiments conducted on two widely used taxonomies.

AAAI Conference 2015 Conference Paper

Improved Local Search for Binary Matrix Factorization

  • Seyed Hamid Mirisaee
  • Eric Gaussier
  • Alexandre Termier

Rank K Binary Matrix Factorization (BMF) approximates a binary matrix by the product of two binary matrices of lower rank, K, under either the L1 or the L2 norm. In this paper, we first show that BMF with the L2 norm can be reformulated as an Unconstrained Binary Quadratic Programming (UBQP) problem. We then review several local search strategies that can be used to improve the BMF solutions obtained by previously proposed methods, before introducing a new local search dedicated to the BMF problem. We show in particular that the proposed solution is in general faster than the previously proposed ones. We then assess its behavior on several collections and methods and show that it significantly improves methods targeting the L2 norm on all the datasets considered; for the L1 norm, the improvement is also significant for real, structured datasets and for the BMF problem without the binary reconstruction constraint.
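
As a hedged illustration of what local search means for BMF, here is a naive flip-one-bit scheme under the Boolean matrix product; it is not the paper's UBQP reformulation nor its dedicated strategy, only the simplest member of the family it improves on:

```python
import numpy as np

def bmf_error(X, A, B):
    """Reconstruction error of X ≈ A ∘ B (Boolean matrix product: an entry
    is 1 iff some k has A[i,k] = B[k,j] = 1). For binary matrices the
    L1 and squared-L2 errors coincide."""
    R = (A @ B > 0).astype(int)
    return int(np.sum(np.abs(X - R)))

def flip_local_search(X, A, B, sweeps=10):
    """Repeatedly try flipping single bits of A or B, keeping a flip only
    if it strictly lowers the error; stop after a sweep with no progress."""
    err = bmf_error(X, A, B)
    for _ in range(sweeps):
        improved = False
        for M in (A, B):
            for idx in np.ndindex(M.shape):
                M[idx] ^= 1
                new_err = bmf_error(X, A, B)
                if new_err < err:
                    err, improved = new_err, True
                else:
                    M[idx] ^= 1   # revert the flip
        if not improved:
            break
    return A, B, err
```

Such a scheme only refines an existing factorization (it easily gets stuck from a poor start), which is why it is paired with the initial solutions produced by other BMF methods.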

IS Journal 2014 Journal Article

Behavior Informatics: A New Perspective

  • Longbing Cao
  • Thorsten Joachims
  • Can Wang
  • Eric Gaussier
  • Jinjiu Li
  • Yuming Ou
  • Dan Luo
  • Reza Zafarani

This installment of Trends & Controversies provides an array of perspectives on the latest research in behavior informatics. Longbing Cao introduces the work in "Behavior Informatics: A New Perspective." Then, in "Behavior Computing," Longbing Cao and Thorsten Joachims provide a basic overview of the topic. Next is "Coupled Behavior Representation, Modeling, Analysis, and Reasoning" by Can Wang, Longbing Cao, Eric Gaussier, Jinjiu Li, Yuming Ou, and Dan Luo. The fourth article is "Behavior Analysis in Social Media," by Reza Zafarani and Huan Liu. The fifth article is "Group Recommendation and Behavior," by Guandong Xu and Zhiang Wu. Gabriella Pasi wrote the sixth article, "Web Search and Behavior." The seventh article, "Behaviors of IPTV Users," is by Ya Zhang, Xiaokang Yang, and Hongyuan Zha. Finally, "Should Behavioral Models of Terror Groups Be Disclosed?" is by Edoardo Serra and V. S. Subrahmanian.

NeurIPS Conference 2013 Conference Paper

On Flat versus Hierarchical Classification in Large-Scale Taxonomies

  • Rohit Babbar
  • Ioannis Partalas
  • Eric Gaussier
  • Massih R. Amini

We study in this paper flat and hierarchical classification strategies in the context of large-scale taxonomies. To this end, we first propose a multiclass, hierarchical data dependent bound on the generalization error of classifiers deployed in large-scale taxonomies. This bound provides an explanation to several empirical results reported in the literature, related to the performance of flat and hierarchical classifiers. We then introduce another type of bounds targeting the approximation error of a family of classifiers, and derive from it features used in a meta-classifier to decide which nodes to prune (or flatten) in a large-scale taxonomy. We finally illustrate the theoretical developments through several experiments conducted on two widely used taxonomies.
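
The flat-versus-hierarchical distinction the bounds compare can be made concrete: a hierarchical classifier routes an example through node-level classifiers from the root down to a leaf, while a flat one predicts a leaf directly among all categories. A toy sketch with a hypothetical two-level taxonomy and hypothetical per-node routers:

```python
def hierarchical_predict(x, children, node_router, root="root"):
    """Top-down descent: at each internal node a local classifier picks a
    child. Errors can compound along the path, but each local problem is
    much smaller than the flat problem over all leaves; the paper's bounds
    quantify this trade-off."""
    node = root
    while node in children:          # internal node: ask its router
        node = node_router[node](x)
    return node                      # reached a leaf label

# Hypothetical taxonomy and routers, for illustration only.
children = {"root": ["animal", "vehicle"], "animal": ["cat", "dog"]}
node_router = {
    "root": lambda x: "animal" if x["legs"] > 0 else "vehicle",
    "animal": lambda x: "cat" if x["meows"] else "dog",
}
```

Pruning (or flattening) a node, as the meta-classifier decides, amounts to merging its children into its parent's router, shortening the path at the cost of a harder local decision.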

JMLR Journal 2003 Journal Article

Word-Sequence Kernels

  • Nicola Cancedda
  • Eric Gaussier
  • Cyril Goutte
  • Jean-Michel Renders

We address the problem of categorising documents using kernel-based methods such as Support Vector Machines. Since the work of Joachims (1998), there is ample experimental evidence that SVMs using standard word frequencies as features yield state-of-the-art performance on a number of benchmark problems. Recently, Lodhi et al. (2002) proposed the use of string kernels, a novel way of computing document similarity based on matching non-consecutive subsequences of characters. In this article, we propose the use of this technique with sequences of words rather than characters. This approach has several advantages, in particular it is more efficient computationally and it ties in closely with standard linguistic pre-processing techniques. We present some extensions to sequence kernels dealing with symbol-dependent and match-dependent decay factors, and present empirical evaluations of these extensions on the Reuters-21578 dataset.
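
The basic gap-weighted word-sequence kernel counts common (possibly non-contiguous) subsequences of n words, down-weighted by a decay factor λ raised to the total span of the match. A brute-force sketch of that basic kernel (practical implementations use the dynamic programming of Lodhi et al.; the paper's symbol- and match-dependent decay extensions are not shown):

```python
from itertools import combinations

def word_seq_kernel(s, t, n, lam):
    """K_n(s, t) = Σ over matching length-n word subsequences of
    lam ** (span in s + span in t). Non-contiguous matches are allowed
    but penalized: the larger the gaps, the larger the span and the
    smaller the contribution."""
    total = 0.0
    for i in combinations(range(len(s)), n):          # index tuples in s
        for j in combinations(range(len(t)), n):      # index tuples in t
            if all(s[a] == t[b] for a, b in zip(i, j)):
                span = (i[-1] - i[0] + 1) + (j[-1] - j[0] + 1)
                total += lam ** span
    return total
```

This brute force is exponential in n and sequence length; it is meant only to make the kernel's definition concrete, and to show why working over words rather than characters (far shorter sequences) already helps computationally.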