Author name cluster

Guillaume Staerman

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers

2 author rows

TMLR Journal 2024 Journal Article

A Pseudo-Metric between Probability Distributions based on Depth-Trimmed Regions

Guillaume Staerman
Pavlo Mozharovskyi
Pierre Colombo
Stephan Clémençon
Florence d'Alché-Buc

The design of a metric between probability distributions is a longstanding problem motivated by numerous applications in machine learning. Focusing on continuous probability distributions in the Euclidean space $\mathbb{R}^d$, we introduce a novel pseudo-metric between probability distributions by leveraging the extension of univariate quantiles to multivariate spaces. Data depth is a nonparametric statistical tool that measures the centrality of any element $x\in\mathbb{R}^d$ with respect to (w.r.t.) a probability distribution or a dataset. It is a natural median-oriented extension of the cumulative distribution function (cdf) to the multivariate case. Thus, its upper-level sets---the depth-trimmed regions---give rise to a definition of multivariate quantiles. The new pseudo-metric relies on the average of the Hausdorff distance between the depth-based quantile regions for each distribution. Its good behavior regarding major transformation groups, as well as its ability to factor out translations, are depicted. Robustness, an appealing feature of this pseudo-metric, is studied through the finite sample breakdown point. Moreover, we propose an efficient approximation method with linear time complexity w.r.t. the size of the dataset and its dimension. The quality of this approximation and the performance of the proposed approach are illustrated in numerical experiments.

PDF Details

AAAI Conference 2024 Conference Paper

Unsupervised Layer-Wise Score Aggregation for Textual OOD Detection

Maxime Darrin
Guillaume Staerman
Eduardo Dadalto Camara Gomes
Jackie C. K. Cheung
Pablo Piantanida
Pierre Colombo

Out-of-distribution (OOD) detection is a rapidly growing field due to new robustness and security requirements driven by an increased number of AI-based systems. Existing OOD textual detectors often rely on anomaly scores (\textit{e.g.}, Mahalanobis distance) computed on the embedding output of the last layer of the encoder. In this work, we observe that OOD detection performance varies greatly depending on the task and layer output. More importantly, we show that the usual choice (the last layer) is rarely the best one for OOD detection and that far better results can be achieved, provided that an oracle selects the best layer. We propose a data-driven, unsupervised method to leverage this observation to combine layer-wise anomaly scores. In addition, we extend classical textual OOD benchmarks by including classification tasks with a more significant number of classes (up to 150), which reflects more realistic settings. On this augmented benchmark, we show that the proposed post-aggregation methods achieve robust and consistent results comparable to using the best layer according to an oracle while removing manual feature selection altogether.

PDF Details DOI

TMLR Journal 2023 Journal Article

A Halfspace-Mass Depth-Based Method for Adversarial Attack Detection

Marine Picot
Federica Granese
Guillaume Staerman
Marco Romanelli
Francisco Messina
Pablo Piantanida
Pierre Colombo

Despite the widespread use of deep learning algorithms, vulnerability to adversarial attacks is still an issue limiting their use in critical applications. Detecting these attacks is thus crucial to build reliable algorithms and has received increasing attention in the last few years. In this paper, we introduce the HalfspAce Mass dePth dEtectoR (HAMPER), a new method to detect adversarial examples by leveraging the concept of data depths, a statistical notion that provides center-outward ordering of points with respect to (w.r.t.) a probability distribution. In particular, the halfspace-mass (HM) depth exhibits attractive properties such as computational efficiency, which makes it a natural candidate for adversarial attack detection in high-dimensional spaces. Additionally, HM is non differentiable making it harder for attackers to directly attack HAMPER via gradient based-methods. We evaluate HAMPER in the context of supervised adversarial attacks detection across four benchmark datasets. Overall, we empirically show that HAMPER consistently outperforms SOTA methods. In particular, the gains are 13.1% (29.0%) in terms of AUROC (resp. FPR) on SVHN, 14.6% (25.7%) on CIFAR10 and 22.6% (49.0%) on CIFAR100 compared to the best performing method.

PDF Details

ICML Conference 2023 Conference Paper

FaDIn: Fast Discretized Inference for Hawkes Processes with General Parametric Kernels

Guillaume Staerman
Cédric Allain
Alexandre Gramfort
Thomas Moreau 0001

Temporal point processes (TPP) are a natural tool for modeling event-based data. Among all TPP models, Hawkes processes have proven to be the most widely used, mainly due to their adequate modeling for various applications, particularly when considering exponential or non-parametric kernels. Although non-parametric kernels are an option, such models require large datasets. While exponential kernels are more data efficient and relevant for specific applications where events immediately trigger more events, they are ill-suited for applications where latencies need to be estimated, such as in neuroscience. This work aims to offer an efficient solution to TPP inference using general parametric kernels with finite support. The developed solution consists of a fast $\ell_2$ gradient-based solver leveraging a discretized version of the events. After theoretically supporting the use of discretization, the statistical and computational efficiency of the novel approach is demonstrated through various numerical experiments. Finally, the method’s effectiveness is evaluated by modeling the occurrence of stimuli-induced patterns from brain signals recorded with magnetoencephalography (MEG). Given the use of general parametric kernels, results show that the proposed approach leads to an improved estimation of pattern latency than the state-of-the-art.

Details

ICML Conference 2023 Conference Paper

Hypothesis Transfer Learning with Surrogate Classification Losses: Generalization Bounds through Algorithmic Stability

Anass Aghbalou
Guillaume Staerman

Hypothesis transfer learning (HTL) contrasts domain adaptation by allowing for a previous task leverage, named the source, into a new one, the target, without requiring access to the source data. Indeed, HTL relies only on a hypothesis learnt from such source data, relieving the hurdle of expansive data storage and providing great practical benefits. Hence, HTL is highly beneficial for real-world applications relying on big data. The analysis of such a method from a theoretical perspective faces multiple challenges, particularly in classification tasks. This paper deals with this problem by studying the learning theory of HTL through algorithmic stability, an attractive theoretical framework for machine learning algorithms analysis. In particular, we are interested in the statistical behavior of the regularized empirical risk minimizers in the case of binary classification. Our stability analysis provides learning guarantees under mild assumptions. Consequently, we derive several complexity-free generalization bounds for essential statistical quantities like the training error, the excess risk and cross-validation estimates. These refined bounds allow understanding the benefits of transfer learning and comparing the behavior of standard losses in different scenarios, leading to valuable insights for practitioners.

Details

NeurIPS Conference 2022 Conference Paper

Beyond Mahalanobis Distance for Textual OOD Detection

Pierre Colombo
Eduardo Dadalto
Guillaume Staerman
Nathan Noiry
Pablo Piantanida

As the number of AI systems keeps growing, it is fundamental to implement and develop efficient control mechanisms to ensure the safe and proper functioning of machine learning (ML) systems. Reliable out-of-distribution (OOD) detection aims to detect test samples that are statistically far from the training distribution, as they might cause failures of in-production systems. In this paper, we propose a new detector called TRUSTED. Different from previous works, TRUSTED key components (i) include a novel OOD score relying on the concept of statistical data depth, (ii) rely on the idea’s full potential that all hidden layers of the network carry information regarding OOD. Our extensive experiments, comparing over 51k model configurations including different checkpoints, seed and various datasets, demonstrate that TRUSTED achieve state-of-the-art performances by producing an improvement of over 3 AUROC points.

PDF Details

ICML Conference 2021 Conference Paper

Generalization Bounds in the Presence of Outliers: a Median-of-Means Study

Pierre Laforgue
Guillaume Staerman
Stéphan Clémençon

In contrast to the empirical mean, the Median-of-Means (MoM) is an estimator of the mean $\theta$ of a square integrable r. v. Z, around which accurate nonasymptotic confidence bounds can be built, even when Z does not exhibit a sub-Gaussian tail behavior. Thanks to the high confidence it achieves on heavy-tailed data, MoM has found various applications in machine learning, where it is used to design training procedures that are not sensitive to atypical observations. More recently, a new line of work is now trying to characterize and leverage MoM’s ability to deal with corrupted data. In this context, the present work proposes a general study of MoM’s concentration properties under the contamination regime, that provides a clear understanding on the impact of the outlier proportion and the number of blocks chosen. The analysis is extended to (multisample) U-statistics, i. e. averages over tuples of observations, that raise additional challenges due to the dependence induced. Finally, we show that the latter bounds can be used in a straightforward fashion to derive generalization guarantees for pairwise learning in a contaminated setting, and propose an algorithm to compute provably reliable decision functions.

Details