Arrow Research search

Author name cluster

Ismail Ben Ayed

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

15 papers
2 author rows

Possible papers

15

TMLR Journal 2026 Journal Article

Are foundation models for computer vision good conformal predictors?

  • Leo Fillioux
  • Julio Silva-Rodríguez
  • Ismail Ben Ayed
  • Paul-Henry Cournède
  • Maria Vakalopoulou
  • Stergios Christodoulidis
  • Jose Dolz

Recent advances in self-supervision and contrastive learning have brought the performance of foundation models to unprecedented levels in a variety of tasks. Fueled by this progress, these models are becoming the prevailing approach for a wide array of real-world vision problems, including risk-sensitive and high-stakes applications. However, ensuring safe deployment in these scenarios requires a more comprehensive understanding of their uncertainty modeling capabilities, which has received little attention. In this work, we delve into the behaviour of vision and vision-language foundation models under Conformal Prediction (CP), a statistical framework that provides theoretical guarantees of marginal coverage of the true class. Across extensive experiments including popular vision classification benchmarks, well-known foundation vision models, and three CP methods, our findings reveal that foundation models are well-suited for conformalization procedures, particularly those integrating Vision Transformers. We also show that calibrating the confidence predictions of these models, a popular strategy to improve their uncertainty quantification, actually leads to efficiency degradation of the conformal set on adaptive CP methods. Furthermore, few-shot adaptation of Vision-Language Models (VLMs) to downstream tasks, whose popularity is surging, enhances conformal scores compared to zero-shot predictions. Last, our empirical study exposes APS as particularly promising in the context of vision foundation models, as it does not violate the marginal coverage guarantees across multiple challenging, yet realistic scenarios.
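The marginal coverage guarantee the abstract refers to can be illustrated with a minimal split-conformal sketch (hedged: the data, the LAC-style score `1 - p_true`, and the variable names are illustrative assumptions, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)
n_cal, n_classes, alpha = 500, 10, 0.1

# Hypothetical calibration data: logits with the true class made likelier.
logits = rng.normal(size=(n_cal, n_classes))
labels = rng.integers(0, n_classes, size=n_cal)
logits[np.arange(n_cal), labels] += 2.0
probs = np.exp(logits) / np.exp(logits).sum(1, keepdims=True)

# Nonconformity score: 1 - softmax probability of the true class.
scores = 1.0 - probs[np.arange(n_cal), labels]

# Finite-sample-corrected quantile yields the marginal coverage guarantee.
q_level = np.ceil((n_cal + 1) * (1 - alpha)) / n_cal
qhat = np.quantile(scores, q_level, method="higher")

def prediction_set(p):
    """Classes whose nonconformity score falls under the threshold."""
    return np.flatnonzero(1.0 - p <= qhat)
```

Adaptive methods such as APS, highlighted in the paper, replace this score with the cumulative sorted probability mass, but the calibration step is the same.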

TMLR Journal 2026 Journal Article

Language-Aware Information Maximization for Transductive Few-Shot CLIP

  • Ghassen Baklouti
  • Maxime Zanella
  • Ismail Ben Ayed

Transductive few-shot learning has triggered an abundant literature focusing on vision-only models, but is still at a nascent stage within the recent context of foundational vision-language models (VLMs). Only a few recent methods addressed the problem, pointing to the potential of transduction in VLMs and to the need for VLM-tailored methods. Building on this momentum, we leverage information-theoretic concepts and recent progress in parameter-efficient fine-tuning (PEFT), developing a highly competitive transductive few-shot CLIP method. Specifically, we introduce a novel Language-aware Information MaximizatiOn (LIMO) loss integrating three complementary terms: (i) the mutual information between the vision inputs and the textual class descriptions; (ii) a Kullback-Leibler (KL) divergence penalizing deviation of the network's probabilistic outputs from the text-driven zero-shot predictions; and (iii) a standard cross-entropy loss based on the labeled shots. Furthermore, we challenge the commonly followed fine-tuning practices in the context of transductive few-shot learning, and explore PEFT strategies, completely overlooked in this context. Surprisingly, we observe substantial boosts in performances, which points to the potential of adapting a subset of the model's parameters in the transductive few-shot setting. We report comprehensive evaluations, which show that LIMO outperforms the very recent transductive few-shot CLIP methods by a large margin and yields significant gains over the best-performing inductive methods. We will publicly release our code.
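The three-term objective described above can be sketched numerically (hedged: the function name `limo_loss`, the weights, and the sign conventions are illustrative assumptions based only on the abstract, not the released code):

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy of probability rows."""
    return -(p * np.log(p + eps)).sum(axis=-1)

def limo_loss(p_query, p_zeroshot, p_support, y_support,
              lam_kl=1.0, lam_ce=1.0, eps=1e-12):
    """Hedged sketch of a LIMO-style loss (to be minimized):
    (i) mutual information between inputs and predicted classes,
    (ii) KL divergence to the text-driven zero-shot predictions,
    (iii) cross-entropy on the labeled support shots."""
    marginal = p_query.mean(axis=0)
    mutual_info = entropy(marginal) - entropy(p_query).mean()
    kl = (p_query * (np.log(p_query + eps)
                     - np.log(p_zeroshot + eps))).sum(1).mean()
    ce = -np.log(p_support[np.arange(len(y_support)), y_support] + eps).mean()
    return -mutual_info + lam_kl * kl + lam_ce * ce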

TMLR Journal 2025 Journal Article

A Strong Baseline for Molecular Few-Shot Learning

  • Philippe Formont
  • Hugo Jeannin
  • Pablo Piantanida
  • Ismail Ben Ayed

Few-shot learning has recently attracted significant interest in drug discovery, with a recent, fast-growing literature mostly involving convoluted meta-learning strategies. We revisit the more straightforward fine-tuning approach for molecular data, and propose a regularized quadratic-probe loss based on the Mahalanobis distance. We design a dedicated block-coordinate descent optimizer, which avoids the degenerate solutions of our loss. Interestingly, our simple fine-tuning approach achieves highly competitive performances in comparison to state-of-the-art methods, while being applicable to black-box settings and removing the need for specific episodic pre-training strategies. Furthermore, we introduce a new benchmark to assess the robustness of the competing methods to domain shifts. In this setting, our fine-tuning baseline obtains consistently better results than meta-learning methods.
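A quadratic probe built on the Mahalanobis distance can be sketched as follows (hedged: this is a generic class-mean classifier with a shared regularized covariance, assumed for illustration; the paper's actual loss and regularization differ in detail):

```python
import numpy as np

def mahalanobis_logits(X, means, cov, reg=1e-3):
    """Hedged sketch of a quadratic probe: the negative squared Mahalanobis
    distance of each sample to each class mean, under a shared, regularized
    covariance. Higher logit = closer to the class."""
    prec = np.linalg.inv(cov + reg * np.eye(cov.shape[0]))
    diffs = X[:, None, :] - means[None, :, :]             # (n, k, d)
    d2 = np.einsum('nkd,de,nke->nk', diffs, prec, diffs)  # squared distances
    return -d2
```

Because it only needs feature vectors, such a probe is compatible with the black-box setting the abstract mentions.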

AAAI Conference 2025 Conference Paper

AttackBench: Evaluating Gradient-based Attacks for Adversarial Examples

  • Antonio Emanuele Cinà
  • Jérôme Rony
  • Maura Pintor
  • Luca Demetrio
  • Ambra Demontis
  • Battista Biggio
  • Ismail Ben Ayed
  • Fabio Roli

While novel gradient-based attacks are continuously proposed to improve the optimization of adversarial examples, each is shown to outperform its predecessors using different experimental setups, implementations, and computational budgets, leading to biased and unfair comparisons. In this work, we overcome this issue by proposing AttackBench, an attack evaluation framework that evaluates the effectiveness of each attack (along with its different library implementations) under the same maximum available computational budget. To this end, we (i) define a novel optimality metric that quantifies how close each attack is to the optimal solution (empirically estimated by ensembling all attacks), and (ii) limit the maximum number of forward and backward queries that each attack can execute on the target model. Our extensive experimental analysis compares more than 100 attack implementations over 800 different configurations, considering both CIFAR-10 and ImageNet models, and shows that only a few attack implementations outperform all the remaining approaches. These findings suggest that novel defenses should be evaluated against different attacks than those normally used in the literature to avoid overly-optimistic robustness evaluations. We release AttackBench as a publicly available benchmark, including a continuously updated leaderboard and source code, to maintain an up-to-date ranking of the best gradient-based attacks.
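The query-budget constraint in point (ii) can be sketched with a simple counter (hedged: the class name and interface are hypothetical; AttackBench's actual instrumentation wraps model calls in its own framework):

```python
class QueryBudget:
    """Hedged sketch of a per-attack budget tracker: caps the number of
    forward and backward model queries an attack may spend, so all attacks
    are compared under the same computational budget."""

    def __init__(self, max_forward, max_backward):
        self.max_forward, self.max_backward = max_forward, max_backward
        self.forward = self.backward = 0

    def spend_forward(self, n=1):
        if self.forward + n > self.max_forward:
            raise RuntimeError("forward budget exhausted")
        self.forward += n

    def spend_backward(self, n=1):
        if self.backward + n > self.max_backward:
            raise RuntimeError("backward budget exhausted")
        self.backward += n
```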

TMLR Journal 2025 Journal Article

GeoMask3D: Geometrically Informed Mask Selection for Self-Supervised Point Cloud Learning in 3D

  • Ali Bahri
  • Moslem Yazdanpanah
  • Mehrdad Noori
  • Milad Cheraghalikhani
  • Gustavo Adolfo Vargas Hakim
  • David OSOWIECHI
  • Farzad Beizaee
  • Ismail Ben Ayed

We introduce a novel approach to self-supervised learning for point clouds, employing a geometrically informed mask selection strategy called GeoMask3D (GM3D) to boost the efficiency of Masked Auto Encoders (MAE). Unlike the conventional method of random masking, our technique utilizes a teacher-student model to focus on intricate areas within the data, guiding the model’s focus toward regions with higher geometric complexity. This strategy is grounded in the hypothesis that concentrating on harder patches yields a more robust feature representation, as evidenced by the improved performance on downstream tasks. Our method also presents a feature-level knowledge distillation technique designed to guide the prediction of geometric complexity, which utilizes a comprehensive context from feature-level information. Extensive experiments confirm our method’s superiority over State-Of-The-Art (SOTA) baselines, demonstrating marked improvements in classification, segmentation, and few-shot tasks.
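The geometrically informed masking strategy can be sketched in a few lines (hedged: the function name and the scalar "complexity" interface are illustrative assumptions; in the paper the scores come from a teacher-student model over point-cloud patches):

```python
import numpy as np

def select_hard_patches(complexity, mask_ratio=0.6):
    """Hedged sketch of geometrically informed masking: mask the patches
    scored as most geometrically complex (the 'hard' regions), rather than
    masking uniformly at random as a standard MAE would."""
    n_mask = int(round(len(complexity) * mask_ratio))
    return np.argsort(complexity)[::-1][:n_mask]   # indices of hardest patches
```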

ICML Conference 2025 Conference Paper

SMART-PC: Skeletal Model Adaptation for Robust Test-Time Training in Point Clouds

  • Ali Bahri
  • Moslem Yazdanpanah
  • Sahar Dastani
  • Mehrdad Noori
  • Gustavo Adolfo Vargas Hakim
  • David Osowiechi
  • Farzad Beizaee
  • Ismail Ben Ayed

Test-Time Training has emerged as a promising solution to address distribution shifts in 3D point cloud classification. However, existing methods often rely on computationally expensive backpropagation during adaptation, limiting their applicability in real-world, time-sensitive scenarios. In this paper, we introduce SMART-PC, a skeleton-based framework that enhances resilience to corruptions by leveraging the geometric structure of 3D point clouds. During pre-training, our method predicts skeletal representations, enabling the model to extract robust and meaningful geometric features that are less sensitive to corruptions, thereby improving adaptability to test-time distribution shifts. Unlike prior approaches, SMART-PC achieves real-time adaptation by eliminating backpropagation and updating only BatchNorm statistics, resulting in a lightweight and efficient framework capable of achieving high frame-per-second rates while maintaining superior classification performance. Extensive experiments on benchmark datasets, including ModelNet40-C, ShapeNet-C, and ScanObjectNN-C, demonstrate that SMART-PC achieves state-of-the-art results, outperforming existing methods such as MATE in terms of both accuracy and computational efficiency. The implementation is available at: https://github.com/AliBahri94/SMART-PC.
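The backpropagation-free adaptation step (updating only BatchNorm statistics) can be sketched as an exponential-moving-average refresh (hedged: the function signature and momentum value are illustrative assumptions; the paper applies this per BatchNorm layer inside the network):

```python
import numpy as np

def update_bn_stats(running_mean, running_var, batch, momentum=0.1):
    """Hedged sketch of backprop-free test-time adaptation: only the
    BatchNorm running statistics are refreshed from the incoming test batch;
    all learned weights stay frozen."""
    batch_mean, batch_var = batch.mean(axis=0), batch.var(axis=0)
    new_mean = (1 - momentum) * running_mean + momentum * batch_mean
    new_var = (1 - momentum) * running_var + momentum * batch_var
    return new_mean, new_var
```

Because this involves only per-feature means and variances, it costs a fraction of a gradient step, which is what enables the high frame rates the abstract reports.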

TMLR Journal 2024 Journal Article

Do not trust what you trust: Miscalibration in Semisupervised Learning

  • Shambhavi Mishra
  • Balamurali Murugesan
  • Ismail Ben Ayed
  • Marco Pedersoli
  • Jose Dolz

State-of-the-art semi-supervised learning (SSL) approaches rely on highly confident predictions to serve as pseudo-labels that guide the training on unlabeled samples. An inherent drawback of this strategy stems from the quality of the uncertainty estimates, as pseudo-labels are filtered only based on their degree of uncertainty, regardless of the correctness of their predictions. Thus, assessing and enhancing the uncertainty of network predictions is of paramount importance in the pseudo-labeling process. In this work, we empirically demonstrate that SSL methods based on pseudo-labels are significantly miscalibrated, and formally demonstrate that minimization of the min-entropy, a lower bound of the Shannon entropy, is a potential cause of miscalibration. To alleviate this issue, we integrate a simple penalty term, which enforces the logit distances of the predictions on unlabeled samples to remain low, preventing the network predictions from becoming overconfident. Comprehensive experiments on a variety of SSL image classification benchmarks demonstrate that the proposed solution systematically improves the calibration performance of relevant SSL models, while also enhancing their discriminative power, making it an appealing addition for tackling SSL tasks.
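The logit-distance penalty can be sketched as follows (hedged: the exact distance used in the paper may differ; here it is assumed, for illustration, to be the average gap between the top logit and the rest):

```python
import numpy as np

def logit_distance_penalty(logits):
    """Hedged sketch of the calibration penalty: the average gap between
    the largest logit and the remaining logits. Keeping this low discourages
    overconfident predictions on unlabeled samples."""
    top = logits.max(axis=1, keepdims=True)
    return (top - logits).mean()
```

Overconfident predictions inflate this gap, so adding the penalty to the SSL loss pushes the network toward better-calibrated outputs.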

TMLR Journal 2024 Journal Article

Training Graph Neural Networks Subject to a Tight Lipschitz Constraint

  • Simona Ioana Juvina
  • Ana Antonia Neacșu
  • Jérôme Rony
  • Jean-Christophe Pesquet
  • Corneliu Burileanu
  • Ismail Ben Ayed

We propose a strategy for training a wide range of graph neural networks (GNNs) under tight Lipschitz bound constraints. Specifically, by leveraging graph spectral theory, we derive computationally tractable expressions of a tight Lipschitz constant. This allows us to propose a constrained-optimization approach to control the constant, ensuring robustness to adversarial perturbations. Unlike the existing methods for controlling the Lipschitz constant, our approach reduces the size of the handled matrices by a factor equal to the square of the number of nodes in the graph. We employ a stochastic projected subgradient algorithm, which operates in a block-coordinate manner, with the projection step performed via an accelerated iterative proximal algorithm. We focus on defending against attacks that perturb features while keeping the topology of the graph constant. This contrasts with most of the existing defenses, which tackle perturbations of the graph structure. We report experiments on various datasets in the context of node classification tasks, showing the effectiveness of our constrained GNN model.
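A generic projection step for enforcing a Lipschitz bound on a linear layer looks like this (hedged: this clips singular values of a plain weight matrix and is NOT the paper's tight graph-spectral bound or its accelerated proximal projection; it only illustrates the constrained-optimization idea):

```python
import numpy as np

def project_spectral_norm(W, bound):
    """Hedged, generic sketch of a Lipschitz projection: clip the singular
    values of a weight matrix so its spectral norm stays within the bound,
    yielding the closest (Frobenius) matrix satisfying the constraint."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(np.minimum(s, bound)) @ Vt
```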

NeurIPS Conference 2021 Conference Paper

Realistic evaluation of transductive few-shot learning

  • Olivier Veilleux
  • Malik Boudiaf
  • Pablo Piantanida
  • Ismail Ben Ayed

Transductive inference is widely used in few-shot learning, as it leverages the statistics of the unlabeled query set of a few-shot task, typically yielding substantially better performances than its inductive counterpart. The current few-shot benchmarks use perfectly class-balanced tasks at inference. We argue that such an artificial regularity is unrealistic, as it assumes that the marginal label probability of the testing samples is known and fixed to the uniform distribution. In fact, in realistic scenarios, the unlabeled query sets come with arbitrary and unknown label marginals. We introduce and study the effect of arbitrary class distributions within the query sets of few-shot tasks at inference, removing the class-balance artefact. Specifically, we model the marginal probabilities of the classes as Dirichlet-distributed random variables, which yields a principled and realistic sampling within the simplex. This leverages the current few-shot benchmarks, building testing tasks with arbitrary class distributions. We evaluate experimentally state-of-the-art transductive methods over 3 widely used data sets, and observe, surprisingly, substantial performance drops, even below inductive methods in some cases. Furthermore, we propose a generalization of the mutual-information loss, based on α-divergences, which can handle effectively class-distribution variations. Empirically, we show that our transductive α-divergence optimization outperforms state-of-the-art methods across several data sets, models and few-shot settings.
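The Dirichlet-based task sampling described above can be sketched directly (hedged: the function name and the concentration value are illustrative assumptions; the paper's protocol fixes its own Dirichlet parameter):

```python
import numpy as np

def sample_query_labels(n_query, n_classes, concentration=2.0, seed=0):
    """Hedged sketch of the realistic sampling protocol: draw the query
    set's class marginals from a Dirichlet distribution over the simplex,
    then sample labels from those marginals, removing the class-balance
    artefact of standard benchmarks."""
    rng = np.random.default_rng(seed)
    marginals = rng.dirichlet(np.full(n_classes, concentration))
    labels = rng.choice(n_classes, size=n_query, p=marginals)
    return labels, marginals
```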

AAAI Conference 2021 Conference Paper

Variational Fair Clustering

  • Imtiaz Masud Ziko
  • Jing Yuan
  • Eric Granger
  • Ismail Ben Ayed

We propose a general variational framework of fair clustering, which integrates an original Kullback-Leibler (KL) fairness term with a large class of clustering objectives, including prototype- and graph-based ones. Fundamentally different from the existing combinatorial and spectral solutions, our variational multi-term approach enables control over the trade-off levels between the fairness and clustering objectives. We derive a general tight upper bound based on a concave-convex decomposition of our fairness term, its Lipschitz-gradient property and Pinsker's inequality. Our tight upper bound can be jointly optimized with various clustering objectives, while yielding a scalable solution, with convergence guarantee. Interestingly, at each iteration, it performs an independent update for each assignment variable. Therefore, it can be easily distributed for large-scale datasets. This scalability is important as it makes it possible to explore different trade-off levels between the fairness and clustering objectives. Unlike spectral relaxation, our formulation does not require computing an eigenvalue decomposition. We report comprehensive evaluations and comparisons with state-of-the-art methods over various fair clustering benchmarks, which show that our variational formulation can yield highly competitive solutions in terms of fairness and clustering objectives.
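The KL fairness term can be sketched as a divergence between target demographic proportions and the group proportions actually realized in each cluster (hedged: the function signature and the cluster-averaged form are illustrative assumptions based only on the abstract):

```python
import numpy as np

def kl_fairness(assign, group_onehot, target, eps=1e-12):
    """Hedged sketch of a KL fairness term: divergence between the target
    demographic proportions and each cluster's group proportions, averaged
    over clusters.
    assign: (n, K) soft cluster assignments;
    group_onehot: (n, G) demographic-group indicators;
    target: length-G target proportions."""
    mass = group_onehot.T @ assign                       # (G, K) group mass per cluster
    props = mass / (mass.sum(axis=0, keepdims=True) + eps)
    kl = (target[:, None] * (np.log(target[:, None] + eps)
                             - np.log(props + eps))).sum(axis=0)
    return float(kl.mean())
```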

NeurIPS Conference 2020 Conference Paper

Information Maximization for Few-Shot Learning

  • Malik Boudiaf
  • Imtiaz Ziko
  • Jérôme Rony
  • Jose Dolz
  • Pablo Piantanida
  • Ismail Ben Ayed

We introduce Transductive Information Maximization (TIM) for few-shot learning. Our method maximizes the mutual information between the query features and their label predictions for a given few-shot task, in conjunction with a supervision loss based on the support set. Furthermore, we propose a new alternating-direction solver for our mutual-information loss, which substantially speeds up transductive inference convergence over gradient-based optimization, while yielding similar accuracy. TIM inference is modular: it can be used on top of any base-training feature extractor. Following standard transductive few-shot settings, our comprehensive experiments demonstrate that TIM outperforms state-of-the-art methods significantly across various datasets and networks, while used on top of a fixed feature extractor trained with simple cross-entropy on the base classes, without resorting to complex meta-learning schemes. It consistently brings between 2% and 5% improvement in accuracy over the best performing method, not only on all the well-established few-shot benchmarks but also on more challenging scenarios, with domain shifts and larger numbers of classes.
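The mutual-information part of the objective decomposes as the entropy of the predicted class marginal minus the mean conditional entropy, which can be sketched as follows (hedged: the function name, the MI weight, and sign conventions are illustrative assumptions, not the paper's released code):

```python
import numpy as np

def tim_objective(p_query, p_support, y_support, mi_weight=1.0, eps=1e-12):
    """Hedged sketch of a TIM-style objective (to be minimized): weighted
    mutual information I(X;Y) = H(Y) - H(Y|X) on query predictions, plus
    cross-entropy supervision on the support set."""
    marginal = p_query.mean(axis=0)
    h_marginal = -(marginal * np.log(marginal + eps)).sum()
    h_conditional = -(p_query * np.log(p_query + eps)).sum(1).mean()
    mutual_info = h_marginal - h_conditional
    ce = -np.log(p_support[np.arange(len(y_support)), y_support] + eps).mean()
    return -mi_weight * mutual_info + ce
```

Maximizing the marginal entropy spreads predictions across classes, while minimizing the conditional entropy makes each individual prediction confident; the paper's alternating-direction solver optimizes this trade-off without gradient descent.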

ICML Conference 2020 Conference Paper

Laplacian Regularized Few-Shot Learning

  • Imtiaz Masud Ziko
  • Jose Dolz
  • Eric Granger
  • Ismail Ben Ayed

We propose a transductive Laplacian-regularized inference for few-shot tasks. Given any feature embedding learned from the base classes, we minimize a quadratic binary-assignment function containing two terms: (1) a unary term assigning query samples to the nearest class prototype, and (2) a pairwise Laplacian term encouraging nearby query samples to have consistent label assignments. Our transductive inference does not re-train the base model, and can be viewed as a graph clustering of the query set, subject to supervision constraints from the support set. We derive a computationally efficient bound optimizer of a relaxation of our function, which computes independent (parallel) updates for each query sample, while guaranteeing convergence. Following a simple cross-entropy training on the base classes, and without complex meta-learning strategies, we conducted comprehensive experiments over five few-shot learning benchmarks. Our LaplacianShot consistently outperforms state-of-the-art methods by significant margins across different models, settings, and data sets. Furthermore, our transductive inference is very fast, with computational times that are close to inductive inference, and can be used for large-scale few-shot tasks.
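One parallel bound-optimization update of the kind described above can be sketched as follows (hedged: the softmax form combining the unary and Laplacian terms is an illustrative assumption; the paper derives the exact update from its relaxation):

```python
import numpy as np

def laplacian_shot_update(unary, W, assign, lam=1.0):
    """Hedged sketch of one parallel bound-optimization step: every query's
    soft assignment is refreshed at once from its unary prototype distances
    (unary: n x K) and the Laplacian term aggregating its neighbours'
    current assignments (W: n x n affinities)."""
    logits = -unary + lam * (W @ assign)
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    new = np.exp(logits)
    return new / new.sum(axis=1, keepdims=True)
```

Each row is updated independently of the others within an iteration, which is what makes the inference fast and parallelizable.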

NeurIPS Conference 2018 Conference Paper

Scalable Laplacian K-modes

  • Imtiaz Ziko
  • Eric Granger
  • Ismail Ben Ayed

We advocate Laplacian K-modes for joint clustering and density mode finding, and propose a concave-convex relaxation of the problem, which yields a parallel algorithm that scales up to large datasets and high dimensions. We optimize a tight bound (auxiliary function) of our relaxation, which, at each iteration, amounts to computing an independent update for each cluster-assignment variable, with guaranteed convergence. Therefore, our bound optimizer can be trivially distributed for large-scale data sets. Furthermore, we show that the density modes can be obtained as byproducts of the assignment variables via simple maximum-value operations whose additional computational cost is linear in the number of data points. Our formulation does not need storing a full affinity matrix and computing its eigenvalue decomposition, neither does it perform expensive projection steps and Lagrangian-dual inner iterates for the simplex constraints of each point. Furthermore, unlike mean-shift, our density-mode estimation does not require inner-loop gradient-ascent iterates. It has a complexity independent of feature-space dimension, yields modes that are valid data points in the input set and is applicable to discrete domains as well as arbitrary kernels. We report comprehensive experiments over various data sets, which show that our algorithm yields very competitive performances in terms of optimization quality (i.e., the value of the discrete-variable objective at convergence) and clustering accuracy.
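The "modes as byproducts" step is a simple maximum-value operation over the assignment variables (hedged: the function signature is illustrative; in the paper this reads the converged soft assignments of the bound optimizer):

```python
import numpy as np

def density_modes(assign, X):
    """Hedged sketch of the mode-as-byproduct step: each cluster's density
    mode is the input point carrying the largest assignment value for that
    cluster, so the mode is always a valid data point of the input set.
    assign: (n, K) soft assignments; X: (n, d) data points."""
    return X[assign.argmax(axis=0)]
```

This costs one pass over the assignment columns, i.e., linear in the number of data points, as stated in the abstract.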