Arrow Research search

Author name cluster

Daniel Rueckert

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

66 papers
2 author rows

Possible papers (66)

NeurIPS Conference 2025 Conference Paper

Gradient-Weight Alignment as a Train-Time Proxy for Generalization in Classification Tasks

  • Florian Hölzl
  • Daniel Rueckert
  • Georgios Kaissis

Robust validation metrics remain essential in contemporary deep learning, not only to detect overfitting and poor generalization, but also to monitor training dynamics. In the supervised classification setting, we investigate whether interactions between training data and model weights can yield such a metric that both tracks generalization during training and attributes performance to individual training samples. We introduce Gradient-Weight Alignment (GWA), which quantifies the coherence between per-sample gradients and model weights. We show that effective learning corresponds to coherent alignment, while misalignment indicates deteriorating generalization. GWA is efficiently computable during training and reflects both sample-specific contributions and dataset-wide learning dynamics. Extensive experiments show that GWA accurately predicts optimal early stopping, enables principled model comparisons, and identifies influential training samples, providing a validation-set-free approach for model analysis directly from the training data.
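One plausible instantiation of a gradient-weight alignment score is the cosine similarity between each per-sample gradient and the flattened model weights; this is an illustrative sketch, not necessarily the paper's exact definition:

```python
import numpy as np

def gradient_weight_alignment(per_sample_grads, weights):
    """Cosine similarity between each per-sample gradient and the flattened
    model weights -- one plausible reading of a gradient-weight alignment
    score (the paper's exact definition may differ)."""
    w = weights.ravel()
    w_norm = np.linalg.norm(w) + 1e-12
    scores = []
    for g in per_sample_grads:
        g = g.ravel()
        scores.append(float(g @ w) / ((np.linalg.norm(g) + 1e-12) * w_norm))
    return np.array(scores)
```

Samples whose gradients point along the weights score near +1; samples pulling against them score near -1, which in this reading would flag deteriorating alignment.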

TMLR Journal 2025 Journal Article

Improved Localized Machine Unlearning Through the Lens of Memorization

  • Reihaneh Torkzadehmahani
  • Reza Nasirigerdeh
  • Georgios Kaissis
  • Daniel Rueckert
  • Gintare Karolina Dziugaite
  • Eleni Triantafillou

Machine unlearning refers to removing the influence of a specified subset of training data from a model efficiently, after it has already been trained. This is important for key applications, including making the model more accurate by removing outdated, mislabeled, or poisoned data. In this paper, we draw inspiration from prior work that attempts to identify where in the network a given example is memorized, to propose a new "localized unlearning" algorithm, Deletion by Example Localization (DEL). DEL has two components: a localization strategy that identifies critical parameters for a given set of examples, and a simple unlearning algorithm that finetunes only the critical parameters on the data we want to retain. Through extensive experiments, we find that our localization strategy outperforms prior strategies in terms of metrics of interest for unlearning and test accuracy, and pairs well with various unlearning algorithms. Our experiments on different datasets, forget sets, and metrics reveal that DEL outperforms prior work in producing better trade-offs between unlearning performance and accuracy.
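The two components of DEL can be sketched in a few lines, assuming (hypothetically) that localization ranks parameters by average gradient magnitude over the forget set; the paper's actual criterion may differ:

```python
import numpy as np

def critical_param_mask(forget_grads, k):
    """Mark the k parameters with the largest mean gradient magnitude over
    the forget-set examples -- a hypothetical stand-in for DEL's
    localization strategy."""
    importance = np.mean(np.abs(forget_grads), axis=0)
    mask = np.zeros_like(importance, dtype=bool)
    mask[np.argsort(importance)[-k:]] = True
    return mask

def masked_sgd_step(weights, retain_grad, mask, lr=0.1):
    """Finetune only the critical parameters on the retain data;
    all other parameters are left untouched."""
    return weights - lr * retain_grad * mask
```

Restricting the finetuning update to the masked parameters is what makes the unlearning "localized": the rest of the network keeps its trained values.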

ICLR Conference 2025 Conference Paper

Laplace Sample Information: Data Informativeness Through a Bayesian Lens

  • Johannes Kaiser
  • Kristian Schwethelm
  • Daniel Rueckert
  • Georgios Kaissis

Accurately estimating the informativeness of individual samples in a dataset is an important objective in deep learning, as it can guide sample selection, which can improve model efficiency and accuracy by removing redundant or potentially harmful samples. We propose $\text{\textit{Laplace Sample Information}}$ ($\mathsf{LSI}$), a measure of sample informativeness grounded in information theory and widely applicable across model architectures and learning settings. $\mathsf{LSI}$ leverages a Bayesian approximation to the weight posterior and the KL divergence to measure the change in the parameter distribution induced by a sample of interest from the dataset. We experimentally show that $\mathsf{LSI}$ is effective in ordering the data with respect to typicality, detecting mislabeled samples, measuring class-wise informativeness, and assessing dataset difficulty. We demonstrate these capabilities of $\mathsf{LSI}$ on image and text data in supervised and unsupervised settings. Moreover, we show that $\mathsf{LSI}$ can be computed efficiently through probes and transfers well to the training of large models.
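Since a Laplace approximation yields a Gaussian weight posterior, the KL divergence at the heart of the idea has a closed form. A minimal sketch for diagonal Gaussians, with the direction and details of the divergence left as an assumption:

```python
import numpy as np

def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL( N(mu_p, var_p) || N(mu_q, var_q) ) for diagonal Gaussians."""
    return 0.5 * np.sum(np.log(var_q / var_p)
                        + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)

def laplace_sample_information(mu_full, var_full, mu_loo, var_loo):
    """LSI-style score: divergence between the Laplace weight posterior
    fitted with and without the sample of interest. Illustrative only;
    the paper's precise definition may differ."""
    return gaussian_kl(mu_loo, var_loo, mu_full, var_full)
```

A sample whose removal barely shifts the posterior scores near zero; a sample that moves it substantially scores high, i.e. it is informative.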

ICLR Conference 2025 Conference Paper

SIM: Surface-based fMRI Analysis for Inter-Subject Multimodal Decoding from Movie-Watching Experiments

  • Simon Dahan
  • Gabriel Bénédict
  • Logan Zane John Williams
  • Yourong Guo
  • Daniel Rueckert
  • Robert Leech
  • Emma Claire Robinson

Current AI frameworks for brain decoding and encoding typically train and test models within the same datasets. This limits their utility for cognitive training (neurofeedback), for which it would be useful to pool experiences across individuals to better simulate stimuli not sampled during training. A key obstacle to model generalisation is the degree of inter-subject variability in cortical organisation, which makes it difficult to align or compare cortical signals across participants. In this paper we address this through the use of surface vision transformers, which build a generalisable model of cortical functional dynamics by encoding the topography of cortical networks and their interactions as a moving image across a surface. This is then combined with tri-modal self-supervised contrastive (CLIP) alignment of audio, video, and fMRI modalities to enable the retrieval of visual and auditory stimuli from patterns of cortical activity (and vice versa). We validate our approach on 7T task-fMRI data from 174 healthy participants engaged in the movie-watching experiment from the Human Connectome Project (HCP). Results show that it is possible to detect which movie clips an individual is watching purely from their brain activity, even for individuals and movies *not seen during training*. Further analysis of attention maps reveals that our model captures individual patterns of brain activity that reflect semantic and visual systems. This opens the door to future personalised simulations of brain function. Code & pre-trained models will be made available at https://github.com/metrics-lab/sim.
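The CLIP-style alignment that enables this retrieval can be sketched for a single modality pair (the paper aligns three); matching fMRI/video pairs sit on the diagonal of a similarity matrix and are pushed to score highest:

```python
import numpy as np

def clip_contrastive_loss(z_a, z_b, temperature=0.07):
    """Pairwise CLIP-style contrastive loss between two modality embeddings
    (e.g. fMRI and video clips). A one-directional sketch of the tri-modal
    objective described in the abstract."""
    # L2-normalize embeddings so the dot product is a cosine similarity
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # diagonal = matching pairs
```

Retrieval then amounts to ranking candidate clips by similarity to a held-out cortical activity embedding.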

ICLR Conference 2025 Conference Paper

Topograph: An Efficient Graph-Based Framework for Strictly Topology Preserving Image Segmentation

  • Laurin Lux
  • Alexander H. Berger
  • Alexander Weers
  • Nico Stucki
  • Daniel Rueckert
  • Ulrich Bauer
  • Johannes C. Paetzold

Topological correctness plays a critical role in many image segmentation tasks, yet most networks are trained using pixel-wise loss functions, such as Dice, neglecting topological accuracy. Existing topology-aware methods often lack robust topological guarantees, are limited to specific use cases, or impose high computational costs. In this work, we propose a novel, graph-based framework for topologically accurate image segmentation that is both computationally efficient and generally applicable. Our method constructs a component graph that fully encodes the topological information of both the prediction and ground truth, allowing us to efficiently identify topologically critical regions and aggregate a loss based on local neighborhood information. Furthermore, we introduce a strict topological metric capturing the homotopy equivalence between the union and intersection of prediction-label pairs. We formally prove the topological guarantees of our approach and empirically validate its effectiveness on binary and multi-class datasets, demonstrating state-of-the-art performance with up to fivefold faster loss computation compared to persistent homology methods.
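The simplest topological quantity behind this line of work is the number of connected components (Betti-0); comparing counts between prediction and label is a coarse check, while the paper's component graph goes further and localizes the errors. A sketch for 4-connected binary masks:

```python
import numpy as np
from collections import deque

def count_components(mask):
    """Count 4-connected foreground components in a binary mask (Betti-0)
    via breadth-first flood fill. Illustrative baseline, not the paper's
    component-graph construction."""
    seen = np.zeros_like(mask, dtype=bool)
    h, w = mask.shape
    n = 0
    for i in range(h):
        for j in range(w):
            if mask[i, j] and not seen[i, j]:
                n += 1                      # new component found
                q = deque([(i, j)])
                seen[i, j] = True
                while q:
                    a, b = q.popleft()
                    for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        x, y = a + da, b + db
                        if 0 <= x < h and 0 <= y < w and mask[x, y] and not seen[x, y]:
                            seen[x, y] = True
                            q.append((x, y))
    return n
```

A prediction that matches the label pixel-wise almost everywhere can still split one vessel into two components, which is exactly the class of error a topology-aware loss penalizes.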

TMLR Journal 2025 Journal Article

Visual Privacy Auditing with Diffusion Models

  • Kristian Schwethelm
  • Johannes Kaiser
  • Moritz Knolle
  • Sarah Lockfisch
  • Daniel Rueckert
  • Alexander Ziller

Data reconstruction attacks on machine learning models pose a substantial threat to privacy, potentially leaking sensitive information. Although defending against such attacks using differential privacy (DP) provides theoretical guarantees, determining appropriate DP parameters remains challenging. Current formal guarantees on the success of data reconstruction suffer from overly stringent assumptions regarding adversary knowledge about the target data, particularly in the image domain, raising questions about their real-world applicability. In this work, we empirically investigate this discrepancy by introducing a reconstruction attack based on diffusion models (DMs) that only assumes adversary access to real-world image priors and specifically targets the DP defense. We find that (1) real-world data priors significantly influence reconstruction success, (2) current reconstruction bounds do not model the risk posed by data priors well, and (3) DMs can serve as heuristic auditing tools for visualizing privacy leakage.

TMLR Journal 2024 Journal Article

A Survey on Graph Construction for Geometric Deep Learning in Medicine: Methods and Recommendations

  • Tamara T. Müller
  • Sophie Starck
  • Alina Dima
  • Stephan Wunderlich
  • Kyriaki-Margarita Bintsi
  • Kamilia Zaripova
  • Rickmer Braren
  • Daniel Rueckert

Graph neural networks are powerful tools that enable deep learning on non-Euclidean data structures like graphs, point clouds, and meshes. They leverage the connectivity of data points and can even benefit learning tasks on data that is not naturally graph-structured, like point clouds. In these cases, the graph structure needs to be determined from the dataset, which adds a significant challenge to the learning process. This opens up a multitude of design choices for creating suitable graph structures, which have a substantial impact on the success of the graph learning task. However, so far no concrete guidance for choosing the most appropriate graph construction is available, not only due to the large variety of methods but also because of the strong dependence of the choice on the dataset at hand. In medicine, for example, a large variety of different data types complicates the selection of graph construction methods even more. We therefore summarise the current state-of-the-art graph construction methods, especially for medical data. In this work, we introduce a categorisation scheme for graph types and graph construction methods. We identify two main strands of graph construction: static and adaptive methods, discuss their advantages and disadvantages, and formulate recommendations for choosing a suitable graph construction method. We furthermore discuss how a created graph structure can be assessed and to what degree it supports graph learning. With this work, we hope to support medical research using graph deep learning by elucidating the wide variety of graph construction methods.
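A canonical example of the static strand the survey describes is k-nearest-neighbour graph construction, sketched here under Euclidean distance:

```python
import numpy as np

def knn_graph(X, k):
    """Static k-nearest-neighbour graph: connect each point to its k closest
    points under Euclidean distance. Returns a boolean adjacency matrix.
    One of many static construction methods; adaptive methods would learn
    the edges jointly with the task."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)        # exclude self-loops
    n = len(X)
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        adj[i, np.argsort(d[i])[:k]] = True
    return adj | adj.T                 # symmetrise for an undirected graph
```

Even in this simple case the design choices the survey catalogues are visible: the value of k, the distance metric, and whether to symmetrise all change the resulting graph.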

TMLR Journal 2024 Journal Article

Are Population Graphs Really as Powerful as Believed?

  • Tamara T. Müller
  • Sophie Starck
  • Kyriaki-Margarita Bintsi
  • Alexander Ziller
  • Rickmer Braren
  • Georgios Kaissis
  • Daniel Rueckert

Population graphs and their use in combination with graph neural networks (GNNs) have demonstrated promising results for multi-modal medical data integration and improving disease diagnosis and prognosis. Several different methods for constructing these graphs and advanced graph learning techniques have been established to maximise the predictive power of GNNs on population graphs. However, in this work, we raise the question of whether existing methods are really strong enough by showing that simple baseline methods, such as random forests or linear regression, perform on par with advanced graph learning models on several population graph datasets for a variety of different clinical applications. We use the commonly used public population graph datasets TADPOLE and ABIDE, a brain age estimation and a cardiac dataset from the UK Biobank, and a real-world in-house COVID dataset. We (a) investigate the impact of different graph construction methods, graph convolutions, and dataset size and complexity on GNN performance and (b) discuss the utility of GNNs for multi-modal data integration in the context of population graphs. Based on our results, we argue for the need for "better" graph construction methods or innovative applications for population graphs to render them beneficial.

ICML Conference 2024 Conference Paper

Beyond the Calibration Point: Mechanism Comparison in Differential Privacy

  • Georgios Kaissis
  • Stefan Kolek
  • Borja Balle
  • Jamie Hayes
  • Daniel Rueckert

In differentially private (DP) machine learning, the privacy guarantees of DP mechanisms are often reported and compared on the basis of a single $(\varepsilon, \delta)$-pair. This practice overlooks that DP guarantees can vary substantially even between mechanisms sharing a given $(\varepsilon, \delta)$, and potentially introduces privacy vulnerabilities which can remain undetected. This motivates the need for robust, rigorous methods for comparing DP guarantees in such cases. Here, we introduce the $\Delta$-divergence between mechanisms which quantifies the worst-case excess privacy vulnerability of choosing one mechanism over another in terms of $(\varepsilon, \delta)$, $f$-DP and in terms of a newly presented Bayesian interpretation. Moreover, as a generalisation of the Blackwell theorem, it is endowed with strong decision-theoretic foundations. Through application examples, we show that our techniques can facilitate informed decision-making and reveal gaps in the current understanding of privacy risks, as current practices in DP-SGD often result in choosing mechanisms with high excess privacy vulnerabilities.

TMLR Journal 2024 Journal Article

Kernel Normalized Convolutional Networks

  • Reza Nasirigerdeh
  • Reihaneh Torkzadehmahani
  • Daniel Rueckert
  • Georgios Kaissis

Existing convolutional neural network architectures frequently rely upon batch normalization (BatchNorm) to effectively train the model. BatchNorm, however, performs poorly with small batch sizes, and is inapplicable to differential privacy. To address these limitations, we propose the kernel normalization (KernelNorm) and kernel normalized convolutional layers, and incorporate them into kernel normalized convolutional networks (KNConvNets) as the main building blocks. We implement KNConvNets corresponding to the state-of-the-art ResNets while forgoing the BatchNorm layers. Through extensive experiments, we illustrate that KNConvNets achieve higher or competitive performance compared to the BatchNorm counterparts in image classification and semantic segmentation. They also significantly outperform their batch-independent competitors, including those based on layer and group normalization, in non-private and differentially private training. KernelNorm thus combines the batch-independence property of layer and group normalization with the performance advantage of BatchNorm.
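A minimal 1-D sketch of the core idea, assuming KernelNorm standardizes each kernel-sized patch of the input before applying the kernel weights (the paper's exact formulation may differ):

```python
import numpy as np

def kernel_normalized_conv1d(x, kernel, eps=1e-5):
    """1-D sketch of a kernel-normalized convolution: each kernel-sized
    patch is standardized (zero mean, unit variance) before being
    multiplied by the kernel weights. The statistics come from the patch
    itself, so the layer is batch-independent -- unlike BatchNorm."""
    k = len(kernel)
    out = np.empty(len(x) - k + 1)
    for i in range(len(out)):
        patch = x[i:i + k]
        patch = (patch - patch.mean()) / np.sqrt(patch.var() + eps)
        out[i] = patch @ kernel
    return out
```

Because no statistic is shared across examples, such a layer composes with per-sample gradient clipping, which is what makes it usable under differentially private training where BatchNorm is not.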

TMLR Journal 2023 Journal Article

Label Noise-Robust Learning using a Confidence-Based Sieving Strategy

  • Reihaneh Torkzadehmahani
  • Reza Nasirigerdeh
  • Daniel Rueckert
  • Georgios Kaissis

In learning tasks with label noise, improving model robustness against overfitting is a pivotal challenge because the model eventually memorizes labels, including the noisy ones. Identifying the samples with noisy labels and preventing the model from learning them is a promising approach to address this challenge. When training with noisy labels, the per-class confidence scores of the model, represented by the class probabilities, can be reliable criteria for assessing whether the input label is the true label or the corrupted one. In this work, we exploit this observation and propose a novel discriminator metric called confidence error and a sieving strategy called CONFES to differentiate between the clean and noisy samples effectively. We provide theoretical guarantees on the probability of error for our proposed metric. Then, we experimentally illustrate the superior performance of our proposed approach compared to recent studies on various settings, such as synthetic and real-world label noise. Moreover, we show that CONFES can be combined with other state-of-the-art approaches, such as Co-teaching and DivideMix, to further improve model performance.
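One plausible reading of the confidence error metric is the gap between the model's top predicted probability and the probability it assigns to the annotated label; a sketch of this reading and the sieving step (details may differ from the paper):

```python
import numpy as np

def confidence_error(probs, labels):
    """Gap between the top predicted class probability and the probability
    assigned to the annotated label. Near zero suggests the label is clean;
    large values suggest it may be noisy. Illustrative reading of the metric."""
    return probs.max(axis=1) - probs[np.arange(len(labels)), labels]

def sieve_clean(probs, labels, threshold=0.1):
    """Keep the indices of samples whose confidence error is below a threshold."""
    return np.where(confidence_error(probs, labels) < threshold)[0]
```

Training would then proceed (or co-train, as in Co-teaching-style combinations) on the sieved subset rather than the full noisy dataset.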

NeurIPS Conference 2023 Conference Paper

Optimal privacy guarantees for a relaxed threat model: Addressing sub-optimal adversaries in differentially private machine learning

  • Georgios Kaissis
  • Alexander Ziller
  • Stefan Kolek
  • Anneliese Riess
  • Daniel Rueckert

Differentially private mechanisms restrict the membership inference capabilities of powerful (optimal) adversaries against machine learning models. Such adversaries are rarely encountered in practice. In this work, we examine a more realistic threat model relaxation, where (sub-optimal) adversaries lack access to the exact model training database, but may possess related or partial data. We then formally characterise and experimentally validate adversarial membership inference capabilities in this setting in terms of hypothesis testing errors. Our work helps users to interpret the privacy properties of sensitive data processing systems under realistic threat model relaxations and choose appropriate noise levels for their use-case.

ICML Conference 2018 Conference Paper

Semi-Supervised Learning via Compact Latent Space Clustering

  • Konstantinos Kamnitsas
  • Daniel Coelho de Castro
  • Loïc Le Folgoc
  • Ian Walker
  • Ryutaro Tanno
  • Daniel Rueckert
  • Ben Glocker
  • Antonio Criminisi

We present a novel cost function for semi-supervised learning of neural networks that encourages compact clustering of the latent space to facilitate separation. The key idea is to dynamically create a graph over embeddings of labeled and unlabeled samples of a training batch to capture underlying structure in feature space, and use label propagation to estimate its high and low density regions. We then devise a cost function based on Markov chains on the graph that regularizes the latent space to form a single compact cluster per class, while avoiding disturbing existing clusters during optimization. We evaluate our approach on three benchmarks and compare to the state of the art with promising results. Our approach combines the benefits of graph-based regularization with efficient, inductive inference, does not require modifications to a network architecture, and can thus be easily applied to existing networks to enable an effective use of unlabeled data.
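The label propagation step over a batch graph can be sketched as a simple fixed-point iteration; this is a generic propagation scheme for illustration, not the paper's exact Markov-chain formulation:

```python
import numpy as np

def propagate_labels(adj, labels, n_classes, iters=50):
    """Label propagation over a batch graph: labeled nodes keep their
    one-hot labels fixed, while unlabeled nodes (label == -1) repeatedly
    average their neighbours' class distributions."""
    n = len(labels)
    y = np.zeros((n, n_classes))
    labeled = labels >= 0
    y[labeled, labels[labeled]] = 1.0
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    for _ in range(iters):
        y = adj @ y / deg                  # average over graph neighbours
        y[labeled] = 0.0                   # clamp labeled nodes back
        y[labeled, labels[labeled]] = 1.0
    return y
```

The resulting soft assignments estimate high- and low-density regions of the batch embedding, which the cost function then uses to pull each class into a single compact cluster.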

JBHI Journal 2018 Journal Article

Statistical Shape Modeling of the Left Ventricle: Myocardial Infarct Classification Challenge

  • Avan Suinesiaputra
  • Jan Dhooge
  • Nicolas Duchateau
  • Jan Ehrhardt
  • Alejandro F. Frangi
  • Ali Gooya
  • Vicente Grau
  • Karim Lekadir

Statistical shape modeling is a powerful tool for visualizing and quantifying geometric and functional patterns of the heart. After myocardial infarction (MI), the left ventricle typically remodels in response to physiological challenges. Several methods have been proposed in the literature to describe statistical shape changes. Which method best characterizes the left ventricular remodeling after MI is an open research question. A better descriptor of remodeling is expected to provide a more accurate evaluation of disease status in MI patients. We therefore designed a challenge to test shape characterization in MI given a set of three-dimensional left ventricular surface points. The training set comprised 100 MI patients and 100 asymptomatic volunteers (AV). The challenge was initiated in 2015 at the Statistical Atlases and Computational Models of the Heart workshop, in conjunction with the MICCAI conference. The training set with labels was provided to participants, who were asked to submit the likelihood of MI from a different (validation) set of 200 cases (100 AV and 100 MI). Sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve were used as the outcome measures. The goals of this challenge were to 1) establish a common dataset for evaluating statistical shape modeling algorithms in MI, and 2) test whether statistical shape modeling provides additional information characterizing MI patients over standard clinical measures. Eleven groups with a wide variety of classification and feature extraction approaches participated in this challenge. All methods achieved excellent classification results with accuracy ranging from 0.83 to 0.98. The areas under the receiver operating characteristic curves were all above 0.90. Four methods showed significantly higher performance than standard clinical measures. The dataset and software for evaluation are available from the Cardiac Atlas Project website (http://www.cardiacatlas.org).