Author name cluster

Peter Nickl

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers

2 author rows

ICML Conference 2024 Conference Paper

Variational Learning is Effective for Large Deep Networks

Yuesong Shen
Nico Daheim
Bai Cong
Peter Nickl
Gian Maria Marconi
Clement Bazan
Rio Yokota
Iryna Gurevych

We give extensive empirical evidence against the common belief that variational learning is ineffective for large neural networks. We show that an optimizer called Improved Variational Online Newton (IVON) consistently matches or outperforms Adam for training large networks such as GPT-2 and ResNets from scratch. IVON’s computational costs are nearly identical to Adam but its predictive uncertainty is better. We show several new use cases of IVON where we improve finetuning and model merging in Large Language Models, accurately predict generalization error, and faithfully estimate sensitivity to data. We find overwhelming evidence that variational learning is effective. Code is available at https: //github. com/team-approx-bayes/ivon.

Details

NeurIPS Conference 2023 Conference Paper

The Memory-Perturbation Equation: Understanding Model's Sensitivity to Data

Peter Nickl
Lu Xu
Dharmesh Tailor
Thomas Möllenhoff
Mohammad Emtiyaz Khan

Understanding model’s sensitivity to its training data is crucial but can also be challenging and costly, especially during training. To simplify such issues, we present the Memory-Perturbation Equation (MPE) which relates model's sensitivity to perturbation in its training data. Derived using Bayesian principles, the MPE unifies existing sensitivity measures, generalizes them to a wide-variety of models and algorithms, and unravels useful properties regarding sensitivities. Our empirical results show that sensitivity estimates obtained during training can be used to faithfully predict generalization on unseen test data. The proposed equation is expected to be useful for future research on robust and adaptive learning.

PDF Details

ICRA Conference 2021 Conference Paper

A Variational Infinite Mixture for Probabilistic Inverse Dynamics Learning

Hany Abdulsamad
Peter Nickl
Pascal Klink
Jan Peters 0001

Probabilistic regression techniques in control and robotics applications have to fulfill different criteria of data-driven adaptability, computational efficiency, scalability to high dimensions, and the capacity to deal with different modalities in the data. Classical regressors usually fulfill only a subset of these properties. In this work, we extend seminal work on Bayesian nonparametric mixtures and derive an efficient variational Bayes inference technique for infinite mixtures of probabilistic local polynomial models with well-calibrated certainty quantification. We highlight the model’s power in combining data-driven complexity adaptation, fast prediction, and the ability to deal with discontinuous functions and heteroscedastic noise. We benchmark this technique on a range of large real-world inverse dynamics datasets, showing that the infinite mixture formulation is competitive with classical Local Learning methods and regularizes model complexity by adapting the number of components based on data and without relying on heuristics. Moreover, to showcase the practicality of the approach, we use the learned models for online inverse dynamics control of a Barrett-WAM manipulator, significantly improving the trajectory tracking performance.

Details