Arrow Research search

Author name cluster

Peter Nickl

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers
2 author rows

Possible papers

ICML Conference 2024 Conference Paper

Variational Learning is Effective for Large Deep Networks

  • Yuesong Shen
  • Nico Daheim
  • Bai Cong
  • Peter Nickl
  • Gian Maria Marconi
  • Clement Bazan
  • Rio Yokota
  • Iryna Gurevych

We give extensive empirical evidence against the common belief that variational learning is ineffective for large neural networks. We show that an optimizer called Improved Variational Online Newton (IVON) consistently matches or outperforms Adam for training large networks such as GPT-2 and ResNets from scratch. IVON’s computational costs are nearly identical to Adam but its predictive uncertainty is better. We show several new use cases of IVON where we improve finetuning and model merging in Large Language Models, accurately predict generalization error, and faithfully estimate sensitivity to data. We find overwhelming evidence that variational learning is effective. Code is available at https://github.com/team-approx-bayes/ivon.

NeurIPS Conference 2023 Conference Paper

The Memory-Perturbation Equation: Understanding Model's Sensitivity to Data

  • Peter Nickl
  • Lu Xu
  • Dharmesh Tailor
  • Thomas Möllenhoff
  • Mohammad Emtiyaz Khan

Understanding a model’s sensitivity to its training data is crucial but can also be challenging and costly, especially during training. To simplify such issues, we present the Memory-Perturbation Equation (MPE), which relates a model’s sensitivity to perturbation in its training data. Derived using Bayesian principles, the MPE unifies existing sensitivity measures, generalizes them to a wide variety of models and algorithms, and unravels useful properties regarding sensitivities. Our empirical results show that sensitivity estimates obtained during training can be used to faithfully predict generalization on unseen test data. The proposed equation is expected to be useful for future research on robust and adaptive learning.

ICRA Conference 2021 Conference Paper

A Variational Infinite Mixture for Probabilistic Inverse Dynamics Learning

  • Hany Abdulsamad
  • Peter Nickl
  • Pascal Klink
  • Jan Peters

Probabilistic regression techniques in control and robotics applications have to fulfill different criteria of data-driven adaptability, computational efficiency, scalability to high dimensions, and the capacity to deal with different modalities in the data. Classical regressors usually fulfill only a subset of these properties. In this work, we extend seminal work on Bayesian nonparametric mixtures and derive an efficient variational Bayes inference technique for infinite mixtures of probabilistic local polynomial models with well-calibrated certainty quantification. We highlight the model’s power in combining data-driven complexity adaptation, fast prediction, and the ability to deal with discontinuous functions and heteroscedastic noise. We benchmark this technique on a range of large real-world inverse dynamics datasets, showing that the infinite mixture formulation is competitive with classical Local Learning methods and regularizes model complexity by adapting the number of components based on data and without relying on heuristics. Moreover, to showcase the practicality of the approach, we use the learned models for online inverse dynamics control of a Barrett-WAM manipulator, significantly improving the trajectory tracking performance.