Arrow Research search

Author name cluster

David Brellmann

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers
2 author rows

Possible papers


NeurIPS Conference 2025 Conference Paper

Double Descent Meets Out-of-Distribution Detection: Theoretical Insights and Empirical Analysis on the Role of Model Complexity

  • Mouïn Ben Ammar
  • David Brellmann
  • Arturo Mendoza
  • Antoine Manzanera
  • Gianni Franchi

Out-of-distribution (OOD) detection is essential for ensuring the reliability and safety of machine learning systems. In recent years, it has received increasing attention, particularly through post-hoc detection and training-based methods. In this paper, we focus on post-hoc OOD detection, which enables identifying OOD samples without altering the model's training procedure or objective. Our primary goal is to investigate the relationship between model capacity and OOD detection performance. Specifically, we aim to answer the following question: Does the Double Descent phenomenon manifest in post-hoc OOD detection? This question is crucial, as it can reveal whether overparameterization, which is already known to benefit generalization, can also enhance OOD detection. Despite the growing interest in these topics from the classic supervised machine learning community, this intersection remains unexplored for OOD detection. We empirically demonstrate that the Double Descent effect does indeed appear in post-hoc OOD detection. Furthermore, we provide theoretical insights to explain why this phenomenon emerges in such a setting. Finally, we show that the overparameterized regime does not consistently yield superior results, and we propose a method to identify the optimal regime for OOD detection based on our observations.
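To illustrate what "post-hoc" means in the abstract above, the following minimal sketch scores inputs from the logits of an already trained classifier, without touching training. It uses the maximum softmax probability, one widely used post-hoc score; the abstract does not say which scores the paper evaluates, so this choice, the function names, and the threshold of 0.5 are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def msp_score(logits):
    """Maximum softmax probability: higher means more in-distribution."""
    return max(softmax(logits))


def is_ood(logits, threshold=0.5):
    """Flag a sample as OOD when its confidence falls below the threshold.

    The threshold is a hypothetical value; in practice it would be tuned
    on held-out data.
    """
    return msp_score(logits) < threshold


# Hypothetical logits from a 3-class classifier.
confident = [6.0, 0.5, 0.2]   # one class clearly dominates
uncertain = [1.0, 0.9, 1.1]   # nearly uniform prediction

print(is_ood(confident), is_ood(uncertain))
```

Because the score only reads the model's outputs, the same function can be applied to models of different widths, which is the kind of comparison across model capacity that the abstract describes.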

ICLR Conference 2024 Conference Paper

On Double Descent in Reinforcement Learning with LSTD and Random Features

  • David Brellmann
  • Eloïse Berthier
  • David Filliat
  • Goran Frehse

Temporal Difference (TD) algorithms are widely used in Deep Reinforcement Learning (RL). Their performance is heavily influenced by the size of the neural network. While in supervised learning, the regime of over-parameterization and its benefits are well understood, the situation in RL is much less clear. In this paper, we present a theoretical analysis of the influence of network size and $l_2$-regularization on performance. We identify the ratio between the number of parameters and the number of visited states as a crucial factor and define over-parameterization as the regime in which this ratio is larger than one. Furthermore, we observe a double descent phenomenon, i.e., a sudden drop in performance around a parameter/state ratio of one. Leveraging random features and the lazy training regime, we study the regularized Least-Squares Temporal Difference (LSTD) algorithm in an asymptotic regime, as both the number of parameters and the number of states go to infinity while maintaining a constant ratio. We derive deterministic limits of both the empirical and the true Mean-Squared Bellman Error (MSBE) that feature correction terms responsible for the double descent. These correction terms vanish when the $l_2$-regularization is increased or the number of unvisited states goes to zero. Numerical experiments with synthetic and small real-world environments closely match the theoretical predictions.
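The quantities named in this abstract can be sketched in a few lines of NumPy. The example below fits one common form of $l_2$-regularized LSTD on random ReLU features and reports the parameter/state ratio and the empirical MSBE. The normalization of the linear system, the ReLU feature map, the synthetic transitions, and all names and constants are assumptions for illustration, not the paper's exact setup.

```python
import numpy as np


def relu_random_features(states, W):
    """Map states of shape (m, d) to random ReLU features of shape (m, N)."""
    return np.maximum(states @ W, 0.0)


def regularized_lstd(phi, phi_next, rewards, gamma, lam):
    """Solve (Phi^T (Phi - gamma * Phi') / m + lam * I) theta = Phi^T r / m."""
    m, n = phi.shape
    A = phi.T @ (phi - gamma * phi_next) / m + lam * np.eye(n)
    b = phi.T @ rewards / m
    return np.linalg.solve(A, b)


def empirical_msbe(phi, phi_next, rewards, gamma, theta):
    """Mean squared Bellman residual on the visited transitions."""
    residual = rewards + gamma * (phi_next @ theta) - phi @ theta
    return float(np.mean(residual ** 2))


# Synthetic transitions (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
m, d, N = 50, 4, 100          # visited states, state dimension, parameters
states = rng.standard_normal((m, d))
next_states = rng.standard_normal((m, d))
rewards = rng.standard_normal(m)

# Fixed random projection shared by current and next states.
W = rng.standard_normal((d, N)) / np.sqrt(d)
phi = relu_random_features(states, W)
phi_next = relu_random_features(next_states, W)

theta = regularized_lstd(phi, phi_next, rewards, gamma=0.9, lam=1e-2)
ratio = N / m                 # larger than one: over-parameterized regime
msbe = empirical_msbe(phi, phi_next, rewards, 0.9, theta)
print(f"ratio N/m = {ratio:.1f}, empirical MSBE = {msbe:.4f}")
```

Sweeping `N` for a fixed `m` and plotting the error against `N / m` is one way to look for the behavior around a ratio of one that the abstract describes; the true MSBE would additionally require evaluating on states that were not visited.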

TMLR Journal 2023 Journal Article

Fourier Features in Reinforcement Learning with Neural Networks

  • David Brellmann
  • David Filliat
  • Goran Frehse

In classic Reinforcement Learning (RL), the performance of algorithms depends critically on data representation, i.e., the way the states of the system are represented as features. Choosing appropriate features for a task is an important way of adding prior domain knowledge since cleverly distributing information into states facilitates appropriate generalization. For linear function approximation, the representation is usually hand-designed according to the task at hand and projected into a higher-dimensional space to facilitate linear separation. Feature encodings used in RL for linear function approximation include, among others, Polynomial Features and Tile Coding. However, the main bottleneck of such feature encodings is that they do not scale to high-dimensional inputs as they grow exponentially in size with the input dimension.
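The exponential growth mentioned above can be made concrete with a small sketch. It builds a full polynomial basis in which every state variable may appear with any power from 0 to `order`, so the number of features is `(order + 1) ** dim`. This particular construction and the function names are assumptions chosen for illustration; the abstract does not specify which polynomial encoding it refers to.

```python
from itertools import product


def polynomial_exponents(dim, order):
    """All exponent tuples with each entry in 0..order (one per feature)."""
    return list(product(range(order + 1), repeat=dim))


def polynomial_features(state, order):
    """Evaluate every monomial prod_i state[i] ** exps[i] for one state."""
    feats = []
    for exps in polynomial_exponents(len(state), order):
        term = 1.0
        for x, e in zip(state, exps):
            term *= x ** e
        feats.append(term)
    return feats


# Feature count for order 2 as the input dimension grows: (2 + 1) ** dim.
for dim in (1, 2, 4, 8):
    print(dim, len(polynomial_exponents(dim, order=2)))
```

Even at order 2, going from 4 to 8 input dimensions raises the feature count from 81 to 6561, which is the scaling problem the abstract points to.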