Arrow Research search

Author name cluster

Thomas B. Schön

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

19 papers
2 author rows

Possible papers (19)

EWRL Workshop 2025 Workshop Paper

Reinforcement learning with non-ergodic reward increments: robustness via ergodicity transformations

  • Dominik Baumann
  • Erfaun Noorani
  • James Price
  • Ole Peters
  • Colm Connaughton
  • Thomas B. Schön

Envisioned application areas for reinforcement learning (RL) include autonomous driving, precision agriculture, and finance, which all require RL agents to make decisions in the real world. A significant challenge hindering the adoption of RL methods in these domains is the non-robustness of conventional algorithms. In particular, the focus of RL is typically on the expected value of the return. The expected value is the average over the statistical ensemble of infinitely many trajectories, which can be uninformative about the performance of the average individual. For instance, when we have a heavy-tailed return distribution, the ensemble average can be dominated by rare extreme events. Consequently, optimizing the expected value can lead to policies that yield exceptionally high returns with a probability that approaches zero but almost surely result in catastrophic outcomes in single long trajectories. In this paper, we develop an algorithm that lets RL agents optimize the long-term performance of individual trajectories. The algorithm enables the agents to learn robust policies, which we show in an instructive example with a heavy-tailed return distribution and standard RL benchmarks. The key element of the algorithm is a transformation that we learn from data. This transformation turns the time series of collected returns into one whose increments have an expected value that coincides with their average over a long trajectory. Optimizing these increments results in robust policies.

TMLR Journal 2025 Journal Article

Reinforcement learning with non-ergodic reward increments: robustness via ergodicity transformations

  • Dominik Baumann
  • Erfaun Noorani
  • James Price
  • Ole Peters
  • Colm Connaughton
  • Thomas B. Schön

Envisioned application areas for reinforcement learning (RL) include autonomous driving, precision agriculture, and finance, which all require RL agents to make decisions in the real world. A significant challenge hindering the adoption of RL methods in these domains is the non-robustness of conventional algorithms. In particular, the focus of RL is typically on the expected value of the return. The expected value is the average over the statistical ensemble of infinitely many trajectories, which can be uninformative about the performance of the average individual. For instance, when we have a heavy-tailed return distribution, the ensemble average can be dominated by rare extreme events. Consequently, optimizing the expected value can lead to policies that yield exceptionally high returns with a probability that approaches zero but almost surely result in catastrophic outcomes in single long trajectories. In this paper, we develop an algorithm that lets RL agents optimize the long-term performance of individual trajectories. The algorithm enables the agents to learn robust policies, which we show in an instructive example with a heavy-tailed return distribution and standard RL benchmarks. The key element of the algorithm is a transformation that we learn from data. This transformation turns the time series of collected returns into one whose increments have an expected value that coincides with their average over a long trajectory. Optimizing these increments results in robust policies.
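The ensemble-versus-time-average gap described above can be seen in a small numerical sketch. This is a hypothetical multiplicative toy process, not the paper's algorithm; for this particular dynamic the log is the appropriate ergodicity transformation:

```python
import numpy as np

# Toy multiplicative process: each step multiplies the return by 1.5 or 0.6
# with equal probability. The ensemble-average growth factor exceeds 1, yet
# the time-average log increment, which governs any individual long
# trajectory, is negative. The log plays the role of the ergodicity
# transformation here.
rng = np.random.default_rng(0)
up, down = 1.5, 0.6
factors = rng.choice([up, down], size=(10_000, 500))   # 10k trajectories, 500 steps

ensemble_mean_factor = float(factors.mean())            # ~0.5*(1.5+0.6) = 1.05 > 1
time_avg_log_increment = float(np.log(factors).mean())  # ~0.5*(ln 1.5 + ln 0.6) < 0
```

Optimizing the raw factors would favour the ensemble average; optimizing the log increments targets what a single long trajectory experiences.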

EWRL Workshop 2025 Workshop Paper

Safe exploration in reproducing kernel Hilbert spaces

  • Abdullah Tokmak
  • Kiran G. Krishnan
  • Thomas B. Schön
  • Dominik Baumann

Popular safe Bayesian optimization (BO) algorithms learn control policies for safety-critical systems in unknown environments. However, most algorithms make a smoothness assumption, which is encoded by a known bounded norm in a reproducing kernel Hilbert space (RKHS). The RKHS is a potentially infinite-dimensional space, and it remains unclear how to reliably obtain the RKHS norm of an unknown function. In this work, we propose a safe BO algorithm capable of estimating the RKHS norm from data. We provide statistical guarantees on the RKHS norm estimation, integrate the estimated RKHS norm into existing confidence intervals and show that we retain theoretical guarantees, and prove safety of the resulting safe BO algorithm. We apply our algorithm to safely optimize reinforcement learning policies on physics simulators and on a real inverted pendulum, demonstrating improved performance, safety, and scalability compared to the state-of-the-art.
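For intuition on what the RKHS norm means computationally: a kernel interpolant f = Σᵢ αᵢ k(·, xᵢ) has norm √(αᵀKα). The sketch below is an illustrative construction, not the paper's estimator; the RBF kernel, length-scale, and target function are made up:

```python
import numpy as np

# Compute the RKHS norm of a kernel interpolant fitted to samples of a
# function. For f = sum_i alpha_i k(., x_i), ||f||_RKHS = sqrt(alpha' K alpha).
def rbf_kernel(a, b, ls=0.1):
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

x = np.linspace(0.0, 1.0, 20)
y = np.sin(2.0 * np.pi * x)                  # samples of an "unknown" function

K = rbf_kernel(x, x)
alpha = np.linalg.solve(K + 1e-8 * np.eye(len(x)), y)   # jitter for stability
rkhs_norm = float(np.sqrt(alpha @ K @ alpha))
preds = K @ alpha                            # interpolant evaluated at x
```

The difficulty the paper addresses is that this quantity is only available once the function is known; estimating a valid norm bound from finite, noisy data is what requires statistical guarantees.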

ICLR Conference 2024 Conference Paper

Controlling Vision-Language Models for Multi-Task Image Restoration

  • Ziwei Luo 0002
  • Fredrik K. Gustafsson
  • Zheng Zhao 0004
  • Jens Sjölund
  • Thomas B. Schön

Vision-language models such as CLIP have shown great impact on diverse downstream tasks for zero-shot or label-free predictions. However, when it comes to low-level vision such as image restoration, their performance deteriorates dramatically due to corrupted inputs. In this paper, we present a degradation-aware vision-language model (DA-CLIP) to better transfer pretrained vision-language models to low-level vision tasks as a multi-task framework for image restoration. More specifically, DA-CLIP trains an additional controller that adapts the fixed CLIP image encoder to predict high-quality feature embeddings. By integrating the embedding into an image restoration network via cross-attention, we are able to pilot the model to learn a high-fidelity image reconstruction. The controller itself will also output a degradation feature that matches the real corruptions of the input, yielding a natural classifier for different degradation types. In addition, we construct a mixed degradation dataset with synthetic captions for DA-CLIP training. Our approach advances state-of-the-art performance on both degradation-specific and unified image restoration tasks, showing a promising direction of prompting image restoration with large-scale pretrained vision-language models. Our code is available at https://github.com/Algolzw/daclip-uir.
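The cross-attention injection step can be sketched in a few lines. Shapes and weights here are illustrative stand-ins, not DA-CLIP's actual architecture: image-feature queries attend to degradation-embedding keys and values.

```python
import numpy as np

# Minimal cross-attention: 64 spatial image tokens (queries) attend to
# 4 embedding tokens (keys/values). All dimensions and weights are made up.
def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
d = 16
img_feats = rng.standard_normal((64, d))   # image-feature tokens
embed = rng.standard_normal((4, d))        # degradation-embedding tokens

Wq, Wk, Wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
q, k, v = img_feats @ Wq, embed @ Wk, embed @ Wv
attn = softmax(q @ k.T / np.sqrt(d))       # (64, 4) attention weights
out = attn @ v                             # embedding-conditioned features
```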

NeurIPS Conference 2024 Conference Paper

Entropy-regularized Diffusion Policy with Q-Ensembles for Offline Reinforcement Learning

  • Ruoqi Zhang
  • Ziwei Luo
  • Jens Sjölund
  • Thomas B. Schön
  • Per Mattsson

Diffusion policy has shown a strong ability to express complex action distributions in offline reinforcement learning (RL). However, it suffers from overestimating Q-value functions on out-of-distribution (OOD) data points due to the offline dataset limitation. To address it, this paper proposes a novel entropy-regularized diffusion policy and takes into account the confidence of the Q-value prediction with Q-ensembles. At the core of our diffusion policy is a mean-reverting stochastic differential equation (SDE) that transfers the action distribution into a standard Gaussian form and then samples actions conditioned on the environment state with a corresponding reverse-time process. We show that the entropy of such a policy is tractable and that it can be used to increase the exploration of OOD samples in offline RL training. Moreover, we propose using the lower confidence bound of Q-ensembles for pessimistic Q-value function estimation. The proposed approach demonstrates state-of-the-art performance across a range of tasks in the D4RL benchmarks, significantly improving upon existing diffusion-based policies. The code is available at https://github.com/ruoqizzz/entropy-offlineRL.
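The pessimistic-estimation idea is easy to make concrete. Below is a generic lower-confidence-bound (LCB) sketch over made-up critic outputs, not the trained ensemble from the paper:

```python
import numpy as np

# Lower confidence bound of a Q-ensemble: mean minus a multiple of the
# ensemble standard deviation. Disagreement among critics is treated as
# uncertainty and penalised.
def q_ensemble_lcb(q_values, beta=2.0):
    """LCB = mean - beta * std across the ensemble axis (axis 0)."""
    q = np.asarray(q_values, dtype=float)
    return q.mean(axis=0) - beta * q.std(axis=0)

# Five hypothetical critics scoring three candidate actions; the critics
# disagree most on the last action, so pessimism penalises it hardest.
q = np.array([[1.0, 2.0, 3.0],
              [1.1, 2.1, 1.0],
              [0.9, 1.9, 5.0],
              [1.0, 2.0, 0.5],
              [1.0, 2.0, 4.5]])
lcb = q_ensemble_lcb(q)
```

Under the plain ensemble mean the third action looks best; under the LCB the second, well-agreed-upon action wins, which is exactly the behaviour wanted on OOD actions.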

ICML Conference 2024 Conference Paper

No Double Descent in Principal Component Regression: A High-Dimensional Analysis

  • Daniel Gedon
  • Antônio H. Ribeiro
  • Thomas B. Schön

Understanding the generalization properties of large-scale models necessitates incorporating realistic data assumptions into the analysis. Therefore, we consider Principal Component Regression (PCR)—combining principal component analysis and linear regression—on data from a low-dimensional manifold. We present an analysis of PCR when the data is sampled from a spiked covariance model, obtaining fundamental asymptotic guarantees for the generalization risk of this model. Our analysis is based on random matrix theory and allows us to provide guarantees for high-dimensional data. We additionally present an analysis of the distribution shift between training and test data. The results allow us to disentangle the effects of (1) the number of parameters, (2) the data-generating model, and (3) model misspecification on the generalization risk. The use of PCR effectively regularizes the model and prevents the interpolation peak of the double descent. Our theoretical findings are empirically validated in simulation, demonstrating their practical relevance.
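A minimal PCR pipeline on spiked-covariance data might look as follows. This is a hypothetical simulation for illustration; dimensions, noise levels, and coefficients are made up:

```python
import numpy as np

# Principal component regression: project centred data onto the top-k
# principal components, then run ordinary least squares in that subspace.
rng = np.random.default_rng(0)
n, p, k = 500, 50, 3

spikes = rng.standard_normal((p, k))          # k strong covariance directions
z = rng.standard_normal((n, k))               # latent low-dimensional factors
X = z @ spikes.T + 0.1 * rng.standard_normal((n, p))
beta = np.array([1.0, -2.0, 0.5])
y = z @ beta + 0.05 * rng.standard_normal(n)

Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = Xc @ Vt[:k].T                        # scores on top-k components
w, *_ = np.linalg.lstsq(scores, y - y.mean(), rcond=None)
resid = y - y.mean() - scores @ w
yc = y - y.mean()
r2 = float(1.0 - resid @ resid / (yc @ yc))
```

Truncating to k components is the implicit regularization the abstract refers to: directions outside the spiked subspace never enter the regression.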

UAI Conference 2024 Conference Paper

Uncertainty Estimation with Recursive Feature Machines

  • Daniel Gedon
  • Amirhesam Abedsoltan
  • Thomas B. Schön
  • Mikhail Belkin

In conventional regression analysis, predictions are typically represented as point estimates derived from covariates. Gaussian processes (GPs) offer a kernel-based framework that produces predictions and quantifies their associated uncertainties. However, kernel-based methods often underperform ensemble-based decision tree approaches in regression tasks involving tabular and categorical data. Recently, Recursive Feature Machines (RFMs) were proposed as a novel feature-learning kernel which strengthens the capabilities of kernel machines. In this study, we harness the power of these RFMs in a probabilistic GP-based approach to enhance uncertainty estimation through feature extraction within kernel methods. We employ this learned kernel for in-depth uncertainty analysis. On tabular datasets, our RFM-based method surpasses other leading uncertainty estimation techniques, including NGBoost and CatBoost-ensemble. Additionally, when assessing out-of-distribution performance, we found that boosting-based methods are surpassed by our RFM-based approach.
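The GP side of the construction can be sketched generically. A fixed RBF kernel stands in for the learned RFM kernel, and all values are illustrative: posterior variance is small near observed inputs and reverts to the prior far away.

```python
import numpy as np

# GP posterior variance at a test point x*:
#   var(x*) = k(x*, x*) - k(x*, X) (K + noise I)^{-1} k(X, x*)
def rbf(a, b, ls=0.3):
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

x_train = np.array([0.0, 0.5, 1.0])
K = rbf(x_train, x_train) + 1e-4 * np.eye(3)   # small observation noise

def posterior_var(x_star):
    k_star = rbf(np.atleast_1d(x_star), x_train)
    return 1.0 - np.einsum('ij,jk,ik->i', k_star, np.linalg.inv(K), k_star)

var_near = float(posterior_var(0.01)[0])   # next to a training point
var_far = float(posterior_var(3.0)[0])     # far from all training points
```

Swapping the fixed RBF kernel for a feature-learning kernel changes what "near" means, which is the lever the paper pulls.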

TMLR Journal 2023 Journal Article

How Reliable is Your Regression Model's Uncertainty Under Real-World Distribution Shifts?

  • Fredrik K. Gustafsson
  • Martin Danelljan
  • Thomas B. Schön

Many important computer vision applications are naturally formulated as regression problems. Within medical imaging, accurate regression models have the potential to automate various tasks, helping to lower costs and improve patient outcomes. Such safety-critical deployment does however require reliable estimation of model uncertainty, also under the wide variety of distribution shifts that might be encountered in practice. Motivated by this, we set out to investigate the reliability of regression uncertainty estimation methods under various real-world distribution shifts. To that end, we propose an extensive benchmark of 8 image-based regression datasets with different types of challenging distribution shifts. We then employ our benchmark to evaluate many of the most common uncertainty estimation methods, as well as two state-of-the-art uncertainty scores from the task of out-of-distribution detection. We find that while methods are well calibrated when there is no distribution shift, they all become highly overconfident on many of the benchmark datasets. This uncovers important limitations of current uncertainty estimation methods, and the proposed benchmark therefore serves as a challenge to the research community. We hope that our benchmark will spur more work on how to develop truly reliable regression uncertainty estimation methods.

ICML Conference 2023 Conference Paper

Image Restoration with Mean-Reverting Stochastic Differential Equations

  • Ziwei Luo 0002
  • Fredrik K. Gustafsson
  • Zheng Zhao 0004
  • Jens Sjölund
  • Thomas B. Schön

This paper presents a stochastic differential equation (SDE) approach for general-purpose image restoration. The key construction consists in a mean-reverting SDE that transforms a high-quality image into a degraded counterpart as a mean state with fixed Gaussian noise. Then, by simulating the corresponding reverse-time SDE, we are able to restore the origin of the low-quality image without relying on any task-specific prior knowledge. Crucially, the proposed mean-reverting SDE has a closed-form solution, allowing us to compute the ground truth time-dependent score and learn it with a neural network. Moreover, we propose a maximum likelihood objective to learn an optimal reverse trajectory that stabilizes the training and improves the restoration results. The experiments show that our proposed method achieves highly competitive performance in quantitative comparisons on image deraining, deblurring, and denoising, setting a new state-of-the-art on two deraining datasets. Finally, the general applicability of our approach is further demonstrated via qualitative results on image super-resolution, inpainting, and dehazing. Code is available at https://github.com/Algolzw/image-restoration-sde.
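A scalar mean-reverting (Ornstein-Uhlenbeck-type) SDE dx = θ(μ − x)dt + σ dW has the closed-form conditional mean μ + (x₀ − μ)e^(−θt). The sketch below uses illustrative parameters, not the paper's image-space construction, and checks an Euler-Maruyama simulation against that closed form:

```python
import numpy as np

# Simulate dx = theta*(mu - x) dt + sigma dW with Euler-Maruyama over many
# paths and compare the empirical mean at time T to the closed form.
rng = np.random.default_rng(0)
theta, mu, sigma = 2.0, 0.0, 0.5
x0, T, n_steps, n_paths = 1.0, 1.0, 1000, 10_000
dt = T / n_steps

x = np.full(n_paths, x0)
for _ in range(n_steps):
    x = x + theta * (mu - x) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)

empirical_mean = float(x.mean())
closed_form_mean = mu + (x0 - mu) * np.exp(-theta * T)
```

Having the transition law in closed form is what makes the ground-truth score computable for training, as the abstract notes.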

TMLR Journal 2023 Journal Article

Online Learning for Prediction via Covariance Fitting: Computation, Performance and Robustness

  • Muhammad Osama
  • Dave Zachariah
  • Peter Stoica
  • Thomas B. Schön

We consider the problem of online prediction using linear smoothers that are functions of a nominal covariance model with unknown parameters. The model parameters are often learned using cross-validation or maximum-likelihood techniques. But when training data arrives in a streaming fashion, the implementation of such techniques can only be done in an approximate manner. Even if this limitation could be overcome, there appear to be no clear-cut results on the statistical properties of the resulting predictor. Here we consider a covariance-fitting method to learn the model parameters, which was initially developed for spectral estimation. We first show that the use of this approach results in a computationally efficient online learning method in which the resulting predictor can be updated sequentially. We then prove that, with high probability, its out-of-sample error approaches the optimal level at a root-$n$ rate, where $n$ is the number of data samples. This is so even if the nominal covariance model is misspecified. Moreover, we show that the resulting predictor enjoys two robustness properties. First, it corresponds to a predictor that minimizes the out-of-sample error with respect to the least favourable distribution within a given Wasserstein distance from the empirical distribution. Second, it is robust against errors in the covariate training data. We illustrate the performance of the proposed method in a numerical experiment.
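The "updated sequentially" property can be illustrated with a simpler streaming estimator: recursive least squares, used here as a stand-in for the paper's covariance-fitting predictor, on synthetic data. The streaming updates reproduce the batch solution exactly.

```python
import numpy as np

# Recursive least squares: O(d^2) per sample, no refit over past data.
def rls_update(P, w, x, y):
    """One RLS step for a new covariate/target pair (x, y)."""
    Px = P @ x
    gain = Px / (1.0 + x @ Px)
    w = w + gain * (y - x @ w)
    P = P - np.outer(gain, Px)
    return P, w

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.standard_normal((n, d))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ w_true + 0.01 * rng.standard_normal(n)

P = 1e6 * np.eye(d)      # large initial covariance ~ nearly flat prior
w = np.zeros(d)
for xi, yi in zip(X, y):
    P, w = rls_update(P, w, xi, yi)

w_batch, *_ = np.linalg.lstsq(X, y, rcond=None)
```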

TMLR Journal 2023 Journal Article

Variational Elliptical Processes

  • Maria Margareta Bånkestad
  • Jens Sjölund
  • Jalil Taghia
  • Thomas B. Schön

We present elliptical processes—a family of non-parametric probabilistic models that subsumes Gaussian processes and Student's t processes. This generalization includes a range of new heavy-tailed behaviors while retaining computational tractability. Elliptical processes are based on a representation of elliptical distributions as a continuous mixture of Gaussian distributions. We parameterize this mixture distribution as a spline normalizing flow, which we train using variational inference. The proposed form of the variational posterior enables a sparse variational elliptical process applicable to large-scale problems. We highlight advantages compared to Gaussian processes through regression and classification experiments. Elliptical processes can supersede Gaussian processes in several settings, including cases where the likelihood is non-Gaussian or when accurate tail modeling is essential.

TMLR Journal 2022 Journal Article

Incorporating Sum Constraints into Multitask Gaussian Processes

  • Philipp Pilar
  • Carl Jidling
  • Thomas B. Schön
  • Niklas Wahlström

Machine learning models can be improved by adapting them to respect existing background knowledge. In this paper we consider multitask Gaussian processes, with background knowledge in the form of constraints that require a specific sum of the outputs to be constant. This is achieved by conditioning the prior distribution on the constraint fulfillment. The approach allows for both linear and nonlinear constraints. We demonstrate that the constraints are fulfilled with high precision and that the construction can improve the overall prediction accuracy as compared to the standard Gaussian process.
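The conditioning mechanism can be sketched at a single input location. This is a hypothetical two-output example with made-up covariances: condition a joint Gaussian prior on the linear constraint Cf = b, here f1 + f2 = 1.

```python
import numpy as np

# Condition a Gaussian N(mu, Sigma) on the linear constraint C f = b using
# the standard Gaussian conditioning formulas; a tiny jitter keeps the
# constraint covariance invertible.
def condition_on_constraint(mu, Sigma, C, b, jitter=1e-10):
    S = C @ Sigma @ C.T + jitter * np.eye(C.shape[0])
    gain = Sigma @ C.T @ np.linalg.inv(S)
    mu_c = mu + gain @ (b - C @ mu)
    Sigma_c = Sigma - gain @ C @ Sigma
    return mu_c, Sigma_c

mu = np.zeros(2)
Sigma = np.array([[1.0, 0.3],
                  [0.3, 1.0]])
C = np.array([[1.0, 1.0]])   # constraint: f1 + f2
b = np.array([1.0])          # ... must equal 1

mu_c, Sigma_c = condition_on_constraint(mu, Sigma, C, b)
constraint_var = (C @ Sigma_c @ C.T).item()   # variance of f1 + f2 afterwards
```

After conditioning, the variance of the constrained sum collapses to (numerically) zero, so every draw from the conditioned prior fulfils the constraint.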

ICML Conference 2019 Conference Paper

Inferring Heterogeneous Causal Effects in Presence of Spatial Confounding

  • Muhammad Osama 0001
  • Dave Zachariah
  • Thomas B. Schön

We address the problem of inferring the causal effect of an exposure on an outcome across space, using observational data. The data is possibly subject to unmeasured confounding variables which, in a standard approach, must be adjusted for by estimating a nuisance function. Here we develop a method that eliminates the nuisance function, while mitigating the resulting errors-in-variables. The result is a robust and accurate inference method for spatially varying heterogeneous causal effects. The properties of the method are demonstrated on synthetic as well as real data from Germany and the US.

UAI Conference 2019 Conference Paper

Probabilistic Programming for Birth-Death Models of Evolution Using an Alive Particle Filter with Delayed Sampling

  • Jan Kudlicka
  • Lawrence M. Murray
  • Fredrik Ronquist
  • Thomas B. Schön

We consider probabilistic programming for birth-death models of evolution and introduce a new widely-applicable inference method that combines an extension of the alive particle filter (APF) with automatic Rao-Blackwellization via delayed sampling. Birth-death models of evolution are an important family of phylogenetic models of the diversification processes that lead to evolutionary trees. Probabilistic programming languages (PPLs) give phylogeneticists a new and exciting tool: their models can be implemented as probabilistic programs with just a basic knowledge of programming. The general inference methods in PPLs reduce the need for external experts, allow quick prototyping and testing, and accelerate the development and deployment of new models. We show how these birth-death models can be implemented as simple programs in existing PPLs, and demonstrate the usefulness of the proposed inference method for such models. For the popular BiSSE model the method yields an increase of the effective sample size and the conditional acceptance rate by a factor of 30 in comparison with a standard bootstrap particle filter. Although concentrating on phylogenetics, the extended APF is a general inference method that shows its strength in situations where particles are often assigned zero weight. In the case when the weights are always positive, the extra cost of using the APF rather than the bootstrap particle filter is negligible, making our method a suitable drop-in replacement for the bootstrap particle filter in probabilistic programming inference.

ICML Conference 2018 Conference Paper

Learning Localized Spatio-Temporal Models From Streaming Data

  • Muhammad Osama 0001
  • Dave Zachariah
  • Thomas B. Schön

We address the problem of predicting spatio-temporal processes with temporal patterns that vary across spatial regions, when data is obtained as a stream. That is, when the training dataset is augmented sequentially. Specifically, we develop a localized spatio-temporal covariance model of the process that can capture spatially varying temporal periodicities in the data. We then apply a covariance-fitting methodology to learn the model parameters which yields a predictor that can be updated sequentially with each new data point. The proposed method is evaluated using both synthetic and real climate data which demonstrate its ability to accurately predict data missing in spatial regions over time.

ICML Conference 2015 Conference Paper

Nested Sequential Monte Carlo Methods

  • Christian A. Naesseth
  • Fredrik Lindsten
  • Thomas B. Schön

We propose nested sequential Monte Carlo (NSMC), a methodology to sample from sequences of probability distributions, even where the random variables are high-dimensional. NSMC generalises the SMC framework by requiring only approximate, properly weighted, samples from the SMC proposal distribution, while still resulting in a correct SMC algorithm. Furthermore, NSMC can in itself be used to produce such properly weighted samples. Consequently, one NSMC sampler can be used to construct an efficient high-dimensional proposal distribution for another NSMC sampler, and this nesting of the algorithm can be done to an arbitrary degree. This allows us to consider complex and high-dimensional models using SMC. We show results that motivate the efficacy of our approach on several filtering problems with dimensions in the order of 100 to 1000.
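For readers new to SMC, the bootstrap particle filter that NSMC builds on fits in a few lines; the nesting idea replaces exact proposal sampling with properly weighted approximate samples from an inner SMC. The scalar state-space model below is purely illustrative:

```python
import numpy as np

# Model: x_t = a x_{t-1} + N(0, q),  y_t = x_t + N(0, r).
rng = np.random.default_rng(0)
a, q, r = 0.9, 0.1, 0.1
T, N = 50, 2000

# Simulate one trajectory and its observations.
xs, ys, x = [], [], 0.0
for _ in range(T):
    x = a * x + np.sqrt(q) * rng.standard_normal()
    xs.append(x)
    ys.append(x + np.sqrt(r) * rng.standard_normal())

# Bootstrap particle filter: propagate, reweight by the likelihood, resample.
particles = np.zeros(N)
estimates = []
for y in ys:
    particles = a * particles + np.sqrt(q) * rng.standard_normal(N)
    logw = -0.5 * (y - particles) ** 2 / r
    w = np.exp(logw - logw.max())
    w /= w.sum()
    estimates.append(w @ particles)                     # filtered posterior mean
    particles = particles[rng.choice(N, size=N, p=w)]   # multinomial resampling

rmse = float(np.sqrt(np.mean((np.array(estimates) - np.array(xs)) ** 2)))
```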

JMLR Journal 2014 Journal Article

Particle Gibbs with Ancestor Sampling

  • Fredrik Lindsten
  • Michael I. Jordan
  • Thomas B. Schön

Particle Markov chain Monte Carlo (PMCMC) is a systematic way of combining the two main tools used for Monte Carlo statistical inference: sequential Monte Carlo (SMC) and Markov chain Monte Carlo (MCMC). We present a new PMCMC algorithm that we refer to as particle Gibbs with ancestor sampling (PGAS). PGAS provides the data analyst with an off-the-shelf class of Markov kernels that can be used to simulate, for instance, the typically high-dimensional and highly autocorrelated state trajectory in a state-space model. The ancestor sampling procedure enables fast mixing of the PGAS kernel even when using seemingly few particles in the underlying SMC sampler. This is important as it can significantly reduce the computational burden that is typically associated with using SMC. PGAS is conceptually similar to the existing PG with backward simulation (PGBS) procedure. Instead of using separate forward and backward sweeps as in PGBS, however, we achieve the same effect in a single forward sweep. This makes PGAS well suited for addressing inference problems not only in state-space models, but also in models with more complex dependencies, such as non-Markovian, Bayesian nonparametric, and general probabilistic graphical models.

ICRA Conference 2010 Conference Paper

Geo-referencing for UAV navigation using environmental classification

  • Fredrik Lindsten
  • Jonas Callmer
  • Henrik Ohlsson
  • David Törnqvist
  • Thomas B. Schön
  • Fredrik Gustafsson

A UAV navigation system relying on GPS is vulnerable to signal failure, making a drift-free backup system necessary. We introduce a vision based geo-referencing system that uses pre-existing maps to reduce the long term drift. The system classifies an image according to its environmental content and thereafter matches it to an environmentally classified map over the operational area. This map matching provides a measurement of the absolute location of the UAV, which can easily be incorporated into a sensor fusion framework. Experiments show that the geo-referencing system reduces the long term drift in UAV navigation, enhancing the ability of the UAV to navigate accurately over large areas without the use of GPS.

IROS Conference 2010 Conference Paper

Learning to close the loop from 3D point clouds

  • Karl Granström
  • Thomas B. Schön

This paper presents a new solution to the loop closing problem for 3D point clouds. Loop closing is the problem of detecting the return to a previously visited location, and constitutes an important part of the solution to the Simultaneous Localisation and Mapping (SLAM) problem. It is important to achieve a low level of false alarms, since closing a false loop can have disastrous effects in a SLAM algorithm. In this work, the point clouds are described using features, which efficiently reduces the dimension of the data by a factor of 300 or more. The machine learning algorithm AdaBoost is used to learn a classifier from the features. All features are invariant to rotation, resulting in a classifier that is invariant to rotation. The presented method does neither rely on the discretisation of 3D space, nor on the extraction of lines, corners or planes. The classifier is extensively evaluated on publicly available outdoor and indoor data, and is shown to be able to robustly and accurately determine whether a pair of point clouds is from the same location or not. Experiments show detection rates of 63% for outdoor and 53% for indoor data at a false alarm rate of 0%. Furthermore, the classifier is shown to generalise well when trained on outdoor data and tested on indoor data in a SLAM experiment.