Arrow Research search

Author name cluster

Lukas Schneider

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers
2 author rows

Possible papers


ICML Conference 2025 Conference Paper

Monte-Carlo Tree Search with Uncertainty Propagation via Optimal Transport

  • Tuan Dam
  • Pascal Stenger
  • Lukas Schneider
  • Joni Pajarinen
  • Carlo D'Eramo
  • Odalric-Ambrym Maillard

This paper introduces a novel backup strategy for Monte-Carlo Tree Search (MCTS) tailored for highly stochastic and partially observable Markov decision processes. We adopt a probabilistic approach, modeling both value and action-value nodes as Gaussian distributions, to introduce a backup operator that computes value nodes as the Wasserstein barycenter of their action-value child nodes, thus propagating the uncertainty of the estimate across the tree to the root node. We study this backup operator when using a novel combination of $L^1$-Wasserstein barycenter with $\alpha$-divergence, by drawing a crucial connection to the generalized mean backup operator. We complement our probabilistic backup operator with two sampling strategies, based on optimistic selection and Thompson sampling, obtaining our Wasserstein MCTS algorithm. We provide theoretical guarantees of asymptotic convergence of $\mathcal{O}(n^{-1/2})$, with $n$ as the number of visited trajectories, to the optimal policy and an empirical evaluation on several stochastic and partially observable environments, where our approach outperforms well-known related baselines.
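
As a rough illustration of the barycenter backup described in this abstract, the sketch below computes the barycenter of one-dimensional Gaussian child nodes. It is a simplification under stated assumptions, not the paper's operator: it uses the $L^2$-Wasserstein barycenter, for which the result is again Gaussian with the weighted average of the child means and of the child standard deviations, whereas the paper studies an $L^1$-Wasserstein barycenter combined with $\alpha$-divergence. The function name `gaussian_w2_barycenter` and the weighting scheme are hypothetical.

```python
from typing import Sequence, Tuple


def gaussian_w2_barycenter(
    means: Sequence[float],
    stds: Sequence[float],
    weights: Sequence[float],
) -> Tuple[float, float]:
    """Return (mean, std) of the L2-Wasserstein barycenter of 1D Gaussians.

    Assumption: each child node is N(mean_i, std_i^2). In one dimension the
    W2 barycenter averages the quantile functions, so both the mean and the
    standard deviation are weighted averages of the children's values.
    """
    if not (len(means) == len(stds) == len(weights)) or not means:
        raise ValueError("means, stds and weights must be non-empty and equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalize weights so they form a probability vector.
    norm = [w / total for w in weights]
    mean = sum(w * m for w, m in zip(norm, means))
    std = sum(w * s for w, s in zip(norm, stds))
    return mean, std
```

With two equally weighted children, N(0, 1²) and N(2, 3²), this returns a mean of 1.0 and a standard deviation of 2.0, so the parent node carries both the averaged value estimate and the averaged uncertainty.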

ICRA Conference 2024 Conference Paper

Learning Risk-Aware Quadrupedal Locomotion using Distributional Reinforcement Learning

  • Lukas Schneider
  • Jonas Frey
  • Takahiro Miki
  • Marco Hutter 0001

Deployment in hazardous environments requires robots to understand the risks associated with their actions and movements to prevent accidents. Despite their importance, these risks are not explicitly modeled by currently deployed locomotion controllers for legged robots. In this work, we propose a risk-sensitive locomotion training method employing distributional reinforcement learning to consider safety explicitly. Instead of relying on a value expectation, we estimate the complete value distribution to account for uncertainty in the robot’s interaction with the environment. The value distribution is consumed by a risk metric to extract risk-sensitive value estimates. These are integrated into Proximal Policy Optimization (PPO) to derive our method, Distributional Proximal Policy Optimization (DPPO). The risk preference, ranging from risk-averse to risk-seeking, can be controlled by a single parameter, which enables the robot’s behavior to be adjusted dynamically. Importantly, our approach removes the need for additional reward function tuning to achieve risk sensitivity. We show emergent risk-sensitive locomotion behavior in simulation and on the quadrupedal robot ANYmal. Videos of the experiments and code are available at https://sites.google.com/leggedrobotics.com/risk-aware-locomotion.
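
To make the idea of a risk metric "consuming" a value distribution concrete, here is a minimal sketch that assumes the distribution is represented by a list of quantile estimates and that the risk metric is Conditional Value at Risk (CVaR). The abstract does not name the metric, so CVaR is an illustrative choice, and the function name `cvar_from_quantiles` is hypothetical. Note that CVaR with `alpha` in (0, 1] only spans risk-averse to risk-neutral behavior; it does not cover the risk-seeking end mentioned in the abstract.

```python
import math
from typing import Sequence


def cvar_from_quantiles(quantiles: Sequence[float], alpha: float) -> float:
    """Average of the worst alpha-fraction of equally weighted quantile values.

    Assumption: `quantiles` are equally weighted estimates of the value
    distribution. alpha = 1.0 recovers the plain mean (risk-neutral);
    smaller alpha focuses on the lower tail (more risk-averse).
    """
    if not quantiles:
        raise ValueError("quantiles must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    ordered = sorted(quantiles)
    # Keep at least one sample so the tail average is always defined.
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k
```

For the quantile values 1, 2, 3, 4, `alpha=1.0` gives the mean 2.5, while `alpha=0.5` averages only the two lowest values and gives 1.5, a more pessimistic estimate that a policy update could use in place of the expected value.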

NeurIPS Conference 2022 Conference Paper

Structural Knowledge Distillation for Object Detection

  • Philip de Rijk
  • Lukas Schneider
  • Marius Cordts
  • Dariu Gavrila

Knowledge Distillation (KD) is a well-known training paradigm in deep neural networks where knowledge acquired by a large teacher model is transferred to a small student. KD has proven to be an effective technique to significantly improve the student's performance for various tasks including object detection. As such, KD techniques mostly rely on guidance at the intermediate feature level, which is typically implemented by minimizing an $\ell_{p}$-norm distance between teacher and student activations during training. In this paper, we propose a replacement for the pixel-wise independent $\ell_{p}$-norm based on the structural similarity (SSIM). By taking into account additional contrast and structural cues, more information within intermediate feature maps can be preserved. Extensive experiments on MSCOCO demonstrate the effectiveness of our method across different training schemes and architectures. Our method adds little computational overhead, is straightforward to implement, and at the same time significantly outperforms the standard $\ell_p$-norms. Moreover, more complex state-of-the-art KD methods using attention-based sampling mechanisms are outperformed, including a +3.5 AP gain using a Faster R-CNN R-50 compared to a vanilla model.
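
The sketch below shows how an SSIM-based term differs from a pixel-wise $\ell_p$ distance by combining means, variances, and covariance of the two feature maps. It is a simplified illustration, not the paper's implementation: it computes a single global SSIM over flattened feature maps rather than over local windows, the stabilizing constants `c1` and `c2` are the conventional SSIM defaults for a unit dynamic range (an assumption for feature maps), and the names `global_ssim` and `ssim_loss` are hypothetical.

```python
from typing import Sequence


def global_ssim(
    x: Sequence[float],
    y: Sequence[float],
    c1: float = 0.01 ** 2,
    c2: float = 0.03 ** 2,
) -> float:
    """Single-window SSIM between two flattened feature maps of equal length."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Luminance term compares means; contrast/structure term compares
    # variances and covariance.
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x * mean_x + mean_y * mean_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_loss(student: Sequence[float], teacher: Sequence[float]) -> float:
    """Distillation-style loss: zero when the maps match, larger otherwise."""
    return 1.0 - global_ssim(student, teacher)
```

Identical student and teacher maps give a loss of zero, while a student map whose spatial pattern is inverted relative to the teacher is penalized through the covariance term even if its per-element magnitudes are similar.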

YNICL Journal 2020 Journal Article

Electrophysiological resting state networks of predominantly akinetic-rigid Parkinson patients: Effects of dopamine therapy

  • Lukas Schneider
  • Valentin Seeger
  • Lars Timmermann
  • Esther Florin

Parkinson's disease (PD) causes both motor and non-motor symptoms, which can partially be reversed by dopamine therapy. These symptoms as well as the effect of dopamine may be explained by distinct alterations in whole-brain architecture. We used functional connectivity (FC) and in particular resting state network (RSN) analysis to identify such whole-brain alterations in a frequency-specific manner. In addition, we hypothesized that standard dopaminergic medication would have a normalizing effect on these whole-brain alterations. We recorded resting-state EEGs of 19 PD patients in the medical OFF and ON states, and of 12 healthy age-matched controls. The PD patients were either of akinetic-rigid or mixed subtype. We extracted RSNs with independent component analysis in the source space for five frequency bands. Within the sensorimotor network (SMN), the supplementary motor area (SMA) showed decreased FC in the OFF state compared to healthy controls. This finding was reversed after dopamine administration. Furthermore, in the OFF state no stable SMN beta component could be identified. The default mode network showed alterations due to PD independent of the medication state. The visual network was altered in the OFF state, and reinstated to a pattern similar to healthy controls by medication. In conclusion, PD causes distinct RSN alterations, which are partly reversed after levodopa administration. The changes within the SMN are of particular interest, because they broaden the pathophysiological understanding of PD. Our results identify the SMA as a central network hub affected in PD and a crucial effector of dopamine therapy.