Author name cluster

Lars Schmidt-Thieme

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

13 papers

2 author rows

AAAI Conference 2025 Conference Paper

Motif-aware Graph Neural Networks for Networked Time Series Imputation

Nourhan Ahmed
Vijaya Krishna Yalavarthi
Lars Schmidt-Thieme

Networked time series are time series on a graph, one for each node, with applications in traffic and weather monitoring. Graph neural networks are natural candidates for networked time series imputation and have recently outperformed existing alternatives such as recurrent and generative models for time series imputation as they utilize a relational inductive bias for imputation. However, existing GNN-based approaches fail to capture the higher-order topological structure between sensors, which are shaped by recurring substructures in the graph, referred to as temporal motifs. In addition, it remains uncertain which motifs are the most pivotal motifs guiding the imputation task in networked time series. In this paper, we fill in this gap by proposing a graph neural network designed to leverage motif structures within the network by employing weighted motif adjacency matrices to capture higher-order neighborhood information. In particular, (1) we design a motif-wise multi-view attention module that explicitly captures various higher-order structures along with an attention mechanism that automatically assigns high weights to informative ones in order to maximize the use of higher-order information. (2) We introduce a gated fusion module by merging gated recurrent networks and graph convolutional networks to capture the spatial and temporal dependency in order to reflect the intricate impacts of temporal and spatial influence. Experimental results demonstrate that when compared to state-of-the-art models for time-series imputation tasks, our proposed model can reduce the error by around 19%.


ICLR Conference 2025 Conference Paper

Physiome-ODE: A Benchmark for Irregularly Sampled Multivariate Time-Series Forecasting Based on Biological ODEs

Christian Klötergens
Vijaya Krishna Yalavarthi
Randolf Scholz
Maximilian Stubbemann
Stefan Born
Lars Schmidt-Thieme

State-of-the-art methods for forecasting irregularly sampled time series with missing values predominantly rely on just four datasets and a few small toy examples for evaluation. While ordinary differential equations (ODE) are the prevalent models in science and engineering, a baseline model that forecasts a constant value outperforms ODE-based models from the last five years on three of these existing datasets. This unintuitive finding hampers further research on ODE-based models, a more plausible model family. In this paper, we develop a methodology to generate irregularly sampled multivariate time series (IMTS) datasets from ordinary differential equations and to select challenging instances via rejection sampling. Using this methodology, we create Physiome-ODE, a large and sophisticated benchmark of IMTS datasets consisting of 50 individual datasets, derived from real-world ordinary differential equations from research in biology. Physiome-ODE is the first benchmark for IMTS forecasting that we are aware of and an order of magnitude larger than the current evaluation setting of four datasets. Using our benchmark Physiome-ODE, we show qualitatively completely different results than those derived from the current four datasets: on Physiome-ODE ODE-based models can play to their strength and our benchmark can differentiate in a meaningful way between different IMTS forecasting models. This way, we expect to give a new impulse to research on ODE-based time series modeling.


AAAI Conference 2025 Conference Paper

Probabilistic Forecasting of Irregularly Sampled Time Series with Missing Values via Conditional Normalizing Flows

Vijaya Krishna Yalavarthi
Randolf Scholz
Stefan Born
Lars Schmidt-Thieme

Probabilistic forecasting of irregularly sampled multivariate time series with missing values is crucial for decision-making in various domains, including health care, astronomy, and climate. State-of-the-art methods estimate only marginal distributions of observations in single channels and at single timepoints, assuming a Gaussian distribution for the data. In this work, we propose a novel model, ProFITi using conditional normalizing flows to learn multivariate conditional distribution: joint distribution of the future values of the time series conditioned on past observations and specific channels and timepoints, without assuming any fixed shape of the underlying distribution. As model components, we introduce a novel invertible triangular attention layer and an invertible non-linear activation function on and onto the whole real line. Through extensive experiments on 4 real-world datasets, ProFITi demonstrates significant improvement, achieving an average log-likelihood gain of 2.0 compared to the previous state-of-the-art method.


NeurIPS Conference 2025 Conference Paper

Robust Hyperbolic Learning with Curvature-Aware Optimization

Ahmad Bdeir
Johannes Burchert
Lars Schmidt-Thieme
Niels Landwehr

Hyperbolic deep learning has become a growing research direction in computer vision due to the unique properties afforded by the alternate embedding space. The negative curvature and exponentially growing distance metric provide a natural framework for capturing hierarchical relationships between datapoints and allowing for finer separability between their embeddings. However, current hyperbolic learning approaches are still prone to overfitting, computationally expensive, and prone to instability, especially when attempting to learn the manifold curvature to adapt to tasks and different datasets. To address these issues, our paper presents a derivation for Riemannian AdamW that helps increase hyperbolic generalization ability. For improved stability, we introduce a novel fine-tunable hyperbolic scaling approach to constrain hyperbolic embeddings and reduce approximation errors. Using this along with our curvature-aware learning schema for Riemannian Optimizers enables the combination of curvature and non-trivialized hyperbolic parameter learning. Our approach demonstrates consistent performance improvements across Computer Vision, EEG classification, and hierarchical metric learning tasks while greatly reducing runtime.


ECAI Conference 2025 Conference Paper

TabResFlow: A Normalizing Spline Flow Model for Probabilistic Univariate Tabular Regression

Kiran Madhusudhanan
Vijaya Krishna Yalavarthi
Jonas Sonntag
Maximilian Stubbemann
Lars Schmidt-Thieme

Tabular regression is a well-studied problem with numerous industrial applications, yet most existing approaches focus on point estimation, often leading to overconfident predictions. This issue is particularly critical in industrial automation, where trustworthy decision-making is essential. Probabilistic regression models address this challenge by modeling prediction uncertainty. However, many conventional methods assume a fixed-shape distribution (typically Gaussian), and resort to estimating distribution parameters. This assumption is often restrictive, as real-world target distributions can be highly complex. To overcome this limitation, we introduce TabResFlow, a Normalizing Spline Flow model designed specifically for univariate tabular regression, where commonly used simple flow networks like RealNVP and Masked Autoregressive Flow (MAF) are unsuitable. TabResFlow consists of three key components: (1) An MLP encoder for each numerical feature. (2) A fully connected ResNet backbone for expressive feature extraction. (3) A conditional spline-based normalizing flow for flexible and tractable density estimation. We evaluate TabResFlow on nine public benchmark datasets, demonstrating that it consistently outperforms existing probabilistic regression models on likelihood scores. Our results demonstrate 9.64% improvement compared to the strongest probabilistic regression model (TreeFlow), and on average 5.6 times speed-up in inference time compared to the strongest deep learning alternative (NodeFlow). Additionally, we validate the practical applicability of TabResFlow in a real-world used car price prediction task under selective regression. To measure performance in this setting, we introduce a novel Area Under Risk Coverage (AURC) metric and show that TabResFlow achieves superior results across this metric.


NeurIPS Conference 2024 Conference Paper

A Cross-Domain Benchmark for Active Learning

Thorben Werner
Johannes Burchert
Maximilian Stubbemann
Lars Schmidt-Thieme

Active Learning (AL) deals with identifying the most informative samples for labeling to reduce data annotation costs for supervised learning tasks. AL research suffers from the fact that lifts from literature generalize poorly and that only a small number of repetitions of experiments are conducted. To overcome these obstacles, we propose CDALBench, the first active learning benchmark which includes tasks in computer vision, natural language processing and tabular learning. Furthermore, by providing an efficient, greedy oracle, CDALBench can be evaluated with 50 runs for each experiment. We show that both the cross-domain character and a large amount of repetitions are crucial for sophisticated evaluation of AL research. Concretely, we show that the superiority of specific methods varies over the different domains, making it important to evaluate Active Learning with a cross-domain benchmark. Additionally, we show that having a large amount of runs is crucial. With only conducting three runs as often done in the literature, the superiority of specific methods can strongly vary with the specific runs. This effect is so strong that, depending on the seed, even a well-established method's performance can be significantly better and significantly worse than random for the same dataset.


AAAI Conference 2024 Conference Paper

GraFITi: Graphs for Forecasting Irregularly Sampled Time Series

Vijaya Krishna Yalavarthi
Kiran Madhusudhanan
Randolf Scholz
Nourhan Ahmed
Johannes Burchert
Shayan Jawed
Stefan Born
Lars Schmidt-Thieme

Forecasting irregularly sampled time series with missing values is a crucial task for numerous real-world applications such as healthcare, astronomy, and climate sciences. State-of-the-art approaches to this problem rely on Ordinary Differential Equations (ODEs) which are known to be slow and often require additional features to handle missing values. To address this issue, we propose a novel model using Graphs for Forecasting Irregularly Sampled Time Series with missing values which we call GraFITi. GraFITi first converts the time series to a Sparsity Structure Graph which is a sparse bipartite graph, and then reformulates the forecasting problem as the edge weight prediction task in the graph. It uses the power of Graph Neural Networks to learn the graph and predict the target edge weights. GraFITi has been tested on 3 real-world and 1 synthetic irregularly sampled time series dataset with missing values and compared with various state-of-the-art models. The experimental results demonstrate that GraFITi improves the forecasting accuracy by up to 17% and reduces the run time up to 5 times compared to the state-of-the-art forecasting models.
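The graph construction described in this abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that observations arrive as (time, channel, value) triples, that each observed value becomes one edge between a time node and a channel node, and that each forecasting query becomes an edge whose weight is still unknown. The names BipartiteGraph and build_sparsity_graph are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class BipartiteGraph:
    """Hypothetical container: time nodes, channel nodes, weighted edges."""
    time_nodes: list      # sorted distinct timestamps
    channel_nodes: list   # sorted distinct channel names
    edges: dict           # (time_index, channel_index) -> value, None if to be predicted


def build_sparsity_graph(observations, queries):
    """Turn irregular observations into a sparse bipartite graph.

    observations: iterable of (time, channel, value); NaN or None marks a missing value.
    queries: iterable of (time, channel) pairs whose values should be forecast.
    """
    times = sorted({t for t, _, _ in observations} | {t for t, _ in queries})
    channels = sorted({c for _, c, _ in observations} | {c for _, c in queries})
    t_idx = {t: k for k, t in enumerate(times)}
    c_idx = {c: k for k, c in enumerate(channels)}

    edges = {}
    for t, c, v in observations:
        if v is None or (isinstance(v, float) and math.isnan(v)):
            continue  # a missing value simply produces no edge
        edges[(t_idx[t], c_idx[c])] = v
    for t, c in queries:
        # Query edges carry no weight yet; predicting it is the forecasting task.
        edges.setdefault((t_idx[t], c_idx[c]), None)
    return BipartiteGraph(times, channels, edges)


observations = [(0.0, "hr", 72.0), (0.5, "bp", 120.0),
                (1.3, "hr", 75.0), (1.3, "bp", float("nan"))]
graph = build_sparsity_graph(observations, queries=[(2.0, "hr")])
```

In this toy input the missing "bp" reading at time 1.3 adds no edge, so the graph stays as sparse as the data, and the query at time 2.0 appears as one edge with weight None.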


IJCAI Conference 2023 Conference Paper

Neural Capacitated Clustering

Jonas K. Falkner
Lars Schmidt-Thieme

Recent work on deep clustering has found new promising methods also for constrained clustering problems. Their typically pairwise constraints often can be used to guide the partitioning of the data. Many problems, however, feature cluster-level constraints, e.g. the Capacitated Clustering Problem (CCP), where each point has a weight and the total weight sum of all points in each cluster is bounded by a prescribed capacity. In this paper we propose a new method for the CCP, Neural Capacitated Clustering, that learns a neural network to predict the assignment probabilities of points to cluster centers from a data set of optimal or near optimal past solutions of other problem instances. During inference, the resulting scores are then used in an iterative k-means like procedure to refine the assignment under capacity constraints. In our experiments on artificial data and two real world datasets our approach outperforms several state-of-the-art mathematical and heuristic solvers from the literature. Moreover, we apply our method in the context of a cluster-first-route-second approach to the Capacitated Vehicle Routing Problem (CVRP) and show competitive results on the well-known Uchoa benchmark.


ICML Conference 2022 Conference Paper

Zero-shot AutoML with Pretrained Models

Ekrem Öztürk
Fabio Ferreira
Hadi S. Jomaa
Lars Schmidt-Thieme
Josif Grabocka
Frank Hutter

Given a new dataset D and a low compute budget, how should we choose a pre-trained model to fine-tune to D, and set the fine-tuning hyperparameters without risking overfitting, particularly if D is small? Here, we extend automated machine learning (AutoML) to best make these choices. Our domain-independent meta-learning approach learns a zero-shot surrogate model which, at test time, allows to select the right deep learning (DL) pipeline (including the pre-trained model and fine-tuning hyperparameters) for a new dataset D given only trivial meta-features describing D such as image resolution or the number of classes. To train this zero-shot model, we collect performance data for many DL pipelines on a large collection of datasets and meta-train on this data to minimize a pairwise ranking objective. We evaluate our approach under the strict time limit of the vision track of the ChaLearn AutoDL challenge benchmark, clearly outperforming all challenge contenders.


AAAI Conference 2015 Conference Paper

Integration and Evaluation of a Matrix Factorization Sequencer in Large Commercial ITS

Carlotta Schatten
Ruth Janning
Lars Schmidt-Thieme

Correct evaluation of Machine Learning based sequencers requires large data availability, large scale experiments and consideration of different evaluation measures. Such constraints make the construction of ad-hoc Intelligent Tutoring Systems (ITS) unfeasible and impose early integration in already existing ITS, which possess a large amount of tasks to be sequenced. However, such systems were not designed to be combined with Machine Learning methods and require several adjustments. As a consequence, more than half of the components based on recommender technology are never evaluated with an online experiment. In this paper we show how we adapted a Matrix Factorization based performance predictor and a score based policy for task sequencing to be integrated in a commercial ITS with over 2000 tasks on 20 topics. We evaluated the experiment under different perspectives in comparison with the ITS sequencer designed by experts over the years. As a result we achieve the same post-test results and outperform the current sequencer in the perceived experience questionnaire with almost no curriculum authoring effort. We also showed that the sequencer possesses a better user modeling, better adapting to the knowledge acquisition rate of the students.


AAAI Conference 2012 Conference Paper

Classification of Sparse Time Series via Supervised Matrix Factorization

Josif Grabocka
Alexandros Nanopoulos
Lars Schmidt-Thieme

Data sparsity is an emerging real-world problem observed in various domains ranging from sensor networks to medical diagnosis. Consecutively, numerous machine learning methods were modeled to treat missing values. Nevertheless, sparsity, defined as missing segments, has not been thoroughly investigated in the context of time-series classification. We propose a novel principle for classifying time series, which in contrast to existing approaches, avoids reconstructing the missing segments in time series and operates solely on the observed ones. Based on the proposed principle, we develop a method that prevents adding noise that incurs during the reconstruction of the original time series. Our method adapts supervised matrix factorization by projecting time series in a latent space through stochastic learning. Furthermore the projected data is built in a supervised fashion via a logistic regression. Abundant experiments on a large collection of 37 data sets demonstrate the superiority of our method, which in the majority of cases outperforms a set of baselines that do not follow our proposed principle.


UAI Conference 2009 Conference Paper

BPR: Bayesian Personalized Ranking from Implicit Feedback

Steffen Rendle
Christoph Freudenthaler
Zeno Gantner
Lars Schmidt-Thieme

Item recommendation is the task of predicting a personalized ranking on a set of items (e.g. websites, movies, products). In this paper, we investigate the most common scenario with implicit feedback (e.g. clicks, purchases). There are many methods for item recommendation from implicit feedback like matrix factorization (MF) or adaptive k-nearest-neighbor (kNN). Even though these methods are designed for the item prediction task of personalized ranking, none of them is directly optimized for ranking. In this paper we present a generic optimization criterion BPR-Opt for personalized ranking that is the maximum posterior estimator derived from a Bayesian analysis of the problem. We also provide a generic learning algorithm for optimizing models with respect to BPR-Opt. The learning method is based on stochastic gradient descent with bootstrap sampling. We show how to apply our method to two state-of-the-art recommender models: matrix factorization and adaptive kNN. Our experiments indicate that for the task of personalized ranking our optimization method outperforms the standard learning techniques for MF and kNN. The results show the importance of optimizing models for the right criterion.
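The learning scheme named in this abstract, stochastic gradient steps on uniformly sampled triples, can be sketched for the matrix factorization case. This is a minimal illustration under stated assumptions, not the paper's reference code. It assumes the pairwise objective ln sigmoid(x_ui - x_uj) with L2 regularization, where user u interacted with item i but not with item j. The function name train_bpr_mf and all hyperparameter values are illustrative choices.

```python
import numpy as np


def train_bpr_mf(interactions, n_users, n_items, dim=8, lr=0.05, reg=0.01,
                 steps=20000, seed=0):
    """Fit user and item factors by stochastic gradient ascent on a BPR-style objective."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, dim))   # user factors
    Q = rng.normal(scale=0.1, size=(n_items, dim))   # item factors

    seen = {}
    for u, i in interactions:
        seen.setdefault(u, set()).add(i)
    # Keep only pairs whose user still has at least one unobserved item to sample.
    pairs = [(u, i) for u, i in interactions if len(seen[u]) < n_items]
    if not pairs:
        raise ValueError("no user has an unobserved item to sample as a negative")

    for _ in range(steps):
        # Draw an observed (user, item) pair uniformly with replacement,
        # then resample until an unobserved item j is found for that user.
        u, i = pairs[int(rng.integers(len(pairs)))]
        j = int(rng.integers(n_items))
        while j in seen[u]:
            j = int(rng.integers(n_items))

        pu = P[u].copy()
        x_uij = pu @ (Q[i] - Q[j])           # score difference x_ui - x_uj
        g = 1.0 / (1.0 + np.exp(x_uij))      # derivative of ln sigmoid at x_uij

        P[u] += lr * (g * (Q[i] - Q[j]) - reg * pu)
        Q[i] += lr * (g * pu - reg * Q[i])
        Q[j] += lr * (-g * pu - reg * Q[j])
    return P, Q


# Toy data: users 0 and 1 interact with items 0-2, users 2 and 3 with items 3-5.
interactions = [(u, i) for u in (0, 1) for i in (0, 1, 2)]
interactions += [(u, i) for u in (2, 3) for i in (3, 4, 5)]
P, Q = train_bpr_mf(interactions, n_users=4, n_items=6)
scores = P @ Q.T   # scores[u, i] ranks item i for user u
```

Each step touches only one user row and two item rows, which is why this kind of per-triple update stays cheap even though the number of possible triples grows with users times items squared.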


IS Journal 2007 Journal Article

Guest Editors' Introduction: Recommender Systems

Alexander Felfernig
Gerhard Friedrich
Lars Schmidt-Thieme

This special issue presents eight articles, five long and three short, on techniques to improve recommender systems. They cover improving such aspects as user interaction with recommenders, the quality of results presented to users, and user trust in presented recommendations. This article is part of a special issue on Recommender Systems.
