Author name cluster

Julien Schroeter

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

2 papers

2 author rows

AAAI Conference 2021 Conference Paper

Learning Precise Temporal Point Event Detection with Misaligned Labels

Julien Schroeter
Kirill Sidorov
David Marshall

This work addresses the problem of robustly learning precise temporal point event detection despite only having access to poorly aligned labels for training. While standard (cross entropy-based) methods work well in noise-free setting, they often fail when labels are unreliable since they attempt to strictly fit the annotations. A common solution to this drawback is to transform the point prediction problem into a distribution prediction problem. However, we show that this approach raises several issues that negatively affect the robust learning of temporal localization. Thus, in an attempt to overcome these shortcomings, we introduce a simple and versatile training paradigm combining soft localization learning with counting-based sparsity regularization. In fact, unlike its counterparts, our approach allows to directly infer clear-cut point predictions in an end-to-end fashion while relaxing the reliance of the training on the exact position of labels. We achieve state-of-the-art performance against standard benchmarks in a number of challenging experiments (e. g. , detection of instantaneous events in videos and music transcription) by simply replacing the original loss function with our novel alternative—without any additional fine-tuning.

PDF Details

ICML Conference 2019 Conference Paper

Weakly-Supervised Temporal Localization via Occurrence Count Learning

Julien Schroeter
Kirill A. Sidorov
A. David Marshall

We propose a novel model for temporal detection and localization which allows the training of deep neural networks using only counts of event occurrences as training labels. This powerful weakly-supervised framework alleviates the burden of the imprecise and time consuming process of annotating event locations in temporal data. Unlike existing methods, in which localization is explicitly achieved by design, our model learns localization implicitly as a byproduct of learning to count instances. This unique feature is a direct consequence of the model’s theoretical properties. We validate the effectiveness of our approach in a number of experiments (drum hit and piano onset detection in audio, digit detection in images) and demonstrate performance comparable to that of fully-supervised state-of-the-art methods, despite much weaker training requirements.

Details