Arrow Research · Search

Author name cluster

Tudor Berariu

Papers possibly associated with this exact author name in Arrow. This page groups case-insensitive exact-name matches; it is not a full identity-disambiguation profile.

3 papers
2 author rows

Possible papers (3)

NeurIPS 2025 · Conference Paper

Incremental Sequence Classification with Temporal Consistency

  • Lucas Maystre
  • Gabriel Barello
  • Tudor Berariu
  • Aleix Cambray
  • Rares Dolga
  • Alvaro Ortega Gonzalez
  • Andrei Nica
  • David Barber

We address the problem of incremental sequence classification, where predictions are updated as new elements in the sequence are revealed. Drawing on temporal-difference learning from reinforcement learning, we identify a temporal-consistency condition that successive predictions should satisfy. We leverage this condition to develop a novel loss function for training incremental sequence classifiers. Through a concrete example, we demonstrate that optimizing this loss can offer substantial gains in data efficiency. We apply our method to text classification tasks and show that it improves predictive accuracy over competing approaches on several benchmark datasets. We further evaluate our approach on the task of verifying large language model generations for correctness in grade-school math problems. Our results show that models trained with our method are better able to distinguish promising generations from unpromising ones after observing only a few tokens.
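To make the temporal-consistency idea concrete, here is a minimal sketch, assuming a PyTorch model that emits class logits for every prefix of a sequence. The function name, the use of the next step's detached prediction as a bootstrap target, and the final_weight knob are illustrative assumptions, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def temporal_consistency_loss(logits, label, final_weight=1.0):
    """TD-style consistency loss for incremental sequence classification.

    Hypothetical sketch, not the authors' exact formulation.

    logits: (T, C) per-prefix class logits for one sequence of length T.
    label:  scalar class index for the full sequence.
    """
    log_probs = F.log_softmax(logits, dim=-1)      # (T, C)
    with torch.no_grad():
        # Bootstrap targets: the (detached) prediction made one step later.
        targets = F.softmax(logits[1:], dim=-1)    # (T-1, C)
    # Cross-entropy between each step's prediction and the next step's.
    consistency = -(targets * log_probs[:-1]).sum(-1).mean()
    # Standard supervised loss on the final, fully observed prefix.
    final = F.cross_entropy(logits[-1:], label.view(1))
    return consistency + final_weight * final
```

Applied per sequence during training, this pulls early-prefix predictions toward later, better-informed ones, which is where the data-efficiency gains the abstract describes would come from.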

IJCAI 2024 · Conference Paper

Finding Increasingly Large Extremal Graphs with AlphaZero and Tabu Search

  • Abbas Mehrabian
  • Ankit Anand
  • Hyunjik Kim
  • Nicolas Sonnerat
  • Matej Balog
  • Gheorghe Comanici
  • Tudor Berariu
  • Andrew Lee

This work proposes a new learning-to-search benchmark and uses AI to discover new mathematical knowledge related to an open conjecture of Erdős (1975) in extremal graph theory. The problem is to find graphs of a given size (number of nodes) that maximize the number of edges without containing 3- or 4-cycles. We formulate this as a sequential decision-making problem and compare AlphaZero, a neural network-guided tree search, with tabu search, a heuristic local search method. With either method, introducing a curriculum (jump-starting the search for larger graphs from good graphs found at smaller sizes) improves the state-of-the-art lower bounds for several sizes. We also propose a flexible graph-generation environment and a permutation-invariant network architecture for learning to search in the space of graphs.
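A minimal, hypothetical sketch of the tabu-search half of the pipeline: the state is an adjacency structure and a move is a single edge flip, rejected if it would create a 3- or 4-cycle (i.e. drop the girth below 5). The edge-flip move set, tabu tenure, and random-removal probability are illustrative choices, not the paper's.

```python
import random

def creates_short_cycle(adj, u, v):
    """Would adding edge (u, v) create a 3- or 4-cycle?"""
    if adj[u] & adj[v]:                      # shared neighbour -> triangle
        return True
    for w in adj[u]:                         # u-w-x-v path -> 4-cycle
        if adj[w] & adj[v]:
            return True
    return False

def tabu_search(n, steps=20000, tenure=50, drop_prob=0.05, seed=0):
    """Edge-flip local search maximising edges at girth >= 5 (illustrative)."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    edges, tabu = set(), {}
    best_count, best_edges = 0, set()
    for t in range(steps):
        u, v = rng.sample(range(n), 2)
        e = (min(u, v), max(u, v))
        if tabu.get(e, -1) > t:              # recently flipped: move is tabu
            continue
        if e in edges:
            if rng.random() < drop_prob:     # occasional removal escapes optima
                edges.discard(e); adj[u].discard(v); adj[v].discard(u)
                tabu[e] = t + tenure
        elif not creates_short_cycle(adj, u, v):
            edges.add(e); adj[u].add(v); adj[v].add(u)
            tabu[e] = t + tenure
        if len(edges) > best_count:
            best_count, best_edges = len(edges), set(edges)
    return best_count, best_edges
```

Something like tabu_search(30) would then hunt for the densest triangle- and square-free graph on 30 nodes; the curriculum step described in the abstract would seed adj and edges from a good smaller graph instead of starting empty.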

ICML 2021 · Conference Paper

Spectral Normalisation for Deep Reinforcement Learning: An Optimisation Perspective

  • Florin Gogianu
  • Tudor Berariu
  • Mihaela Rosca
  • Claudia Clopath
  • Lucian Busoniu
  • Razvan Pascanu

Most recent advances in deep reinforcement learning take an RL-centric perspective and focus on refinements of the training objective. We diverge from this view and show that we can recover the performance of these developments not by changing the objective, but by regularising the value-function estimator. Constraining the Lipschitz constant of a single layer using spectral normalisation is sufficient to elevate the performance of a Categorical-DQN agent to that of a more elaborate agent on the challenging Atari domain. We conduct ablation studies to disentangle the various effects normalisation has on the learning dynamics, and show that modulating the parameter updates alone recovers most of the performance of spectral normalisation. These findings suggest that tackling the peculiarities of deep reinforcement learning also requires attention to the neural component and its learning dynamics.
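A minimal sketch of the core intervention, assuming PyTorch: the layer widths and the choice of which layer to normalise are placeholders, while torch.nn.utils.parametrizations.spectral_norm is the standard built-in that divides a layer's weight by an estimate of its largest singular value (obtained via power iteration), bounding that layer's Lipschitz constant by roughly 1.

```python
import torch.nn as nn
from torch.nn.utils.parametrizations import spectral_norm

def make_value_head(in_features, n_actions, n_atoms=51):
    """Distributional (C51-style) value head with spectral normalisation
    on a single hidden layer. Illustrative shape, not the paper's network."""
    return nn.Sequential(
        nn.Linear(in_features, 512),
        nn.ReLU(),
        # Only this layer is normalised: its weight is rescaled by its
        # estimated largest singular value on every forward pass.
        spectral_norm(nn.Linear(512, 512)),
        nn.ReLU(),
        nn.Linear(512, n_actions * n_atoms),
    )
```

Normalising a single layer rather than the whole network is the point of the abstract: a one-line change to the estimator, with no modification to the RL objective itself.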