Arrow Research search

Author name cluster

Leonhard Staut

Papers possibly associated with this exact author name in Arrow. This page groups case-insensitive exact name matches; it is not a full identity-disambiguation profile.

2 papers
1 author row

Possible papers

AAAI 2020 · Conference Paper

Beliefs We Can Believe In: Replacing Assumptions with Data in Real-Time Search

  • Maximilian Fickert
  • Tianyi Gu
  • Leonhard Staut
  • Wheeler Ruml
  • Joerg Hoffmann
  • Marek Petrik

Suboptimal heuristic search algorithms can benefit from reasoning about heuristic error, especially in a real-time setting where there is not enough time to search all the way to a goal. However, current reasoning methods implicitly or explicitly incorporate assumptions about the cost-to-go function. We consider a recent real-time search algorithm, called Nancy, that manipulates explicit beliefs about the cost-to-go. The original presentation of Nancy assumed that these beliefs are Gaussian, with parameters following a certain form. In this paper, we explore how to replace these assumptions with actual data. We develop a data-driven variant of Nancy, DDNancy, that bases its beliefs on heuristic performance statistics from the same domain. We extend Nancy and DDNancy with the notion of persistence and prove their completeness. Experimental results show that DDNancy can perform well in domains in which the original assumption-based Nancy performs poorly.
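The contrast the abstract draws between assumption-based and data-driven beliefs can be sketched as follows. This is a minimal Python illustration, not the paper's exact definitions: the function names, the variance scaling, and the shifted-error construction are illustrative assumptions.

```python
import random

def gaussian_belief(h, samples=1000):
    # Assumption-based belief (in the spirit of the original Nancy):
    # model cost-to-go as Gaussian with mean h and a spread that grows
    # with h. The 0.1 * h scaling is an illustrative assumption, not
    # the paper's exact parameterization.
    sigma = max(1e-6, 0.1 * h)
    return [random.gauss(h, sigma) for _ in range(samples)]

def data_driven_belief(h, error_samples):
    # Data-driven belief (in the spirit of DDNancy): shift observed
    # heuristic errors (true cost-to-go minus h, collected offline from
    # solved instances in the same domain) by the current h value.
    return [h + e for e in error_samples]
```

For example, with error samples [0.0, 1.0, -1.0] and h = 5.0, the data-driven belief is the sample set [5.0, 6.0, 4.0]; as more error samples are gathered, the belief reflects the domain's actual heuristic behavior rather than a fixed distributional assumption.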

PRL 2020 · Workshop Paper

Real-time Planning as Data-driven Decision-making

  • Maximilian Fickert
  • Tianyi Gu
  • Leonhard Staut
  • Sai Lekyang
  • Wheeler Ruml
  • Joerg Hoffmann
  • Marek Petrik

If reinforcement learning (RL) is the use of incrementally gathered data to drive decision-making, then any heuristic search strategy is fundamentally an RL process. This is perhaps clearest in real-time planning, where an agent must select the next action to take within a fixed time bound. Even in deterministic domains, real-time action selection inherently suffers from uncertainty about those portions of the state space that have not yet been computed by the lookahead search. In this paper, we present new results in a line of research that explores how an agent can benefit from metareasoning about this uncertainty. Taking inspiration from prior work in distributional methods from RL, the Nancy search framework represents its uncertainty explicitly as beliefs over cost-to-go. Nancy then expands nodes so as to minimize the expected regret in case a non-optimal action is chosen. We present detailed results showing how beliefs can be informed by prior experience and we experimentally compare Nancy against both conventional real-time search algorithms like LSS-LRTA* and approaches from RL that exploit uncertainty, such as Monte Carlo tree search and Kaelbling’s interval estimation. We find that Nancy generally outperforms previous methods, particularly on more difficult problems. This work illustrates how the distributional perspective from Bayesian RL can be adapted to deterministic planning settings, and how deterministic planning can provide useful testbeds for methods that metareason about uncertainty during planning.
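The regret-based selection idea described above can be sketched in a few lines. This is a hypothetical Python illustration under simplified assumptions, not the paper's actual procedure: given belief samples over total cost for each top-level action, it estimates the expected regret of committing now to the action whose belief has the lowest mean.

```python
import statistics

def expected_regret(beliefs):
    # beliefs: dict mapping each top-level action to a list of samples
    # drawn from its belief over total solution cost (a hypothetical
    # interface chosen for this sketch).
    actions = list(beliefs)
    # Commit to the action whose belief has the lowest sample mean.
    chosen = min(actions, key=lambda a: statistics.mean(beliefs[a]))
    # Expected regret: on average, how much worse the chosen action's
    # cost is than the best cost among all actions, paired sample by
    # sample (a crude joint-sampling approximation).
    n = min(len(beliefs[a]) for a in actions)
    total = 0.0
    for i in range(n):
        best = min(beliefs[a][i] for a in actions)
        total += beliefs[chosen][i] - best
    return total / n
```

For instance, with beliefs {'a': [1.0, 4.0], 'b': [4.0, 3.0]}, action 'a' has the lower mean (2.5) and is chosen, yet in the second sample 'b' is cheaper, giving an expected regret of 0.5; a search strategy in this spirit would direct further lookahead at whichever action most reduces that quantity.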