
PRL 2020

Real-time Planning as Data-driven Decision-making

Workshop Paper · Accepted submission · Artificial Intelligence · Automated Planning · Reinforcement Learning

Abstract

If reinforcement learning (RL) is the use of incrementally gathered data to drive decision-making, then any heuristic search strategy is fundamentally an RL process. This is perhaps clearest in real-time planning, where an agent must select its next action within a fixed time bound. Even in deterministic domains, real-time action selection inherently suffers from uncertainty about those portions of the state space that the lookahead search has not yet computed. In this paper, we present new results in a line of research that explores how an agent can benefit from metareasoning about this uncertainty. Taking inspiration from prior work on distributional methods in RL, the Nancy search framework represents its uncertainty explicitly as beliefs over cost-to-go. Nancy then expands nodes so as to minimize the expected regret incurred if a non-optimal action is chosen. We present detailed results showing how beliefs can be informed by prior experience, and we experimentally compare Nancy against both conventional real-time search algorithms, such as LSS-LRTA*, and approaches from RL that exploit uncertainty, such as Monte Carlo tree search and Kaelbling's interval estimation. We find that Nancy generally outperforms previous methods, particularly on more difficult problems. This work illustrates how the distributional perspective from Bayesian RL can be adapted to deterministic planning settings, and how deterministic planning can provide useful testbeds for methods that metareason about uncertainty during planning.
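The core idea the abstract describes, maintaining explicit beliefs over cost-to-go and scoring top-level actions by the expected regret of committing to one, can be sketched in a few lines of Python. The sketch below is purely illustrative and is not the authors' implementation: the Gaussian `Belief` summary, the Monte Carlo `expected_regret` estimator, and the `interval_estimation_pick` baseline are hypothetical names and simplifications assumed for this example; Nancy's actual belief representation and expansion strategy are detailed in the paper.

```python
# Illustrative sketch only (not the paper's code): Gaussian beliefs over
# cost-to-go, expected regret of committing to an action, and a
# Kaelbling-style interval-estimation baseline for comparison.
import random
from dataclasses import dataclass


@dataclass
class Belief:
    """Hypothetical summary of a belief over cost-to-go through one action."""
    mu: float      # expected cost-to-go
    sigma: float   # residual uncertainty left by the bounded lookahead

    def sample(self) -> float:
        return random.gauss(self.mu, self.sigma)


def expected_regret(beliefs: dict[str, Belief], chosen: str,
                    n: int = 10_000) -> float:
    """Monte Carlo estimate of E[f(chosen) - min_a f(a)] under the beliefs."""
    total = 0.0
    for _ in range(n):
        draws = {a: b.sample() for a, b in beliefs.items()}
        total += draws[chosen] - min(draws.values())
    return total / n


def interval_estimation_pick(beliefs: dict[str, Belief], z: float = 1.96) -> str:
    """Interval estimation: prefer the lowest optimistic (lower-bound) cost."""
    return min(beliefs, key=lambda a: beliefs[a].mu - z * beliefs[a].sigma)


if __name__ == "__main__":
    random.seed(0)
    beliefs = {
        "left":  Belief(mu=10.0, sigma=0.5),  # best mean, low uncertainty
        "right": Belief(mu=10.5, sigma=3.0),  # worse mean, high uncertainty
    }
    greedy = min(beliefs, key=lambda a: beliefs[a].mu)
    print("greedy pick:", greedy,
          "expected regret:", round(expected_regret(beliefs, greedy), 3))
    print("interval-estimation pick:", interval_estimation_pick(beliefs))
```

On this toy instance, committing to the greedy action still carries nonzero expected regret because the uncertain alternative might turn out cheaper; a regret-aware search would spend its remaining expansions reducing exactly that uncertainty, which is the contrast with baselines such as interval estimation that the paper's experiments examine.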

Authors

Keywords

No keywords are indexed for this paper.

Context

Venue
Bridging the Gap Between AI Planning and Reinforcement Learning
Archive span
2020-2025
Indexed papers
151
Paper id
833267342313017106