Efficient PAC Reinforcement Learning in Regular Decision Processes

Alessandro Ronca; Giuseppe De Giacomo

PRL 2021

Workshop paper (accepted submission) · Artificial Intelligence · Automated Planning · Reinforcement Learning

Abstract

Recently, regular decision processes have been proposed as a well-behaved form of non-Markov decision process. Regular decision processes are characterised by a transition function and a reward function that depend on the whole history, though regularly (as in regular languages). In practice, both the transition and the reward functions can be seen as finite transducers. We study reinforcement learning in regular decision processes. Our main contribution is to show that a near-optimal policy can be PAC-learned in time polynomial in a set of parameters that describe the underlying decision process. We argue that the identified set of parameters is minimal and that it reasonably captures the difficulty of a regular decision process.
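
To make the "finite transducer" view concrete, the following is a minimal sketch of a history-dependent reward function encoded as a deterministic Moore machine. It is an illustration only, not the construction used in the paper: the class name `MooreTransducer`, the action symbols `press` and `wait`, and the parity-based reward are all hypothetical, and the sketch assumes a deterministic machine over actions alone, whereas a regular decision process also involves observations and probabilistic transitions.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, Iterable, Tuple


@dataclass
class MooreTransducer:
    """Hypothetical deterministic finite transducer over history symbols."""
    initial: Hashable
    delta: Dict[Tuple[Hashable, Hashable], Hashable]  # (state, symbol) -> next state
    output: Dict[Hashable, float]                     # state -> emitted value

    def run(self, history: Iterable[Hashable]) -> Hashable:
        # Fold the whole history into a single automaton state.
        state = self.initial
        for symbol in history:
            state = self.delta[(state, symbol)]
        return state

    def value(self, history: Iterable[Hashable]) -> float:
        # The emitted value depends on the history only through the state reached.
        return self.output[self.run(history)]


# Illustrative reward: 1.0 iff "press" has occurred an odd number of times.
# This is non-Markov in the last symbol alone, yet regular: two states suffice.
reward_fn = MooreTransducer(
    initial="even",
    delta={
        ("even", "press"): "odd",
        ("even", "wait"): "even",
        ("odd", "press"): "even",
        ("odd", "wait"): "odd",
    },
    output={"even": 0.0, "odd": 1.0},
)

print(reward_fn.value(["press", "wait"]))
print(reward_fn.value(["press", "press"]))
```

The point of the sketch is that the reward cannot be read off the most recent symbol, but the automaton state summarises everything about the history that matters, which is what "depending on the history regularly" amounts to.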

Context

Venue: Bridging the Gap Between AI Planning and Reinforcement Learning
Archive span: 2020-2025
Indexed papers: 151
Paper id: 47006327021969949