
AAMAS 2010

PAC-MDP Learning with Knowledge-based Admissible Models

Conference Paper Session 6 - Learning I Autonomous Agents and Multiagent Systems

Abstract

PAC-MDP algorithms approach the exploration-exploitation problem of reinforcement learning agents in an effective way which guarantees that, with high probability, the algorithm performs near-optimally for all but a polynomial number of steps. The performance of these algorithms can be further improved by incorporating domain knowledge to guide their learning process. In this paper we propose a framework to use partial knowledge about the effects of actions in a theoretically well-founded way. Empirical evaluation shows that our proposed method is more efficient than reward shaping, which represents an alternative approach to incorporating background knowledge. Our solution is also very competitive when compared with the Bayesian Exploration Bonus (BEB) algorithm. BEB is not PAC-MDP; however, it can exploit domain knowledge via informative priors. We show how to use the same kind of knowledge in the PAC-MDP framework in a way which preserves all theoretical guarantees of PAC-MDP learning.

Keywords

  • Domain knowledge
  • Heuristics
  • Reinforcement learning

Context

Venue
International Conference on Autonomous Agents and Multiagent Systems