RLDM 2013
Towards a practical Bayes-optimal agent
Abstract
Only rich and sophisticated statistical models are adequate for agents that must learn to navigate complex environments. However, it has not been clear how planning methods can take advantage of models, such as those incorporating Bayesian non-parametric devices, that are sufficiently intricate as to demand approximate sampling schemes. We show that Bayes-Adaptive planning can be combined in a principled way with approximate sampling, and demonstrate the power of the resulting method in a challenging safe-exploration task that defeats myopic methods such as Thompson Sampling. This highlights the importance of propagating beliefs in realistic cases involving trade-offs between exploration and exploitation. The next challenge is to employ function approximation to represent the belief-state value, improving search efficiency further and thus enabling longer search horizons.
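To make the contrast concrete, the following is a minimal sketch of the Thompson Sampling baseline the abstract describes as myopic, for a Bernoulli bandit. This is an illustration of the generic algorithm, not the paper's Bayes-Adaptive planner; the arm probabilities, horizon, and function name are all hypothetical. Each step draws a single posterior sample per arm and acts greedily on it, so the policy never values information for its own sake, whereas Bayes-Adaptive planning propagates beliefs forward through future decisions.

```python
import random

def thompson_sampling(true_probs, horizon=1000, seed=0):
    """Thompson Sampling for a Bernoulli bandit (illustrative sketch).

    Maintains an independent Beta posterior per arm; at each step it
    samples one success probability per arm from the posterior and
    pulls the arm with the largest sample. Returns total reward.
    """
    rng = random.Random(seed)
    k = len(true_probs)
    alpha = [1] * k  # Beta(alpha, beta) posterior: alpha = successes + 1
    beta = [1] * k   # beta = failures + 1
    total_reward = 0
    for _ in range(horizon):
        # One posterior sample per arm -- this single draw is why the
        # method is myopic: it is greedy with respect to the sample and
        # does not plan over how future observations change the belief.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        reward = 1 if rng.random() < true_probs[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        total_reward += reward
    return total_reward

print(thompson_sampling([0.3, 0.7]))
```

In a safe-exploration setting of the kind the abstract describes, this per-step sampling can commit to a risky arm without accounting for the cost of the information-gathering act itself, which is where belief-propagating (Bayes-Adaptive) planning differs.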
Authors
Keywords
No keywords are indexed for this paper.
Context
- Venue
- Multidisciplinary Conference on Reinforcement Learning and Decision Making
- Archive span
- 2013-2025
- Indexed papers
- 1004
- Paper id
- 822117428544975609