
RLDM 2013

Towards a practical Bayes-optimal agent

Conference Abstract Accepted abstract Artificial Intelligence · Decision Making · Machine Learning · Reinforcement Learning

Abstract

Only rich and sophisticated statistical models are adequate for agents that must learn to navigate complex environments. However, it has not been clear how methods for planning can take advantage of models, such as those incorporating Bayesian non-parametric devices, that are sufficiently intricate as to demand approximate sampling schemes. We show that Bayes-Adaptive planning can be combined in a principled way with approximate sampling, and demonstrate the power of the resulting method in a challenging task involving safe exploration which defeats myopic methods such as Thompson Sampling. This highlights the importance of propagating beliefs in realistic cases involving trade-offs between exploration and exploitation. The next challenge is to employ function approximation to represent the belief-state value to improve search efficiency further and thus enable longer search horizons.
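The abstract contrasts Bayes-Adaptive planning with myopic Thompson Sampling. As an illustration of the baseline being criticized (not the paper's own method), the following is a minimal sketch of Thompson Sampling on a Bernoulli bandit; the function name, arm probabilities, and horizon are illustrative choices, not taken from the paper:

```python
import random

def thompson_sampling(true_probs, horizon, seed=0):
    """Myopic Thompson Sampling on a Bernoulli multi-armed bandit.

    Maintains a Beta(successes + 1, failures + 1) posterior per arm,
    draws one sample per arm each step, and pulls the arm with the
    highest draw. It is myopic in the sense the abstract highlights:
    it never plans ahead over how today's action will change
    tomorrow's beliefs, which is what Bayes-Adaptive planning does.
    """
    rng = random.Random(seed)
    n = len(true_probs)
    succ = [0] * n  # observed successes per arm
    fail = [0] * n  # observed failures per arm
    total_reward = 0
    for _ in range(horizon):
        # Sample a plausible success probability from each arm's posterior.
        draws = [rng.betavariate(succ[i] + 1, fail[i] + 1) for i in range(n)]
        arm = max(range(n), key=lambda i: draws[i])
        reward = 1 if rng.random() < true_probs[arm] else 0
        total_reward += reward
        if reward:
            succ[arm] += 1
        else:
            fail[arm] += 1
    return total_reward, succ, fail
```

A safe-exploration task of the kind the abstract describes defeats this scheme precisely because the posterior sampling step values each action only by its immediate plausibility, not by the information (or danger) the action exposes over the remaining horizon.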

Authors

No authors are indexed for this paper.

Keywords

No keywords are indexed for this paper.

Context

Venue
Multidisciplinary Conference on Reinforcement Learning and Decision Making
Archive span
2013-2025
Indexed papers
1004
Paper id
822117428544975609