AAMAS 2016
Boosting Nonparametric Policies
Abstract
Learning complex policies is a key step toward real-world applications of reinforcement learning. While boosting has been widely applied in state-of-the-art supervised learning to adaptively learn nonparametric functions, boosting-style approaches have received little attention in reinforcement learning. The few previous works that explored this direction leave the theoretical properties unclear, and their empirical performance is quite limited. In this paper, we propose the PolicyBoost method, which employs the GradientBoost approach to optimize a finite-sample objective function, leading to maximization of the expected total reward. Experimental results verify the effectiveness and robustness of PolicyBoost, even without feature engineering.
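The abstract's core idea, representing a policy as an additive ensemble of weak learners and growing it along the functional gradient of the reward objective, can be illustrated with a minimal sketch. This is not the paper's exact algorithm: it assumes a toy continuous-action bandit where `reward(s, a) = -(a - s)**2` (so the optimal action is `a = s`), uses regression stumps as the weak learners, and boosts the policy toward higher reward.

```python
import random

# Toy continuous-action bandit: reward(s, a) = -(a - s)**2, so the
# reward-maximizing action in state s is a = s. Sketch of a
# boosting-style policy update: the policy is an additive ensemble of
# weak learners, and each round fits a new weak learner to the
# functional gradient of the expected reward.

def reward_grad(s, a):
    # d/da of -(a - s)^2: points in the direction of higher reward.
    return -2.0 * (a - s)

class Stump:
    """Weak learner: a one-split regression stump on a scalar state."""
    def __init__(self, threshold, left, right):
        self.threshold, self.left, self.right = threshold, left, right
    def __call__(self, s):
        return self.left if s < self.threshold else self.right

def fit_stump(states, targets):
    # Exhaustively pick the split minimizing squared error (tiny data).
    best = None
    for t in states:
        lo = [y for s, y in zip(states, targets) if s < t]
        hi = [y for s, y in zip(states, targets) if s >= t]
        if not lo or not hi:
            continue
        ml, mh = sum(lo) / len(lo), sum(hi) / len(hi)
        err = sum((y - ml) ** 2 for s, y in zip(states, targets) if s < t) \
            + sum((y - mh) ** 2 for s, y in zip(states, targets) if s >= t)
        if best is None or err < best[0]:
            best = (err, Stump(t, ml, mh))
    if best is None:  # all states identical: fall back to a constant
        mean = sum(targets) / len(targets)
        return Stump(0.0, mean, mean)
    return best[1]

def boost_policy(rounds=200, lr=0.1, seed=0):
    rng = random.Random(seed)
    ensemble = []  # policy(s) = sum of weak learners
    policy = lambda s: sum(h(s) for h in ensemble)
    for _ in range(rounds):
        states = [rng.uniform(-1.0, 1.0) for _ in range(64)]
        # Functional gradient of the reward at the current policy's actions.
        grads = [reward_grad(s, policy(s)) for s in states]
        h = fit_stump(states, grads)
        ensemble.append(lambda s, h=h: lr * h(s))
    return policy

policy = boost_policy()
# After boosting, policy(s) approximates the optimal action a* = s on [-1, 1].
```

The boosted policy is nonparametric: its complexity grows with the number of rounds rather than being fixed in advance, which is the property the abstract highlights over parametric policy classes.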
Authors
Keywords
Context
- Venue
- International Conference on Autonomous Agents and Multiagent Systems
- Archive span
- 2002-2025
- Indexed papers
- 7403
- Paper id
- 221865722652559729