
AAMAS 2016

Boosting Nonparametric Policies

Conference Paper · Learning II · Autonomous Agents and Multiagent Systems

Abstract

Learning complex policies is a key step toward real-world applications of reinforcement learning. While boosting has been widely used in state-of-the-art supervised learning techniques to adaptively learn nonparametric functions, boosting-style approaches have received little attention in reinforcement learning. Only a few previous works have explored this direction, and their theoretical properties remain unclear while their empirical performance is quite limited. In this paper, we propose the PolicyBoost method. It optimizes a finite-sample objective function, whose maximization leads to maximization of the expected total reward, by employing the GradientBoost approach. Experimental results verify the effectiveness as well as the robustness of PolicyBoost, even without feature engineering.
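To make the idea concrete, the following is a minimal sketch of a boosting-style, REINFORCE-flavoured policy learner on a toy one-step task: the policy score is a nonparametric sum of regression trees, and each round fits a new tree to an estimated functional gradient of the expected reward. All names, the learning rate, and the toy task are assumptions for illustration; this is not the paper's PolicyBoost algorithm.

```python
# Sketch only: boosting a nonparametric policy score F(s, a) with regression
# trees, in the spirit of GradientBoost; details are assumptions, not the
# paper's method.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
N_ACTIONS = 2

def policy_scores(trees, states):
    """F(s, a): a learning-rate-weighted sum of regression trees per action."""
    f = np.zeros((len(states), N_ACTIONS))
    for a in range(N_ACTIONS):
        for lr, tree in trees[a]:
            f[:, a] += lr * tree.predict(states)
    return f

def softmax(f):
    e = np.exp(f - f.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def reward(states, actions):
    # Toy one-step task: action 1 is correct when the state is positive.
    return ((states[:, 0] > 0).astype(int) == actions).astype(float)

trees = {a: [] for a in range(N_ACTIONS)}
for it in range(30):
    states = rng.uniform(-1, 1, size=(256, 1))
    probs = softmax(policy_scores(trees, states))
    actions = np.array([rng.choice(N_ACTIONS, p=p) for p in probs])
    r = reward(states, actions)
    # Estimated functional gradient of expected reward at the sampled points:
    # grad wrt F(s, a) is proportional to R * (1[a taken] - pi(a|s)).
    for a in range(N_ACTIONS):
        g = r * ((actions == a).astype(float) - probs[:, a])
        h = DecisionTreeRegressor(max_depth=3, random_state=0).fit(states, g)
        trees[a].append((0.5, h))  # learning rate 0.5 is an assumption

# Greedy accuracy of the boosted policy on held-out states.
test_states = rng.uniform(-1, 1, size=(1000, 1))
greedy = policy_scores(trees, test_states).argmax(axis=1)
acc = reward(test_states, greedy).mean()
```

The nonparametric ensemble grows by one tree per action per round, so no fixed parametric policy class or hand-crafted features are needed, which is the property the abstract emphasizes.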

Authors

Keywords

  • Policy gradient
  • Boosting
  • Nonparametric model

Context

Venue
International Conference on Autonomous Agents and Multiagent Systems
Archive span
2002-2025
Indexed papers
7403
Paper id
221865722652559729