
AAMAS 2016

Boosting Nonparametric Policies

Conference Paper · Learning II · Autonomous Agents and Multiagent Systems

Abstract

Learning complex policies is a key step toward real-world applications of reinforcement learning. While boosting has been widely used in state-of-the-art supervised learning techniques to adaptively learn nonparametric functions, boosting-style approaches have received little attention in reinforcement learning. Only a few previous works have explored this direction, and their theoretical properties remain unclear while their empirical performance is quite limited. In this paper, we propose the PolicyBoost method. It optimizes a finite-sample objective function, whose maximization leads to maximization of the expected total reward, by employing the GradientBoost approach. Experimental results verify the effectiveness as well as the robustness of PolicyBoost, even without feature engineering.
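To make the idea concrete, the following is a minimal sketch of a boosting-style, REINFORCE-flavoured policy learner on a toy one-step task: the policy score is a nonparametric sum of regression trees, and each round fits a new tree to an estimated functional gradient of the expected reward. All names, the learning rate, and the toy task are assumptions for illustration; this is not the paper's PolicyBoost algorithm.

```python
# Sketch only: boosting a nonparametric policy score F(s, a) with regression
# trees, in the spirit of GradientBoost; details are assumptions, not the
# paper's method.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
N_ACTIONS = 2

def policy_scores(trees, states):
    """F(s, a): a learning-rate-weighted sum of regression trees per action."""
    f = np.zeros((len(states), N_ACTIONS))
    for a in range(N_ACTIONS):
        for lr, tree in trees[a]:
            f[:, a] += lr * tree.predict(states)
    return f

def softmax(f):
    e = np.exp(f - f.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def reward(states, actions):
    # Toy one-step task: action 1 is correct when the state is positive.
    return ((states[:, 0] > 0).astype(int) == actions).astype(float)

trees = {a: [] for a in range(N_ACTIONS)}
for it in range(30):
    states = rng.uniform(-1, 1, size=(256, 1))
    probs = softmax(policy_scores(trees, states))
    actions = np.array([rng.choice(N_ACTIONS, p=p) for p in probs])
    r = reward(states, actions)
    # Estimated functional gradient of expected reward at the sampled points:
    # grad wrt F(s, a) is proportional to R * (1[a taken] - pi(a|s)).
    for a in range(N_ACTIONS):
        g = r * ((actions == a).astype(float) - probs[:, a])
        h = DecisionTreeRegressor(max_depth=3, random_state=0).fit(states, g)
        trees[a].append((0.5, h))  # learning rate 0.5 is an assumption

# Greedy accuracy of the boosted policy on held-out states.
test_states = rng.uniform(-1, 1, size=(1000, 1))
greedy = policy_scores(trees, test_states).argmax(axis=1)
acc = reward(test_states, greedy).mean()
```

The nonparametric ensemble grows by one tree per action per round, so no fixed parametric policy class or hand-crafted features are needed, which is the property the abstract emphasizes.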

Authors

Keywords

  • Policy gradient
  • Boosting
  • Nonparametric model

Context

Venue
International Conference on Autonomous Agents and Multiagent Systems
Archive span
2002-2025
Indexed papers
7403
Paper id
221865722652559729