Reinforcement Learning with Derivative-Free Exploration

Xiong-Hui Chen; Yang Yu


AAMAS 2019


Conference Paper Extended Abstracts Autonomous Agents and Multiagent Systems


Abstract

Effective exploration is key to sample-efficient reinforcement learning. The most popular general-purpose exploration approaches (e.g., ϵ-greedy) remain inefficient, whereas derivative-free optimization has developed efficient exploration mechanisms for global search, which reinforcement learning often needs. In this paper, we introduce derivative-free exploration (DFE), a general and efficient exploration method for early-stage reinforcement learning. DFE overcomes the optimization inefficiency and poor scalability of reinforcement learning methods based purely on derivative-free optimization. We evaluate DFE by using it to explore trajectories within the deterministic off-policy method DDPG and the stochastic off-policy method ACER, on Atari and MuJoCo, which represent a high-dimensional discrete-action environment and a continuous control environment, respectively. The experiments show that DFE is an efficient and general exploration method.
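
To make the general idea concrete, the following is a minimal sketch of how a derivative-free search over policy parameters could generate exploration trajectories for an off-policy learner. It is not the DFE algorithm from the paper: the toy one-dimensional task, the clipped linear policy, the simple hill-climbing random search, and all names (`rollout`, `derivative_free_exploration`, `replay_buffer`) are assumptions chosen for illustration.

```python
import random


def rollout(theta, horizon=20):
    """Run one episode of a toy 1-D task with a linear policy a = theta * s.

    Hypothetical environment: the state starts at 1.0 and the reward
    penalizes squared distance from the origin.
    Returns the episode return and the list of (s, a, r, s_next) transitions.
    """
    s, total, transitions = 1.0, 0.0, []
    for _ in range(horizon):
        a = max(-1.0, min(1.0, theta * s))  # clipped linear policy
        s_next = s + 0.5 * a                # toy deterministic dynamics
        r = -s_next * s_next                # reward: stay near the origin
        transitions.append((s, a, r, s_next))
        total += r
        s = s_next
    return total, transitions


def derivative_free_exploration(iterations=30, pop_size=8, sigma=0.5, seed=0):
    """Hill-climbing random search over policy parameters (an assumed
    stand-in for a derivative-free optimizer), logging every transition."""
    rng = random.Random(seed)
    replay_buffer = []
    best_theta = 0.0
    best_return, transitions = rollout(best_theta)
    replay_buffer.extend(transitions)
    for _ in range(iterations):
        for _ in range(pop_size):
            # Sample a candidate policy near the current best; no gradients used.
            candidate = best_theta + rng.gauss(0.0, sigma)
            ret, transitions = rollout(candidate)
            # Every explored trajectory is kept, so an off-policy learner
            # could later train on data from all candidates.
            replay_buffer.extend(transitions)
            if ret > best_return:
                best_theta, best_return = candidate, ret
    return best_theta, best_return, replay_buffer


if __name__ == "__main__":
    theta, ret, buf = derivative_free_exploration()
    print(f"best theta={theta:.3f}, return={ret:.3f}, transitions={len(buf)}")
```

In this sketch the search only decides which policies collect data; an off-policy learner such as DDPG or ACER would consume `replay_buffer` separately, which is one plausible reading of "exploring trajectories" in the abstract rather than a confirmed detail of the method.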


Keywords

reinforcement learning
derivative-free optimization
exploration

Context

Venue: International Conference on Autonomous Agents and Multiagent Systems
Paper id: 704038236546397306