Can Learned Optimization Make Reinforcement Learning Less Difficult?

Alexander D. Goldie; Chris Lu; Matthew T. Jackson; Shimon Whiteson; Jakob N. Foerster

doi:10.52202/079017-0177

NeurIPS 2024

Conference Paper · Main Conference Track · Artificial Intelligence · Machine Learning

Abstract

While reinforcement learning (RL) holds great potential for decision making in the real world, it suffers from a number of unique difficulties which often need specific consideration. In particular: it is highly non-stationary; suffers from high degrees of plasticity loss; and requires exploration to prevent premature convergence to local optima and maximize return. In this paper, we consider whether learned optimization can help overcome these problems. Our method, Learned Optimization for Plasticity, Exploration and Non-stationarity (OPEN), meta-learns an update rule whose input features and output structure are informed by previously proposed solutions to these difficulties. We show that our parameterization is flexible enough to enable meta-learning in diverse learning contexts, including the ability to use stochasticity for exploration. Our experiments demonstrate that when meta-trained on single and small sets of environments, OPEN outperforms or equals traditionally used optimizers. Furthermore, OPEN shows strong generalization characteristics across a range of environments and agent architectures.
