RLDM 2019
Learning Powerful Policies by Using Consistent Dynamics Model
Abstract
Model-based Reinforcement Learning approaches hold the promise of being sample efficient. Much of the progress in learning dynamics models in RL has come from training models via supervised learning. There is ample evidence that humans build a model of the environment not only by observing it but also by interacting with it. Interaction allows humans to carry out "experiments": taking actions that help uncover true causal relationships, which can be used to build better dynamics models. Analogously, we would expect such interactions to help a learning agent model the environment dynamics. In this paper, we build on this intuition by using an auxiliary cost function to ensure consistency between what the agent observes (by acting in the real world) and what it imagines (by acting in the "learned" world). We consider several tasks, MuJoCo-based control tasks and Atari games, and show that the proposed approach helps to train more powerful policies and better dynamics models.
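The sketch below illustrates the core idea of the consistency auxiliary loss described in the abstract: roll the learned dynamics model forward in imagination and penalize divergence from the states actually observed in the environment. This is a minimal illustration under assumed architectures, not the authors' exact implementation; the module names (Encoder, DynamicsModel), network sizes, and the stop-gradient on the observed target are all illustrative assumptions.

```python
# Minimal sketch of a consistency auxiliary loss between observed and
# imagined trajectories. All names and architectures are assumptions,
# not the paper's exact implementation.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps raw observations to latent states (assumed architecture)."""
    def __init__(self, obs_dim: int, latent_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                 nn.Linear(128, latent_dim))

    def forward(self, obs):
        return self.net(obs)

class DynamicsModel(nn.Module):
    """Predicts the next latent state from the current latent state and action."""
    def __init__(self, latent_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim + action_dim, 128), nn.ReLU(),
                                 nn.Linear(128, latent_dim))

    def forward(self, z, a):
        return self.net(torch.cat([z, a], dim=-1))

def consistency_loss(encoder, dynamics, observations, actions):
    """observations: (T+1, batch, obs_dim); actions: (T, batch, action_dim).

    Imagines a T-step rollout from the first observed state and compares
    each imagined latent state with the encoding of the corresponding
    observed state (an L2 consistency penalty).
    """
    z = encoder(observations[0])  # start imagination from the real state
    loss = 0.0
    for t in range(actions.shape[0]):
        z = dynamics(z, actions[t])                    # imagined next latent state
        z_obs = encoder(observations[t + 1]).detach()  # observed latent target
        loss = loss + ((z - z_obs) ** 2).mean()
    return loss / actions.shape[0]
```

In training, this term would presumably be weighted by a coefficient and added to the usual policy and model-learning objectives; detaching the observed target is one design choice that keeps the penalty from collapsing the encoder toward trivial representations.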
Authors
Keywords
No keywords are indexed for this paper.
Context
- Venue: Multidisciplinary Conference on Reinforcement Learning and Decision Making
- Archive span: 2013-2025
- Indexed papers: 1004
- Paper id: 247371459653723058