Optimism-driven exploration for nonlinear systems

Teodor Mihai Moldovan; Sergey Levine; Michael I. Jordan; Pieter Abbeel

ICRA 2015

Conference Paper · Accepted Paper · Artificial Intelligence · Robotics

Details

Abstract

Tasks with unknown dynamics and costly system interaction time present a serious challenge for reinforcement learning. If a model of the dynamics can be learned quickly, interaction time can be reduced substantially. We show that combining an optimistic exploration strategy with model-predictive control can achieve very good sample complexity for a range of nonlinear systems. Our method learns a Dirichlet process mixture of linear models using an exploration strategy based on optimism in the face of uncertainty. Trajectory optimization is used to plan paths in the learned model that both minimize the cost and perform exploration. Experimental results show that our approach achieves some of the most sample-efficient learning rates on several benchmark problems, and is able to successfully learn to control a simulated helicopter during hover and autorotation with only seconds of interaction time. The computational requirements are substantial.
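
The idea of optimism in the face of uncertainty described in the abstract can be illustrated with a deliberately small sketch. It is not the paper's method: it swaps the Dirichlet process mixture of linear models for a single ridge-regression model, and swaps trajectory optimization for a one-step search over a grid of candidate actions. The scalar system `toy_step`, the function names, and the parameters `beta`, `reg`, and `u_weight` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_linear_model(X, Y, reg=1e-3):
    """Ridge fit of x_next ~ theta @ [x, u]; returns the estimate and a covariance proxy."""
    A = X.T @ X + reg * np.eye(X.shape[1])
    cov = np.linalg.inv(A)          # large where little data has been seen
    theta = cov @ X.T @ Y
    return theta, cov


def optimistic_action(x, theta, cov, goal, candidates, beta=1.0, u_weight=1e-3):
    """Pick the action whose most favorable plausible outcome has the lowest cost."""
    best_u, best_cost = None, np.inf
    for u in candidates:
        z = np.array([x, u])
        mean = z @ theta
        width = beta * np.sqrt(max(z @ cov @ z, 0.0))
        # Optimism: assume the next state is the point of the plausible
        # interval [mean - width, mean + width] that lies closest to the goal.
        optimistic_next = np.clip(goal, mean - width, mean + width)
        cost = (optimistic_next - goal) ** 2 + u_weight * u ** 2
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u


def toy_step(x, u):
    """Hypothetical true dynamics, unknown to the learner (assumption for this sketch)."""
    return 0.9 * x + 0.5 * u


def run_episode(steps=30, goal=1.0):
    """Receding-horizon loop: refit the model, act optimistically, record the transition."""
    candidates = np.linspace(-2.0, 2.0, 41)
    X, Y = np.zeros((0, 2)), np.zeros(0)
    x, states = 0.0, [0.0]
    for _ in range(steps):
        theta, cov = fit_linear_model(X, Y)
        u = optimistic_action(x, theta, cov, goal, candidates)
        x_next = toy_step(x, u)
        X = np.vstack([X, [x, u]])   # store the observed (state, action) pair
        Y = np.append(Y, x_next)     # and the state it led to
        x = x_next
        states.append(x)
    return np.array(states)
```

Calling `run_episode()` returns the visited states. While the model is uncertain, the plausible interval is wide, so untried actions look cheap and get explored. Where data has accumulated, the interval narrows and the choice approaches the greedy one under the fitted model.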

Keywords

Heuristic algorithms
Trajectory
Optimization
Computational modeling
Uncertainty
Least squares approximations
Nonlinear Systems
Optimism
Learning Models
Dynamic Model
Time Interaction
Mixture Model
Computational Requirements
Time In Seconds
Model Predictive Control
Trajectory Optimization
Exploration Strategy
Benchmark Problems
Unknown Dynamics
Dirichlet Process
Dirichlet Process Mixture
Optimization Problem
Computation Time
Cost Function
Optimal Control
Change Model
Model Predictive Control Algorithm
Pseudospectral Method
Model Uncertainty
Model-based Reinforcement Learning
Variational Inference
Adaptive Control
Goal State
True Dynamics
Gaussian Process
Continuous System

Context

Venue: IEEE International Conference on Robotics and Automation
Archive span: 1984-2025
Indexed papers: 30179
Paper id: 1024347781925580098