ICRA 2015

Optimism-driven exploration for nonlinear systems

Conference Paper Accepted Paper Artificial Intelligence · Robotics

Abstract

Tasks with unknown dynamics and costly system interaction time present a serious challenge for reinforcement learning. If a model of the dynamics can be learned quickly, interaction time can be reduced substantially. We show that combining an optimistic exploration strategy with model-predictive control can achieve very good sample complexity for a range of nonlinear systems. Our method learns a Dirichlet process mixture of linear models using an exploration strategy based on optimism in the face of uncertainty. Trajectory optimization is used to plan paths in the learned model that both minimize the cost and perform exploration. Experimental results show that our approach achieves some of the most sample-efficient learning rates on several benchmark problems, and is able to successfully learn to control a simulated helicopter during hover and autorotation with only seconds of interaction time. The computational requirements are substantial.
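
As a rough illustration of optimism in the face of uncertainty combined with model-based control, the sketch below uses a much simpler setup than the one the abstract describes. It assumes a single Bayesian linear model of a scalar system (not a Dirichlet process mixture) and a one-step lookahead over a grid of candidate controls (not trajectory optimization over a horizon). The names `BayesLinearModel`, `optimistic_action`, `simulate`, and the parameters `beta`, `ctrl_cost`, `true_a`, and `true_b` are hypothetical and do not come from the paper.

```python
import numpy as np


class BayesLinearModel:
    """Bayesian linear regression for x_next = theta . [x, u] (illustrative stand-in)."""

    def __init__(self, dim=2, prior_var=10.0, noise_var=0.01):
        self.precision = np.eye(dim) / prior_var  # posterior precision matrix
        self.b = np.zeros(dim)                    # precision-weighted mean
        self.noise_var = noise_var

    def update(self, z, y):
        """Condition on one observed transition: features z, next state y."""
        self.precision += np.outer(z, z) / self.noise_var
        self.b += z * y / self.noise_var

    def mean(self):
        """Posterior mean of the parameter vector theta."""
        return np.linalg.solve(self.precision, self.b)

    def predict(self, z):
        """Predicted next state and the standard deviation due to model uncertainty."""
        cov = np.linalg.inv(self.precision)
        return float(z @ (cov @ self.b)), float(np.sqrt(z @ cov @ z))


def optimistic_action(model, x, goal=0.0, beta=2.0, ctrl_cost=0.01,
                      candidates=np.linspace(-1.0, 1.0, 21)):
    """Pick the control whose most favorable plausible outcome is cheapest."""
    best_u, best_cost = None, np.inf
    for u in candidates:
        m, s = model.predict(np.array([x, u]))
        # Optimism: assume the next state lands at the point of the
        # confidence interval [m - beta*s, m + beta*s] closest to the goal.
        x_opt = np.clip(goal, m - beta * s, m + beta * s)
        cost = (x_opt - goal) ** 2 + ctrl_cost * u ** 2
        if cost < best_cost:
            best_u, best_cost = u, cost
    return float(best_u)


def simulate(steps=20, x0=0.3, true_a=1.1, true_b=0.5):
    """Closed loop on a noise-free scalar system with dynamics unknown to the agent."""
    model, x, xs = BayesLinearModel(), x0, [x0]
    for _ in range(steps):
        u = optimistic_action(model, x)
        x_next = true_a * x + true_b * u          # true system response
        model.update(np.array([x, u]), x_next)    # learn from the transition
        x = x_next
        xs.append(x)
    return xs, model


if __name__ == "__main__":
    states, model = simulate()
    print("estimated [a, b]:", model.mean())
    print("final state:", states[-1])
```

In this sketch, controls whose outcome is still uncertain look attractive because their confidence interval may contain the goal, so the controller tries them and the resulting data shrink the uncertainty. Once the model is well determined, the optimistic cost approaches the predicted cost and the controller behaves greedily with respect to the learned model.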

Keywords

  • Heuristic algorithms
  • Trajectory
  • Optimization
  • Computational modeling
  • Uncertainty
  • Least squares approximations
  • Nonlinear Systems
  • Optimism
  • Learning Models
  • Dynamic Model
  • Time Interaction
  • Mixture Model
  • Computational Requirements
  • Time In Seconds
  • Model Predictive Control
  • Trajectory Optimization
  • Exploration Strategy
  • Benchmark Problems
  • Unknown Dynamics
  • Dirichlet Process
  • Dirichlet Process Mixture
  • Optimization Problem
  • Computation Time
  • Cost Function
  • Optimal Control
  • Change Model
  • Model Predictive Control Algorithm
  • Pseudospectral Method
  • Model Uncertainty
  • Model-based Reinforcement Learning
  • Variational Inference
  • Adaptive Control
  • Goal State
  • True Dynamics
  • Gaussian Process
  • Continuous System

Context

Venue
IEEE International Conference on Robotics and Automation
Archive span
1984-2025
Indexed papers
30179
Paper id
1024347781925580098