
IROS 2015

Learning compound multi-step controllers under unknown dynamics

Conference Paper · Accepted Paper · Artificial Intelligence · Robotics

Abstract

Applications of reinforcement learning to robotic manipulation often assume an episodic setting. However, controllers trained with reinforcement learning are frequently situated in the context of a more complex compound task, where multiple controllers might be invoked in sequence to accomplish a higher-level goal. Furthermore, training such controllers typically requires resetting the environment between episodes, a step that is usually handled manually. We describe an approach for training chains of controllers with reinforcement learning. This requires taking into account the state distributions induced by preceding controllers in the chain, as well as automatically training reset controllers that can reset the task between episodes. Because the initial state of each controller is determined by the controller that precedes it, the result is a non-stationary learning problem. We demonstrate that a recently developed method that optimizes linear-Gaussian controllers under learned local linear models can tackle this sort of non-stationary problem, and that training controllers concurrently with a corresponding reset controller only minimally increases training time. We also demonstrate this method on a complex tool use task that consists of seven stages and requires using a toy wrench to screw in a bolt. This compound task requires grasping and handling complex contact dynamics. After training, the controllers can execute the entire task quickly and efficiently. Finally, we show that this method can be combined with guided policy search to automatically train nonlinear neural network controllers for a grasping task with considerable variation in target position.

Authors

Keywords

  • Heuristic algorithms
  • Training
  • Learning (artificial intelligence)
  • Robots
  • Compounds
  • Trajectory
  • Neural networks
  • Unknown Dynamics
  • Neural Network
  • Variable Positions
  • Robot Manipulator
  • Nonlinear Network
  • Neural Network Control
  • Application Of Reinforcement Learning
  • Policy Search
  • Nonlinear Neural Networks
  • Learning Algorithms
  • Dynamic Model
  • Cost Function
  • Optimal Control
  • Brownian Motion
  • Variety Of Tasks
  • Control Sequence
  • Gaussian Mixture Model
  • Manipulation Tasks
  • Single Control
  • Model-based Reinforcement Learning
  • Point In Frame
  • Constrained Problem
  • Learning Control
  • Problem For Equation
  • Constrained Optimization
  • Distribution Of Trajectories
  • Time-varying Dynamics
  • Linear Dynamics
  • Time-varying Control

Context

Venue
IEEE/RSJ International Conference on Intelligent Robots and Systems
Archive span
1988-2025
Indexed papers
26578
Paper id
1119547750969647331