Dirichlet Process Reinforcement Learning

Teodor Mihai Moldovan; Michael Jordan; Pieter Abbeel

RLDM 2013

Conference abstract (accepted) · Artificial Intelligence · Decision Making · Machine Learning · Reinforcement Learning

Abstract

We consider the problem of model-based reinforcement learning in continuous state-action spaces. The key ingredients of our algorithm are: (i) To model the (initially unknown) dynamics, our algorithm uses a Dirichlet process mixture of linear models. We present a novel method for on-line inference with this model, which enables continuous learning at high frequency. (ii) To address the exploration-exploitation trade-off, we describe how to adapt BOLT [1], an established method for discrete systems, to make it practical for continuous systems. (iii) Efficient control is made possible by recent advances in sequential quadratic programming (SQP). Our algorithm is highly automated, requiring only two scalar parameters, and it is designed for parallel computation. Experiments show that it can solve the classical cartpole and under-actuated pendulum swing-up tasks, as well as a new task in which a helicopter performs a 180-degree flip followed by an inverted hover, with minimal re-tuning of these two parameters for each new system.
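To make ingredient (i) more concrete, the following is a minimal textbook-style sketch of how a Dirichlet process mixture of linear models can assign an observed transition to a local linear model. It is not the on-line inference method of the paper, which the abstract does not detail. The scalar state and control, the function names, the concentration parameter `alpha`, and the broad "new component" predictive with variance `new_var` are all assumptions made for illustration.

```python
import math


def crp_weights(counts, alpha):
    """Chinese restaurant process prior: probability of joining each existing
    component (proportional to its count) or opening a new one (proportional
    to the concentration alpha). The new-component weight is the last entry."""
    total = sum(counts) + alpha
    return [c / total for c in counts] + [alpha / total]


def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def responsibilities(x, u, x_next, components, counts, alpha, new_var=10.0):
    """Posterior over which local linear model explains (x, u) -> x_next.

    Each component is a hypothetical tuple (a, b, c, var) meaning
    x_next ~ Normal(a * x + b * u + c, var). A not-yet-created component is
    approximated (assumption) by a broad predictive Normal(x, new_var).
    """
    prior = crp_weights(counts, alpha)
    lik = [gaussian_pdf(x_next, a * x + b * u + c, var)
           for a, b, c, var in components]
    lik.append(gaussian_pdf(x_next, x, new_var))
    unnorm = [p * l for p, l in zip(prior, lik)]
    z = sum(unnorm)
    return [w / z for w in unnorm]


# Two hypothetical local models of scalar dynamics, with 3 and 1 points assigned.
components = [(1.0, 0.1, 0.0, 0.01), (0.5, -0.2, 1.0, 0.01)]
counts = [3, 1]
resp = responsibilities(x=1.0, u=0.0, x_next=1.0,
                        components=components, counts=counts, alpha=1.0)
print(resp)  # three probabilities: component 0, component 1, new component
```

In this sketch the transition is explained almost entirely by the first component, because its linear prediction matches the observed next state and it already holds most of the data. A transition far from every existing prediction would instead shift mass toward the new-component entry, which is how such a mixture can grow its number of local models with the data.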

Context

Venue: Multidisciplinary Conference on Reinforcement Learning and Decision Making