
AAMAS 2019

PLOTS: Procedure Learning from Observations using subTask Structure

Conference Paper, Session 4A: Learning Agent Capabilities, Autonomous Agents and Multiagent Systems

Abstract

In many cases an intelligent agent may want to learn to mimic a single observed demonstration trajectory. In this work we consider how to perform such procedural learning from observation, which could help agents make better use of the enormous amount of video data consisting of observation sequences. Our approach exploits the properties of this setting to incrementally build an open-loop action plan that yields the desired observation sequence, and it applies in both Markov and partially observable Markov domains. In addition, procedures commonly involve repeated, temporally extended action subsequences, so our method explores actions optimistically to leverage potential repeated structure in the procedure. Compared with state-of-the-art approaches, we find that our explicit procedural learning from observation method is about 100 times faster than policy-gradient approaches that learn a stochastic policy, and it is faster than model-based approaches as well. We also find that optimistic action selection yields substantial speedups when latent dynamical structure is present.
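The idea of incrementally extending an open-loop plan, biased toward previously successful actions, can be sketched roughly as follows. This is not the authors' PLOTS algorithm; the `reset()`/`step()` environment interface, the fixed action set, and the exact-match acceptance test are all illustrative assumptions for a deterministic toy setting.

```python
# Sketch: incrementally build an open-loop action plan that reproduces a
# target observation sequence, trying previously successful actions first
# (an "optimistic" bias toward repeated subsequence structure).
# Assumes a deterministic env with reset() -> obs and step(action) -> obs.
from collections import Counter

def build_open_loop_plan(env, actions, target_obs):
    plan = []
    action_counts = Counter()  # how often each action extended the match

    def replay(prefix):
        # Re-execute the committed plan prefix from the initial state.
        env.reset()
        for a in prefix:
            env.step(a)

    for goal in target_obs:
        # Try actions most-used-so-far first: if the procedure repeats
        # subsequences, these candidates are likely to match again.
        ordered = sorted(actions, key=lambda a: -action_counts[a])
        for a in ordered:
            replay(plan)
            if env.step(a) == goal:
                plan.append(a)
                action_counts[a] += 1
                break
        else:
            return None  # no action reproduces the next observation
    return plan
```

Replaying the prefix before each candidate keeps the search open-loop: the plan is a fixed action sequence, with no state-dependent policy learned along the way.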

Authors

Keywords

  • Reinforcement Learning
  • Learning from Demonstration
  • Behavior Cloning
  • Hierarchy

Context

Venue
International Conference on Autonomous Agents and Multiagent Systems
Archive span
2002-2025
Indexed papers
7403
Paper id
777939828085410656