Arrow Research search

Author name cluster

Matthew Landen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

2 papers
1 author row

Possible papers (2)

AAAI 2020 (Conference Paper)

Planning with Abstract Learned Models While Learning Transferable Subtasks

  • John Winder
  • Stephanie Milani
  • Matthew Landen
  • Erebus Oh
  • Shane Parr
  • Shawn Squire
  • Marie desJardins
  • Cynthia Matuszek

We introduce an algorithm for model-based hierarchical reinforcement learning to acquire self-contained transition and reward models suitable for probabilistic planning at multiple levels of abstraction. We call this framework Planning with Abstract Learned Models (PALM). By representing subtasks symbolically using a new formal structure, the lifted abstract Markov decision process (L-AMDP), PALM learns models that are independent and modular. Through our experiments, we show how PALM integrates planning and execution, facilitating rapid and efficient learning of abstract, hierarchical models. We also demonstrate the increased potential for learned models to be transferred to new and related tasks.

RLDM 2017 (Conference Abstract)

R-AMDP: Model-Based Learning for Abstract Markov Decision Process Hierarchies

  • Shawn Squire
  • John Winder
  • Matthew Landen
  • Stephanie Milani

Decision-making agents face immensely challenging planning problems when operating in large environments to solve complex tasks. A hierarchy of abstract Markov decision processes (AMDPs) provides a framework for decomposing such problems into distinct, related subtasks. AMDP hierarchies grant considerable speedup over related recursively and hierarchically optimal methods such as MAXQ and options. Each AMDP serves as a subgoal, and each is itself a planning problem with a local model and state space abstracted from a ground MDP. Agents are able to plan more efficiently by using a reduced state space at the appropriate level of abstraction; however, they require their subtask models to be specified by a human expert. We describe an approach for automating model estimation by combining the R-Max algorithm with AMDPs. We compare the resulting structures, R-AMDPs, with a similar approach, RMAXQ, and motivate its advantages. Ultimately, R-AMDPs represent the first step in learning AMDP hierarchies dynamically, completely from an agent's experience.
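The R-Max-style model estimation the abstract refers to can be sketched in tabular form. This is a generic illustration of the standard R-Max bookkeeping, not code from the paper; all names (`RMaxModel`, `m`, `r_max`) are illustrative, and the integration with AMDP subtask hierarchies is omitted:

```python
from collections import defaultdict

class RMaxModel:
    """Tabular R-Max model estimation: a (state, action) pair becomes
    'known' after being tried m times; until then the agent treats it
    optimistically (as if it yielded r_max), which drives exploration."""

    def __init__(self, m=5, r_max=1.0):
        self.m = m
        self.r_max = r_max
        self.counts = defaultdict(int)                       # n(s, a)
        self.trans = defaultdict(lambda: defaultdict(int))   # n(s, a, s')
        self.reward_sum = defaultdict(float)                 # total reward for (s, a)

    def update(self, s, a, r, s_next):
        # Stop accumulating once a pair is known; its model is frozen.
        if self.counts[(s, a)] < self.m:
            self.counts[(s, a)] += 1
            self.reward_sum[(s, a)] += r
            self.trans[(s, a)][s_next] += 1

    def known(self, s, a):
        return self.counts[(s, a)] >= self.m

    def model(self, s, a):
        """Empirical (expected reward, {s': probability}) for a known pair."""
        n = self.counts[(s, a)]
        probs = {s2: c / n for s2, c in self.trans[(s, a)].items()}
        return self.reward_sum[(s, a)] / n, probs
```

In the paper's setting, one such estimator would be maintained per AMDP subtask over its abstracted local state space, so each subtask's transition and reward model is learned from the agent's own experience rather than specified by a human expert.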