Arrow Research search

Author name cluster

Shawn Squire

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers
2 author rows

Possible papers

6

AAAI Conference 2020 Conference Paper

Planning with Abstract Learned Models While Learning Transferable Subtasks

  • John Winder
  • Stephanie Milani
  • Matthew Landen
  • Erebus Oh
  • Shane Parr
  • Shawn Squire
  • Marie desJardins
  • Cynthia Matuszek

We introduce an algorithm for model-based hierarchical reinforcement learning to acquire self-contained transition and reward models suitable for probabilistic planning at multiple levels of abstraction. We call this framework Planning with Abstract Learned Models (PALM). By representing subtasks symbolically using a new formal structure, the lifted abstract Markov decision process (L-AMDP), PALM learns models that are independent and modular. Through our experiments, we show how PALM integrates planning and execution, facilitating rapid and efficient learning of abstract, hierarchical models. We also demonstrate the increased potential for learned models to be transferred to new and related tasks.

ICAPS Conference 2017 Conference Paper

Planning with Abstract Markov Decision Processes

  • Nakul Gopalan
  • Marie desJardins
  • Michael L. Littman
  • James MacGlashan
  • Shawn Squire
  • Stefanie Tellex
  • John Winder
  • Lawson L. S. Wong

Robots acting in human-scale environments must plan under uncertainty in large state–action spaces and face constantly changing reward functions as requirements and goals change. Planning under uncertainty in large state–action spaces requires hierarchical abstraction for efficient computation. We introduce a new hierarchical planning framework called Abstract Markov Decision Processes (AMDPs) that can plan in a fraction of the time needed for complex decision making in ordinary MDPs. AMDPs provide abstract states, actions, and transition dynamics in multiple layers above a base-level “flat” MDP. AMDPs decompose problems into a series of subtasks with both local reward and local transition functions used to create policies for subtasks. The resulting hierarchical planning method is independently optimal at each level of abstraction, and is recursively optimal when the local reward and transition functions are correct. We present empirical results showing significantly improved planning speed, while maintaining solution quality, in the Taxi domain and in a mobile-manipulation robotics problem. Furthermore, our approach allows specification of a decision-making model for a mobile-manipulation problem on a Turtlebot, spanning from low-level control actions operating on continuous variables all the way up through high-level object manipulation tasks.
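As a rough sketch of the hierarchy this abstract describes, the code below models each AMDP level as a node whose policy emits abstract actions; an abstract action either bottoms out in a primitive action or recurses into a child subtask. All names (`AMDPNode`, `execute`, the toy navigate-then-pickup task) are illustrative assumptions, not the paper's formalism or implementation.

```python
# Hypothetical sketch of an AMDP hierarchy: each node plans over its own
# abstract state/action space; abstract actions expand into child subtasks.
class AMDPNode:
    def __init__(self, name, policy, children=None):
        self.name = name
        self.policy = policy            # abstract state -> abstract action
        self.children = children or {}  # abstract action -> child AMDPNode

    def execute(self, state, is_terminal, step):
        """Expand abstract actions recursively until primitives run."""
        trace = []
        while not is_terminal(self.name, state):
            action = self.policy(state)
            child = self.children.get(action)
            if child is None:                       # primitive action
                state = step(state, action)
                trace.append(action)
            else:                                   # delegate to subtask
                state, sub_trace = child.execute(state, is_terminal, step)
                trace.extend(sub_trace)
        return state, trace

# Toy two-level task: navigate to cell 3, then pick up an item.
def step(state, action):
    pos, has_item = state
    if action == "right":
        return pos + 1, has_item
    if action == "left":
        return pos - 1, has_item
    if action == "pickup":
        return pos, True
    raise ValueError(action)

def is_terminal(name, state):
    pos, has_item = state
    return has_item if name == "root" else pos == 3

goto3 = AMDPNode("goto3", lambda s: "right" if s[0] < 3 else "left")
root = AMDPNode(
    "root",
    lambda s: "goto3" if s[0] != 3 else "pickup",
    children={"goto3": goto3},
)

final_state, trace = root.execute((0, False), is_terminal, step)
```

The point of the structure is that the root never reasons about individual moves: it plans over the abstract action `goto3`, and only the child expands that into primitives.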

RLDM Conference 2017 Conference Abstract

Planning with Abstract Markov Decision Processes

  • Nakul Gopalan
  • Michael Littman
  • Shawn Squire
  • Stefanie Tellex
  • John Winder
  • Lawson Wong

Robots acting in human-scale environments must plan under uncertainty in large state–action spaces and face constantly changing reward functions as requirements and goals change. Planning under uncertainty in large state–action spaces requires hierarchical abstraction for efficient computation. We (Gopalan et al. 2017 In Press) introduce a new hierarchical planning framework called Abstract Markov Decision Processes (AMDPs) that can plan in a fraction of the time needed for complex decision making in ordinary MDPs. AMDPs provide abstract states, actions, and transition dynamics in multiple layers above a base-level “flat” MDP. AMDPs decompose problems into a series of subtasks with both local reward and local transition functions used to create policies for subtasks. The resulting hierarchical planning method is independently optimal at each level of abstraction, and is recursively optimal when the local reward and transition functions are correct. We present empirical results showing significantly improved planning speed, while maintaining solution quality, in the Taxi domain and in a mobile-manipulation robotics problem. Furthermore, our approach allows specification of a decision-making model for a mobile-manipulation problem on a Turtlebot, spanning from low-level control actions operating on continuous variables all the way up through high-level object manipulation tasks.

RLDM Conference 2017 Conference Abstract

R-AMDP: Model-Based Learning for Abstract Markov Decision Process Hierarchies

  • Shawn Squire
  • John Winder
  • Matthew Landen
  • Stephanie Milani

Decision-making agents face immensely challenging planning problems when operating in large environments to solve complex tasks. A hierarchy of abstract Markov decision processes (AMDPs) provides a framework for decomposing such problems into distinct, related subtasks. AMDP hierarchies grant considerable speedup over related recursively and hierarchically optimal methods such as MAXQ and options. Each AMDP serves as a subgoal, and each is itself a planning problem with a local model and state space abstracted from a ground MDP. Agents are able to plan more efficiently by using a reduced state space at the appropriate level of abstraction; however, they require their subtask models to be specified by a human expert. We describe an approach for automating model estimation by combining the R-Max algorithm with AMDPs. We compare the resulting structures, R-AMDPs, with a similar approach, RMAXQ, and motivate its advantages. Ultimately, R-AMDPs represent the first step in learning AMDP hierarchies dynamically, completely from an agent’s experience.
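The R-Max ingredient mentioned in this abstract can be sketched concretely: count visits to each state–action pair, and treat pairs seen fewer than `m` times optimistically so the agent is driven to explore them. The class below is a minimal, self-contained illustration of that bookkeeping for one subtask's local model; the names and thresholding are standard R-Max machinery, not the paper's R-AMDP code.

```python
# Illustrative R-Max-style model estimation for a single subtask:
# unknown (s, a) pairs report the optimistic reward r_max.
from collections import defaultdict

class RMaxModel:
    def __init__(self, r_max, m):
        self.r_max, self.m = r_max, m
        self.counts = defaultdict(int)        # (s, a) -> visit count
        self.reward_sum = defaultdict(float)  # (s, a) -> summed reward
        self.next_counts = defaultdict(lambda: defaultdict(int))

    def update(self, s, a, r, s_next):
        """Record one observed transition."""
        self.counts[(s, a)] += 1
        self.reward_sum[(s, a)] += r
        self.next_counts[(s, a)][s_next] += 1

    def known(self, s, a):
        return self.counts[(s, a)] >= self.m

    def reward(self, s, a):
        # Optimism under uncertainty: unknown pairs look maximally rewarding.
        if not self.known(s, a):
            return self.r_max
        return self.reward_sum[(s, a)] / self.counts[(s, a)]

    def transition(self, s, a):
        # Empirical next-state distribution once (s, a) is known.
        n = self.counts[(s, a)]
        return {s2: c / n for s2, c in self.next_counts[(s, a)].items()}

model = RMaxModel(r_max=1.0, m=2)
model.update("s0", "a", 0.0, "s1")
optimistic = model.reward("s0", "a")  # below threshold m, still optimistic
model.update("s0", "a", 0.0, "s1")
learned = model.reward("s0", "a")     # now known, empirical estimate
```

Plugging such a model into each AMDP node is the sense in which R-AMDPs replace hand-specified subtask models with learned ones.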

AAAI Conference 2016 Conference Paper

Abstracting Complex Domains Using Modular Object-Oriented Markov Decision Processes

  • Shawn Squire
  • Marie desJardins

We present an initial proposal for modular object-oriented MDPs, an extension of OO-MDPs that abstracts complex domains that are partially observable and stochastic with multiple goals. Modes reduce the curse of dimensionality by reducing the number of attributes, objects, and actions into only the features relevant for each goal. These modes may also be used as an abstracted domain to be transferred to other modes or to another domain.

IJCAI Conference 2015 Conference Paper

Portable Option Discovery for Automated Learning Transfer in Object-Oriented Markov Decision Processes

  • Nicholay Topin
  • Nicholas Haltmeyer
  • Shawn Squire
  • John Winder
  • Marie desJardins
  • James MacGlashan

We introduce a novel framework for option discovery and learning transfer in complex domains that are represented as object-oriented Markov decision processes (OO-MDPs) [Diuk et al., 2008]. Our framework, Portable Option Discovery (POD), extends existing option discovery methods, and enables transfer across related but different domains by providing an unsupervised method for finding a mapping between object-oriented domains with different state spaces. The framework also includes heuristic approaches for increasing the efficiency of the mapping process. We present the results of applying POD to Pickett and Barto’s [2002] PolicyBlocks and MacGlashan’s [2013] Option-Based Policy Transfer in two application domains. We show that our approach can discover options effectively, transfer options among different domains, and improve learning performance with low computational overhead.
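To give a feel for the cross-domain mapping problem this abstract mentions, the sketch below greedily pairs source-domain objects with target-domain objects that share the most attribute names. This scoring rule and the function name are illustrative assumptions only; POD's actual mapping procedure is unsupervised and more sophisticated.

```python
# Hypothetical greedy object matching between two OO-MDP domains:
# each object is described by the set of its attribute names.
def match_objects(source, target):
    """source/target: dict of object name -> set of attribute names."""
    mapping, used = {}, set()
    for s_name, s_attrs in source.items():
        best, best_score = None, -1
        for t_name, t_attrs in target.items():
            if t_name in used:
                continue
            score = len(s_attrs & t_attrs)  # shared attributes
            if score > best_score:
                best, best_score = t_name, score
        if best is not None:
            mapping[s_name] = best
            used.add(best)
    return mapping

mapping = match_objects(
    {"agent": {"x", "y"}, "block": {"x", "y", "color"}},
    {"robot": {"x", "y"}, "box": {"x", "y", "color"}},
)
```

A mapping of this kind is what lets an option learned over one domain's objects be re-grounded in another domain's state space.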