Author name cluster

John Vian

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers
1 author row

Possible papers

ICML Conference 2017 Conference Paper

Deep Decentralized Multi-task Multi-Agent Reinforcement Learning under Partial Observability

  • Shayegan Omidshafiei
  • Jason Pazis
  • Christopher Amato
  • Jonathan P. How
  • John Vian

Many real-world tasks involve multiple agents with partial observability and limited communication. Learning is challenging in these settings due to local viewpoints of agents, which perceive the world as non-stationary due to concurrently-exploring teammates. Approaches that learn specialized policies for individual tasks face problems when applied to the real world: not only do agents have to learn and store distinct policies for each task, but in practice identities of tasks are often non-observable, making these approaches inapplicable. This paper formalizes and addresses the problem of multi-task multi-agent reinforcement learning under partial observability. We introduce a decentralized single-task learning approach that is robust to concurrent interactions of teammates, and present an approach for distilling single-task policies into a unified policy that performs well across multiple related tasks, without explicit provision of task identity.

ICRA Conference 2017 Conference Paper

Scalable accelerated decentralized multi-robot policy search in continuous observation spaces

  • Shayegan Omidshafiei
  • Christopher Amato
  • Miao Liu 0001
  • Michael Everett
  • Jonathan P. How
  • John Vian

This paper presents the first ever approach for solving continuous-observation Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) and their semi-Markovian counterparts, Dec-POSMDPs. This contribution is especially important in robotics, where a vast number of sensors provide continuous observation data. A continuous-observation policy representation is introduced using Stochastic Kernel-based Finite State Automata (SK-FSAs). An SK-FSA search algorithm titled Entropy-based Policy Search using Continuous Kernel Observations (EPSCKO) is introduced and applied to the first ever continuous-observation Dec-POMDP/Dec-POSMDP domain, where it significantly outperforms state-of-the-art discrete approaches. This methodology is equally applicable to Dec-POMDPs and Dec-POSMDPs, though the empirical analysis presented focuses on Dec-POSMDPs due to their higher scalability. To improve convergence, an entropy injection policy search acceleration approach for both continuous and discrete observation cases is also developed and shown to improve convergence rates without degrading policy quality.

ICRA Conference 2017 Conference Paper

Semantic-level decentralized multi-robot decision-making using probabilistic macro-observations

  • Shayegan Omidshafiei
  • Shih-Yuan Liu
  • Michael Everett
  • Brett T. Lopez
  • Christopher Amato
  • Miao Liu 0001
  • Jonathan P. How
  • John Vian

Robust environment perception is essential for decision-making on robots operating in complex domains. Intelligent task execution requires principled treatment of uncertainty sources in a robot's observation model. This is important not only for low-level observations (e.g., accelerometer data), but also for high-level observations such as semantic object labels. This paper formalizes the concept of macro-observations in Decentralized Partially Observable Semi-Markov Decision Processes (Dec-POSMDPs), allowing scalable semantic-level multi-robot decision making. A hierarchical Bayesian approach is used to model noise statistics of low-level classifier outputs, while simultaneously allowing sharing of domain noise characteristics between classes. Classification accuracy of the proposed macro-observation scheme, called Hierarchical Bayesian Noise Inference (HBNI), is shown to exceed existing methods. The macro-observation scheme is then integrated into a Dec-POSMDP planner, with hardware experiments running onboard a team of dynamic quadrotors in a challenging domain where noise-agnostic filtering fails. To the best of our knowledge, this is the first demonstration of a real-time, convolutional neural net-based classification framework running fully onboard a team of quadrotors in a multi-robot decision-making domain.

ICRA Conference 2016 Conference Paper

Graph-based Cross Entropy method for solving multi-robot decentralized POMDPs

  • Shayegan Omidshafiei
  • Ali-Akbar Agha-Mohammadi
  • Christopher Amato
  • Shih-Yuan Liu
  • Jonathan P. How
  • John Vian

This paper introduces a probabilistic algorithm for multi-robot decision-making under uncertainty, which can be posed as a Decentralized Partially Observable Markov Decision Process (Dec-POMDP). Dec-POMDPs are inherently synchronous decision-making frameworks which require significant computational resources to be solved, making them infeasible for many real-world robotics applications. The Decentralized Partially Observable Semi-Markov Decision Process (Dec-POSMDP) was recently introduced as an extension of the Dec-POMDP that uses high-level macro-actions to allow large-scale, asynchronous decision-making. However, existing Dec-POSMDP solution methods have limited scalability or perform poorly as the problem size grows. This paper proposes a cross-entropy based Dec-POSMDP algorithm motivated by the combinatorial optimization literature. The algorithm is applied to a constrained package delivery domain, where it significantly outperforms existing Dec-POSMDP solution methods.

IROS Conference 2015 Conference Paper

Online heterogeneous multiagent learning under limited communication with applications to forest fire management

  • Nazim Kemal Ure
  • Shayegan Omidshafiei
  • Brett T. Lopez
  • Ali-Akbar Agha-Mohammadi
  • Jonathan P. How
  • John Vian

Many robotic missions require online estimation of the unknown state transition models associated with uncertainty that stems from mission dynamics. The learning problem is usually distributed among agents in multiagent scenarios, either due to the absence of a centralized processing unit or because of the large size of the joint learning problem. This paper addresses the problem of multiagent learning in the likely scenario that agents estimate different models from their measured data, but they can share information by communicating model parameters. Previous approaches either consider homogeneous scenarios or perform model transfer in an open-loop manner, which hinders the convergence rate. We develop a closed-loop multiagent learning algorithm, Collaborative Filtering-Decentralized Incremental Feature Dependency Discovery (CF-Dec-iFDD), which enables agents to learn and share models in heterogeneous scenarios. Each agent learns a linear function approximation of the actual model, and the number of features is increased incrementally to adjust model complexity based on the observed data. The agents obtain feedback from other agents on the model error reduction associated with the communicated features. Although this increases the communication cost of exchanging features, it improves the quality/utility of what is being exchanged, leading to an improved convergence rate. The approach is demonstrated in indoor hardware flight tests on a forest fire management scenario in which agents must learn the transition model of the fire spread depending on external factors such as wind and vegetation. It is shown that CF-Dec-iFDD has a superior convergence rate compared to the alternative approaches.

IROS Conference 2014 Conference Paper

Health aware stochastic planning for persistent package delivery missions using quadrotors

  • Ali-Akbar Agha-Mohammadi
  • Nazim Kemal Ure
  • Jonathan P. How
  • John Vian

In persistent missions, taking the system's health and capability degradation into account is essential for predicting and avoiding failures. The state space in health-aware planning problems is often a mixture of continuous vehicle-level and discrete mission-level states. This poses a particular challenge when the mission domain is partially observable, and it restricts the use of computationally expensive forward search methods. This paper presents a method that exploits a structure that exists in many health-aware planning problems and performs a two-layer planning scheme. The lower layer exploits the local linearization and Gaussian distribution assumption over vehicle-level states, while the higher layer maintains a non-Gaussian distribution over discrete mission-level variables. This two-layer planning scheme allows us to limit the expensive online forward search to the mission-level states, and thus predict the system's behavior over longer horizons. We demonstrate the performance of the method on a long-duration package delivery mission using a quadrotor in a partially observable domain in the presence of constraints and health/capability degradation.

ICRA Conference 2007 Conference Paper

The MIT Indoor Multi-Vehicle Flight Testbed

  • Mario J. Valenti
  • Brett Bethke
  • Daniel Dale
  • Adrian A. Frank
  • James S. McGrew
  • Spencer Ahrens
  • Jonathan P. How
  • John Vian

This paper and video present the components and flight tests of an indoor, multi-vehicle testbed that was developed to study long-duration UAV missions in a controlled environment. This testbed is designed to use real hardware to examine research questions related to single- and multi-vehicle health management, such as vehicle failures, refueling, and maintenance. The testbed has both aerial and ground vehicles that operate autonomously in a large, indoor flight test area and can be used to execute many different mission scenarios. The success of this testbed is largely related to our choice of vehicles, sensors, and the system's command and control architecture. The video presents flight test results from single- and multi-vehicle experiments over the past year.