Arrow Research search

Author name cluster

Maxime Bouton

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers
2 author rows

Possible papers

AAAI Conference 2020 Conference Paper

Point-Based Methods for Model Checking in Partially Observable Markov Decision Processes

  • Maxime Bouton
  • Jana Tumova
  • Mykel J. Kochenderfer

Autonomous systems are often required to operate in partially observable environments. They must reliably execute a specified objective even with incomplete information about the state of the environment. We propose a methodology to synthesize policies that satisfy a linear temporal logic formula in a partially observable Markov decision process (POMDP). By formulating a planning problem, we show how to use point-based value iteration methods to efficiently approximate the maximum probability of satisfying a desired logical formula and compute the associated belief state policy. We demonstrate that our method scales to large POMDP domains and provides strong bounds on the performance of the resulting policy.

JAAMAS Journal 2019 Journal Article

Decomposition methods with deep corrections for reinforcement learning

  • Maxime Bouton
  • Kyle D. Julian
  • Mykel J. Kochenderfer

Decomposition methods have been proposed to approximate solutions to large sequential decision-making problems. In contexts where an agent interacts with multiple entities, utility decomposition can be used to separate the global objective into local tasks considering each individual entity independently. An arbitrator is then responsible for combining the individual utilities and selecting an action in real time to solve the global problem. Although these techniques can perform well empirically, they rely on strong assumptions of independence between the local tasks and sacrifice the optimality of the global solution. This paper proposes an approach that improves upon such approximate solutions by learning a correction term represented by a neural network. We demonstrate this approach on a fisheries management problem where multiple boats must coordinate to maximize their catch over time as well as on a pedestrian avoidance problem for autonomous driving. In each problem, decomposition methods can scale to multiple boats or pedestrians by using strategies involving one entity. We verify empirically that the proposed correction method significantly improves the decomposition method and outperforms a policy trained on the full-scale problem without utility decomposition.
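
The arbitrator step described in this abstract can be illustrated with a minimal sketch. It assumes each entity (a boat or a pedestrian) already has a table of action values from a single-entity solution, and that the arbitrator combines them by summing or by taking the minimum. The function names, the action labels, and the choice of these two combination rules are assumptions made for illustration; the abstract does not specify them.

```python
from typing import Dict, List

# Hypothetical type: action name -> utility of that action for one entity.
QValues = Dict[str, float]


def fuse_utilities(per_entity_q: List[QValues], mode: str = "sum") -> QValues:
    """Combine per-entity utilities into one global estimate per action.

    mode="sum" adds the individual utilities; mode="min" keeps the
    utility of the worst-off entity. Both rules are assumptions here.
    """
    if not per_entity_q:
        raise ValueError("need at least one entity")
    if mode == "sum":
        combine = sum
    elif mode == "min":
        combine = min
    else:
        raise ValueError(f"unknown fusion mode: {mode}")
    actions = list(per_entity_q[0])
    return {a: combine(q[a] for q in per_entity_q) for a in actions}


def select_action(per_entity_q: List[QValues], mode: str = "sum") -> str:
    """Arbitrator: pick the action with the highest fused utility."""
    fused = fuse_utilities(per_entity_q, mode)
    return max(fused, key=fused.get)


# Illustrative numbers only: two pedestrians, two candidate actions.
pedestrian_1 = {"brake": -1.0, "go": 0.0}
pedestrian_2 = {"brake": -1.0, "go": -1.5}
print(select_action([pedestrian_1, pedestrian_2], mode="sum"))
print(select_action([pedestrian_1, pedestrian_2], mode="min"))
```

In this toy example the two rules disagree: summing favors "go", while the minimum rule favors "brake" because it weighs the most endangered entity alone. That sensitivity to the combination rule is one way to see why the independence assumption can cost optimality.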

ICRA Conference 2018 Conference Paper

Scalable Decision Making with Sensor Occlusions for Autonomous Driving

  • Maxime Bouton
  • Alireza Nakhaei
  • Kikuo Fujimura
  • Mykel J. Kochenderfer

Autonomous driving in urban areas requires avoiding other road users with only partial observability of the environment. Observations are only partial because obstacles can occlude the field of view of the sensors. The problem of robust and efficient navigation under uncertainty can be framed as a partially observable Markov decision process (POMDP). In order to bypass the computational cost of scaling the formulation to avoiding multiple road users, this paper demonstrates a decomposition method that leverages the optimal avoidance strategy for a single user. We evaluate the performance of two POMDP solution techniques augmented with the decomposition method for scenarios involving a pedestrian crosswalk and an intersection.
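
Under the POMDP framing mentioned in this abstract, the vehicle cannot observe the state directly and instead maintains a belief, that is, a probability distribution over states. The following is a minimal sketch of a discrete Bayesian belief update, not the paper's implementation. The two-state pedestrian model, the action name, and all probabilities are hypothetical and chosen only to show how a weak detection probability (as under occlusion) keeps the belief uncertain.

```python
from typing import Dict

Belief = Dict[str, float]


def update_belief(belief: Belief, action: str, observation: str,
                  transition: dict, observation_model: dict) -> Belief:
    """Discrete Bayes filter: b'(s') is proportional to
    O(o | s', a) * sum_s T(s' | s, a) * b(s).

    transition[s][a] maps next states to probabilities;
    observation_model[s_next][a] maps observations to probabilities.
    """
    new_belief = {}
    for s_next in belief:
        # Prediction step: probability of reaching s_next from the old belief.
        predicted = sum(
            transition[s][action].get(s_next, 0.0) * belief[s] for s in belief
        )
        # Correction step: weight by the likelihood of the observation.
        likelihood = observation_model[s_next][action].get(observation, 0.0)
        new_belief[s_next] = likelihood * predicted
    total = sum(new_belief.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in new_belief.items()}


# Hypothetical occluded-crosswalk model: the pedestrian does not move,
# and a present pedestrian is detected only 30% of the time.
transition = {
    "present": {"drive": {"present": 1.0}},
    "absent": {"drive": {"absent": 1.0}},
}
observation_model = {
    "present": {"drive": {"detected": 0.3, "nothing": 0.7}},
    "absent": {"drive": {"detected": 0.05, "nothing": 0.95}},
}
belief = {"present": 0.5, "absent": 0.5}
belief = update_belief(belief, "drive", "nothing", transition, observation_model)
print(belief)
```

After observing "nothing", the probability that a pedestrian is present drops only modestly below 0.5, because an occluded sensor would often report nothing even when someone is there.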

AAMAS Conference 2018 Conference Paper

Utility Decomposition for Planning under Uncertainty for Autonomous Driving

  • Maxime Bouton

The objective of this research is to provide scalable decision-making algorithms for autonomously navigating urban environments. The vehicle must plan in a stochastic environment with many entities to avoid, rapid changes in driver behavior, and partial observability. Partially observable Markov decision processes (POMDPs) offer a theoretically grounded framework to model such problems. We aim to develop a scalable POMDP formulation that accounts for dynamic occlusions and interactions between entities and that generalizes to a variety of scenarios. This work demonstrates utility fusion and deep reinforcement learning methods to efficiently find optimal policies to navigate occluded urban environments.

AAMAS Conference 2018 Conference Paper

Utility Decomposition with Deep Corrections for Scalable Planning under Uncertainty

  • Maxime Bouton
  • Kyle Julian
  • Alireza Nakhaei
  • Kikuo Fujimura
  • Mykel J. Kochenderfer

Decomposition methods have been proposed in the past to approximate solutions to large sequential decision-making problems. In contexts where an agent interacts with multiple entities, utility decomposition can be used where each individual entity is considered independently. The individual utility functions are then combined in real time to solve the global problem. Although these techniques can perform well empirically, they sacrifice optimality. This paper proposes an approach inspired by multi-fidelity optimization to learn a correction term with a neural network representation. Learning this correction can significantly improve performance. We demonstrate this approach on a pedestrian avoidance problem for autonomous driving. By leveraging strategies to avoid a single pedestrian, the decomposition method can scale to avoid multiple pedestrians. We verify empirically that the proposed correction method leads to a significant improvement over the decomposition method alone and outperforms a policy trained on the full-scale problem without utility decomposition.
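
The idea of a learned correction can be sketched as a global value estimate equal to the decomposed (low-fidelity) estimate plus a learned term. The sketch below deliberately swaps the paper's neural network for a lookup table and uses a Q-learning-style update as the training rule; both choices, along with the class name, parameters, and toy data, are assumptions made for illustration rather than the authors' method.

```python
from collections import defaultdict
from typing import Callable, Hashable, Sequence


class CorrectedQ:
    """Q(s, a) is approximated by fused_q(s, a) + correction(s, a).

    fused_q is the fixed estimate from utility decomposition; only the
    correction is learned. A table stands in for the neural network.
    """

    def __init__(self, fused_q: Callable[[Hashable, str], float],
                 actions: Sequence[str],
                 learning_rate: float = 0.1, discount: float = 0.95):
        self.fused_q = fused_q
        self.actions = list(actions)
        self.learning_rate = learning_rate
        self.discount = discount
        self.correction = defaultdict(float)  # starts at zero everywhere

    def value(self, state: Hashable, action: str) -> float:
        return self.fused_q(state, action) + self.correction[(state, action)]

    def update(self, state, action, reward, next_state, done: bool) -> float:
        """Move the correction toward the one-step bootstrapped target."""
        target = reward
        if not done:
            target += self.discount * max(
                self.value(next_state, a) for a in self.actions
            )
        td_error = target - self.value(state, action)
        self.correction[(state, action)] += self.learning_rate * td_error
        return td_error


# Toy usage: the decomposed estimate is 0 everywhere, but the true
# terminal reward for "go" in state "s0" is 1, so the correction grows.
q = CorrectedQ(fused_q=lambda s, a: 0.0, actions=["brake", "go"],
               learning_rate=0.5)
q.update("s0", "go", reward=1.0, next_state=None, done=True)
q.update("s0", "go", reward=1.0, next_state=None, done=True)
print(q.value("s0", "go"))
```

Because the correction starts at zero, the agent initially behaves exactly like the decomposition method and only departs from it where experience contradicts the decomposed estimate.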