Arrow Research search

Author name cluster

David Isele

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

23 papers
2 author rows

Possible papers (23)

AAAI 2026 · Conference Paper

Measuring What Matters: Scenario-Driven Evaluation for Trajectory Predictors in Autonomous Driving

  • Longchao Da
  • David Isele
  • Hua Wei
  • Manish Saroya

Being able to anticipate the motion of surrounding agents is essential for the safe operation of autonomous driving systems in dynamic situations. While various methods have been proposed for trajectory prediction, current evaluation practice still relies on error-based metrics (e.g., ADE, FDE), which capture accuracy from a post-hoc view but ignore the actual effect the predictor has on the self-driving vehicle (SDV), especially in complex interactive scenarios: a high-quality predictor should not only chase accuracy but also capture all possible directions a neighboring agent might move, to support the SDV's cautious decision-making. Since existing metrics hardly account for this requirement, we propose a comprehensive pipeline that adaptively evaluates a predictor's performance along two dimensions: accuracy and diversity. Based on the criticality of the driving scenario, these two dimensions are dynamically combined into a final score for the predictor. Extensive experiments on a closed-loop benchmark using a real-world dataset show that our pipeline yields a more reasonable evaluation than traditional metrics by better reflecting the correlation between a predictor's evaluation and the autonomous vehicle's driving performance. The pipeline thus offers a robust way to select the predictor that contributes most to the SDV's driving performance.
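
To make the weighting concrete, here is a minimal sketch of how such a two-dimensional score could be computed; the functional forms, the endpoint-spread diversity measure, and the criticality blending rule are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def ade(pred, gt):
    """Average displacement error between one predicted trajectory and the
    ground truth, both shaped (T, 2)."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def accuracy_score(preds, gt):
    """Best-of-K accuracy over the predicted modes, squashed to (0, 1]."""
    return 1.0 / (1.0 + min(ade(p, gt) for p in preds))

def diversity_score(preds):
    """Mean pairwise endpoint spread across modes, squashed to [0, 1)."""
    ends = np.stack([p[-1] for p in preds])               # (K, 2) endpoints
    dists = np.linalg.norm(ends[:, None] - ends[None, :], axis=-1)
    return float(1.0 - np.exp(-dists.mean()))

def scenario_score(preds, gt, criticality):
    """Blend the two dimensions; more critical scenes weight diversity more."""
    w = float(np.clip(criticality, 0.0, 1.0))
    return (1.0 - w) * accuracy_score(preds, gt) + w * diversity_score(preds)
```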

IROS 2025 · Conference Paper

Active Probing with Multimodal Predictions for Motion Planning

  • Darshan Gadginmath
  • Farhad Nawaz
  • Minjun Sung
  • Faizan M. Tariq
  • Sangjae Bae
  • David Isele
  • Fabio Pasqualetti
  • Jovin D'sa

Navigation in dynamic environments requires autonomous systems to reason about uncertainties in the behavior of other agents. In this paper, we introduce a unified framework that combines trajectory planning with multimodal predictions and active probing to enhance decision-making under uncertainty. We develop a novel risk metric that seamlessly integrates multimodal prediction uncertainties through mixture models. When these uncertainties follow a Gaussian mixture distribution, we prove that our risk metric admits a closed-form solution, and is always finite, thus ensuring analytical tractability. To reduce prediction ambiguity, we incorporate an active probing mechanism that strategically selects actions to improve its estimates of behavioral parameters of other agents, while simultaneously handling multimodal uncertainties. We extensively evaluate our framework in autonomous navigation scenarios using the MetaDrive simulation environment. Results demonstrate that our active probing approach successfully navigates complex traffic scenarios with uncertain predictions. Additionally, our framework shows robust performance across diverse traffic agent behavior models, indicating its broad applicability to real-world autonomous navigation challenges.
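
The paper's closed-form risk metric is not reproduced here, but the sketch below shows how a Gaussian-mixture risk can admit an analytic, always-finite value, which is the property the abstract highlights: the expectation of a Gaussian collision kernel under each predicted mode is computable in closed form, and the mixture risk is the probability-weighted sum. The kernel radius and all names are assumptions.

```python
import numpy as np

def mode_risk(ego_pos, mu, cov, r=2.0):
    """Closed-form expectation of the collision kernel
    exp(-||x - ego||^2 / (2 r^2)) under x ~ N(mu, cov), 2-D case."""
    S = cov + (r ** 2) * np.eye(2)
    diff = mu - ego_pos
    m = diff @ np.linalg.solve(S, diff)      # Mahalanobis-style exponent
    # det(S) >= r^4 and m >= 0, so this value always lies in [0, 1].
    return (r ** 2) / np.sqrt(np.linalg.det(S)) * np.exp(-0.5 * m)

def mixture_risk(ego_pos, weights, mus, covs, r=2.0):
    """Risk under a Gaussian-mixture prediction: the weighted sum of the
    per-mode closed forms, hence finite and bounded by construction."""
    return float(sum(w * mode_risk(ego_pos, mu, cov, r)
                     for w, mu, cov in zip(weights, mus, covs)))
```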

ICRA 2025 · Conference Paper

Delayed-Decision Motion Planning in the Presence of Multiple Predictions

  • David Isele
  • Alexandre Miranda Añon
  • Faizan M. Tariq
  • Goro Yeh
  • Avinash T. Singh
  • Sangjae Bae

Reliable automated driving technology is challenged by various sources of uncertainty, in particular the behavioral uncertainty of traffic agents. It is common for traffic agents to have intentions that are unknown to others, leaving an automated vehicle to reason over multiple possible behaviors. This paper formalizes a behavior planning scheme in the presence of multiple possible futures with corresponding probabilities. We present a maximum entropy formulation and show how, under certain assumptions, this allows delayed decision-making to improve safety. The general formulation is then turned into a model predictive control formulation, which is solved as a quadratic program or a set of quadratic programs. We discuss implementation details for improving computation and verify operation in simulation and on a mobile robot.
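
A minimal sketch of the delayed-decision mechanism, under toy assumptions (1-D double-integrator dynamics, made-up references, cvxpy as the QP solver): every predicted future shares the same first few controls, so the committed prefix hedges across futures while the scenario-specific suffixes specialize.

```python
import cvxpy as cp
import numpy as np

T, Tc = 10, 4                     # horizon and the shared ("delayed") prefix
probs = [0.6, 0.4]                # probabilities of the two possible futures
refs = [np.linspace(0, 8, T),     # position reference if future 1 happens
        np.linspace(0, 3, T)]     # position reference if future 2 happens

u_shared = cp.Variable(Tc)        # one control prefix committed to all futures
cost = 0
for p, ref in zip(probs, refs):
    u_branch = cp.Variable(T - Tc)           # future-specific control suffix
    u = cp.hstack([u_shared, u_branch])
    x = cp.cumsum(cp.cumsum(u))              # toy double-integrator positions
    cost += p * (cp.sum_squares(x - ref) + 0.1 * cp.sum_squares(u))

qp = cp.Problem(cp.Minimize(cost))           # a single quadratic program
qp.solve()
print("committed controls:", np.round(u_shared.value, 3))
```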

ICRA 2025 · Conference Paper

Generalized Mission Planning for Heterogeneous Multi-Robot Teams via LLM-Constructed Hierarchical Trees

  • Piyush Gupta
  • David Isele
  • Enna Sachdeva
  • Pin-Hao Huang
  • Behzad Dariush
  • Kwonjoon Lee
  • Sangjae Bae

We present a novel mission-planning strategy for heterogeneous multi-robot teams, taking into account the specific constraints and capabilities of each robot. Our approach employs hierarchical trees to systematically break down complex missions into manageable sub-tasks. We develop specialized APIs and tools, which are utilized by Large Language Models (LLMs) to efficiently construct these hierarchical trees. Once the hierarchical tree is generated, it is further decomposed to create optimized schedules for each robot, ensuring adherence to their individual constraints and capabilities. We demonstrate the effectiveness of our framework through detailed examples covering a wide range of missions, showcasing its flexibility and scalability.
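
As a toy illustration of the data structures involved (not the authors' APIs or tools), the sketch below represents a mission as a hierarchical tree, collects its leaf sub-tasks, and greedily assigns each to a capability-matched robot; per the abstract, the LLM's role would be to construct such a tree from the mission description.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Task:
    """One node of the mission tree; leaves carry the capability they need."""
    name: str
    required_capability: Optional[str] = None   # None for internal nodes
    children: list["Task"] = field(default_factory=list)

def leaves(task: Task) -> list[Task]:
    """Depth-first list of leaf tasks, i.e., the schedulable sub-tasks."""
    if not task.children:
        return [task]
    return [leaf for child in task.children for leaf in leaves(child)]

def assign(mission: Task, robots: dict[str, set[str]]) -> dict[str, list[str]]:
    """Greedy capability-matched assignment, a stand-in for the paper's
    optimized per-robot schedules."""
    schedule = {name: [] for name in robots}
    for leaf in leaves(mission):
        for name, caps in robots.items():
            if leaf.required_capability in caps:
                schedule[name].append(leaf.name)
                break
    return schedule

mission = Task("inspect site", children=[
    Task("map area", "fly"),
    Task("sample soil", "manipulate"),
    Task("deliver sample", "drive"),
])
print(assign(mission, {"uav": {"fly"}, "ugv": {"drive", "manipulate"}}))
```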

ICRA 2024 · Conference Paper

Multi-Profile Quadratic Programming (MPQP) for Optimal Gap Selection and Speed Planning of Autonomous Driving

  • Alexandre Miranda Añon
  • Sangjae Bae
  • Manish Saroya
  • David Isele

Smooth and safe speed planning is imperative for the successful deployment of autonomous vehicles. This paper presents a mathematical formulation for optimal speed planning in autonomous driving, validated in high-fidelity simulations and real-road demonstrations with practical constraints. The algorithm explores the inter-traffic gaps in the time and space domain using a breadth-first search. For each gap, quadratic programming finds an optimal speed profile, synchronizing the time and space pair along with dynamic obstacles. Qualitative and quantitative analysis in CARLA is reported to discuss the smoothness and robustness of the proposed algorithm. Finally, we present a road demonstration result for urban city driving.
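
Here is a minimal sketch of the search-then-optimize pattern described above, with made-up numbers: each candidate gap yields position bounds over the horizon, a small QP smooths a speed profile inside those bounds, and the cheapest feasible gap is selected. The bounds, weights, and kinematics are assumptions, and cvxpy stands in for the paper's QP solver.

```python
import cvxpy as cp
import numpy as np

T, dt, s0 = 20, 0.5, 12.0
t = np.arange(T) * dt
# Two candidate inter-traffic gaps as (lower, upper) position bounds over time,
# stand-ins for what the breadth-first gap search would enumerate.
gaps = [(10.0 + 6.0 * t, 35.0 + 6.0 * t),    # stay in the current gap
        (40.0 + 6.0 * t, 65.0 + 6.0 * t)]    # the gap one vehicle ahead

best = None
for lo, hi in gaps:
    v = cp.Variable(T)                        # speed profile for this gap
    s = s0 + cp.cumsum(v) * dt                # resulting positions (toy model)
    cost = cp.sum_squares(cp.diff(v)) + 0.1 * cp.sum_squares(v - 10.0)
    qp = cp.Problem(cp.Minimize(cost), [s >= lo, s <= hi, v >= 0, v <= 20])
    qp.solve()
    if qp.status == "optimal" and (best is None or qp.value < best[0]):
        best = (qp.value, v.value)            # keep the cheapest feasible gap

print("selected-gap speeds:", None if best is None else np.round(best[1], 2))
```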

ICRA 2023 · Conference Paper

Interaction-Aware Trajectory Planning for Autonomous Vehicles with Analytic Integration of Neural Networks into Model Predictive Control

  • Piyush Gupta
  • David Isele
  • Donggun Lee
  • Sangjae Bae

Autonomous vehicles (AVs) must share the driving space with other drivers and often employ conservative motion planning strategies to ensure safety. These conservative strategies can negatively impact the AV's performance and significantly slow traffic throughput. Therefore, to avoid conservatism, we design an interaction-aware motion planner for the ego vehicle (AV) that interacts with surrounding vehicles to perform complex maneuvers in a locally optimal manner. Our planner uses a neural network-based interactive trajectory predictor and analytically integrates it with model predictive control (MPC). We solve the MPC optimization using the alternating direction method of multipliers (ADMM) and prove the algorithm's convergence. We provide an empirical study and compare our method with a baseline heuristic method.
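
The paper's ADMM splitting is not reproduced here, but as a generic reminder of the iteration pattern it relies on, the toy below alternates a quadratic minimization, a projection step, and a scaled dual update; the problem data and the box constraint are assumptions.

```python
import numpy as np

# Toy ADMM for  min 0.5*||A x - b||^2  s.t.  x = z,  |z| <= 1 (elementwise):
# a stand-in for the paper's split between the MPC cost and the coupling to
# the learned predictor. A, b, and the box constraint are made up.
rng = np.random.default_rng(0)
A, b, rho = rng.normal(size=(8, 4)), rng.normal(size=8), 1.0
x = z = u = np.zeros(4)
lhs = A.T @ A + rho * np.eye(4)                        # constant x-update system
for _ in range(100):
    x = np.linalg.solve(lhs, A.T @ b + rho * (z - u))  # x-minimization step
    z = np.clip(x + u, -1.0, 1.0)                      # z-step: box projection
    u = u + x - z                                      # scaled dual update
print("consensus solution:", np.round(x, 3))
```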

AAAI 2022 · Conference Paper

Recursive Reasoning Graph for Multi-Agent Reinforcement Learning

  • Xiaobai Ma
  • David Isele
  • Jayesh K. Gupta
  • Kikuo Fujimura
  • Mykel J. Kochenderfer

Multi-agent reinforcement learning (MARL) provides an efficient way for simultaneously learning policies for multiple agents interacting with each other. However, in scenarios requiring complex interactions, existing algorithms can suffer from an inability to accurately anticipate the influence of self-actions on other agents. Incorporating an ability to reason about other agents' potential responses can allow an agent to formulate more effective strategies. This paper adopts a recursive reasoning model in a centralized-training-decentralized-execution framework to help learning agents better cooperate with or compete against others. The proposed algorithm, referred to as the Recursive Reasoning Graph (R2G), shows state-of-the-art performance on multiple multi-agent particle and robotics games.

IROS 2022 · Conference Paper

Risk-sensitive MPCs with Deep Distributional Inverse RL for Autonomous Driving

  • Keuntaek Lee
  • David Isele
  • Evangelos A. Theodorou
  • Sangjae Bae

In robot learning from demonstration (LfD), a visual representation of a cost function inferred through Inverse Reinforcement Learning (IRL) provides an intuitive tool for humans to quickly interpret the underlying objectives of the demonstration. The inferred cost function can then be used by controllers, for example Model Predictive Controllers (MPCs). In this work, we improve the recently developed IRL-MPC framework by extending it to a risk-sensitive formulation, making it more applicable to safety-critical applications like autonomous driving. Our risk-sensitive MPCs, together with the distributional costmap, demonstrate lower collision rates in the CARLA simulator on autonomous driving tasks compared to other learning-based baseline methods.
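
One common way to obtain risk sensitivity from a distributional cost estimate (an assumption here, not necessarily the paper's exact formulation) is to plan against the conditional value-at-risk of the per-cell cost rather than its mean:

```python
import numpy as np

def cvar(samples, alpha=0.9):
    """Conditional value-at-risk: the mean of the worst (1 - alpha) tail
    of the cost distribution."""
    var = np.quantile(samples, alpha)        # value-at-risk threshold
    return float(samples[samples >= var].mean())

# Cost samples for one costmap cell from a hypothetical distributional head:
cell_costs = np.random.default_rng(1).gamma(2.0, 2.0, size=1000)
print(f"mean {cell_costs.mean():.2f} vs CVaR@0.9 {cvar(cell_costs):.2f}")
```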

ICRA 2021 · Conference Paper

Anytime Game-Theoretic Planning with Active Reasoning About Humans' Latent States for Human-Centered Robots

  • Ran Tian
  • Liting Sun
  • Masayoshi Tomizuka
  • David Isele

A human-centered robot needs to reason about the cognitive limitations and potential irrationality of its human partner to achieve seamless interactions. This paper proposes an anytime game-theoretic planner that integrates iterative reasoning models, a partially observable Markov decision process, and chance-constrained Monte Carlo belief tree search for robot behavioral planning. Our planner enables a robot to safely and actively reason about its human partner's latent cognitive states (bounded intelligence and irrationality) in real time to better maximize its utility. We validate our approach in an autonomous driving domain where our behavioral planner and a low-level motion controller hierarchically control an autonomous car to negotiate traffic merges. Simulations and user studies are conducted to show our planner's effectiveness.

ICRA 2021 · Conference Paper

Reinforcement Learning for Autonomous Driving with Latent State Inference and Spatial-Temporal Relationships

  • Xiaobai Ma
  • Jiachen Li 0001
  • Mykel J. Kochenderfer
  • David Isele
  • Kikuo Fujimura

Deep reinforcement learning (DRL) provides a promising way for learning navigation in complex autonomous driving scenarios. However, identifying the subtle cues that can indicate drastically different outcomes remains an open problem in designing autonomous systems that operate in human environments. In this work, we show that explicitly inferring the latent state and encoding spatial-temporal relationships in a reinforcement learning framework can help address this difficulty. We encode prior knowledge on the latent states of other drivers through a framework that combines the reinforcement learner with a supervised learner. In addition, we model the influence passing between different vehicles through graph neural networks (GNNs). The proposed framework significantly improves performance in the context of navigating T-intersections compared with state-of-the-art baseline approaches.

IROS 2020 · Conference Paper

A Comprehensive Trajectory Planner for a Person-Following ATV

  • Huckleberry Febbo
  • Jiawei Huang 0006
  • David Isele

This paper presents a trajectory planning algorithm for person following that is more comprehensive than existing algorithms. This algorithm is tailored for a front-wheel-steered vehicle, is designed to follow a person while avoiding collisions with both static and moving obstacles, simultaneously optimizes speed and steering, and minimizes control effort. This algorithm uses nonlinear model predictive control, where the underlying trajectory optimization problem is approximated using a simultaneous method. Results collected in an unknown environment show that the proposed planning algorithm works well with a perception algorithm to follow a person in uneven grass near obstacles and over ditches and curbs, and on asphalt over train tracks and near buildings and cars. Overall, the results indicate that the proposed algorithm can safely follow a person in unknown, dynamic environments.

ICLR 2020 · Conference Paper

CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning

  • Jiachen Yang
  • Alireza Nakhaei
  • David Isele
  • Kikuo Fujimura
  • Hongyuan Zha

A variety of cooperative multi-agent control problems require agents to achieve individual goals while contributing to collective success. This multi-goal multi-agent setting poses difficulties for recent algorithms, which primarily target settings with a single global reward, due to two new challenges: efficient exploration for learning both individual goal attainment and cooperation for others' success, and credit-assignment for interactions between actions and goals of different agents. To address both challenges, we restructure the problem into a novel two-stage curriculum, in which single-agent goal attainment is learned prior to learning multi-agent cooperation, and we derive a new multi-goal multi-agent policy gradient with a credit function for localized credit assignment. We use a function augmentation scheme to bridge value and policy functions across the curriculum. The complete architecture, called CM3, learns significantly faster than direct adaptations of existing algorithms on three challenging multi-goal multi-agent problems: cooperative navigation in difficult formations, negotiating multi-vehicle lane changes in the SUMO traffic simulator, and strategic cooperation in a Checkers environment.

JAIR 2020 · Journal Article

Using Task Descriptions in Lifelong Machine Learning for Improved Performance and Zero-Shot Transfer

  • Mohammad Rostami
  • David Isele
  • Eric Eaton

Knowledge transfer between tasks can improve the performance of learned models, but requires an accurate estimate of inter-task relationships to identify the relevant knowledge to transfer. These inter-task relationships are typically estimated based on training data for each task, which is inefficient in lifelong learning settings where the goal is to learn each consecutive task rapidly from as little data as possible. To reduce this burden, we develop a lifelong learning method based on coupled dictionary learning that utilizes high-level task descriptions to model inter-task relationships. We show that using task descriptors improves the performance of the learned task policies, providing both theoretical justification for the benefit and empirical demonstration of the improvement across a variety of learning problems. Given only the descriptor for a new task, the lifelong learner is also able to accurately predict a model for the new task through zero-shot learning using the coupled dictionary, eliminating the need to gather training data before addressing the task.
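
A minimal numerical sketch of the zero-shot step, assuming learned coupled dictionaries L (policies) and D (descriptors) that share a code per task; ridge regression stands in for the sparse coding of the actual method, and all dimensions are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
d_policy, d_desc, k = 6, 4, 3        # policy dim, descriptor dim, dictionary size
L = rng.normal(size=(d_policy, k))   # policy dictionary (learned over past tasks)
D = rng.normal(size=(d_desc, k))     # descriptor dictionary (coupled to L)

def zero_shot_policy(phi, lam=0.1):
    """Predict a new task's policy parameters from its descriptor alone:
    code the descriptor over D (ridge in place of sparse coding), then
    decode the same code through the coupled policy dictionary L."""
    s = np.linalg.solve(D.T @ D + lam * np.eye(k), D.T @ phi)
    return L @ s

theta_new = zero_shot_policy(rng.normal(size=d_desc))
print("predicted policy parameters:", np.round(theta_new, 3))
```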

ICRA 2019 · Conference Paper

Interaction-Aware Multi-Agent Reinforcement Learning for Mobile Agents with Individual Goals

  • Anahita Mohseni-Kabir
  • David Isele
  • Kikuo Fujimura

In a multi-agent setting, the optimal policy of a single agent depends largely on the behavior of other agents. We investigate the problem of multi-agent reinforcement learning, focusing on decentralized learning in non-stationary domains for mobile robot navigation. We identify one cause of the difficulty of training non-stationary policies, mutual adaptation to sub-optimal behaviors, and use it to motivate a curriculum-based strategy for learning interactive policies. The curriculum has two stages. First, the agent leverages policy gradient algorithms to learn a policy that is capable of achieving multiple goals. Second, the agent learns a modifier policy for interacting with other agents in a multi-agent setting. We evaluate our approach on both an autonomous driving lane-change domain and a robot navigation domain.

ICRA 2019 · Conference Paper

Uncertainty-Aware Data Aggregation for Deep Imitation Learning

  • Yuchen Cui
  • David Isele
  • Scott Niekum
  • Kikuo Fujimura

Estimating statistical uncertainties allows autonomous agents to communicate their confidence during task execution and is important for applications in safety-critical domains such as autonomous driving. In this work, we present the uncertainty-aware imitation learning (UAIL) algorithm for improving end-to-end control systems via data aggregation. UAIL applies Monte Carlo Dropout to estimate uncertainty in the control output of end-to-end systems, using states where it is uncertain to selectively acquire new training data. In contrast to prior data aggregation algorithms that force human experts to visit sub-optimal states at random, UAIL can anticipate its own mistakes and switch control to the expert in order to prevent visiting a series of sub-optimal states. Our experimental results from simulated driving tasks demonstrate that our proposed uncertainty estimation method can be leveraged to reliably predict infractions. Our analysis shows that UAIL outperforms existing data aggregation algorithms on a series of benchmark tasks.
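
A minimal sketch of the Monte Carlo Dropout uncertainty gate, in PyTorch under assumed dimensions and threshold: dropout is left active at inference, the spread of the control output across stochastic forward passes estimates uncertainty, and high spread triggers handover to the expert.

```python
import torch
import torch.nn as nn

# A stand-in end-to-end control head; the real system maps images to controls.
net = nn.Sequential(nn.Linear(10, 64), nn.ReLU(),
                    nn.Dropout(p=0.2), nn.Linear(64, 1))

def mc_dropout_std(model, x, passes=30):
    """Keep dropout active at inference and measure the spread of the
    control output across stochastic forward passes."""
    model.train()                        # enables dropout outside training
    with torch.no_grad():
        outs = torch.stack([model(x) for _ in range(passes)])
    return outs.std(dim=0)

x = torch.randn(1, 10)                   # a stand-in state observation
if mc_dropout_std(net, x).item() > 0.05: # threshold is an assumption
    print("uncertain: hand control to the expert and record a correction")
else:
    print("confident: execute the policy's control output")
```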

AAAI 2018 · Short Paper

Comparing Reward Shaping, Visual Hints, and Curriculum Learning

  • Rey Pocius
  • David Isele
  • Mark Roberts
  • David Aha

Common approaches to learning complex tasks in reinforcement learning include reward shaping, environmental hints, or a curriculum, yet few studies examine how these compare to each other, when one might be preferred, or how they may complement each other. As a first step in this direction, we compare reward shaping, hints, and curricula for a Deep RL agent in the game of Minecraft. We seek to answer which of reward shaping, visual hints, or curricula has the most impact on performance, measured as the time to reach the target, the distance from the target, the cumulative reward, and the number of actions taken. Our analyses show that performance is most impacted by the curriculum used and by visual hints; shaping had less impact. For similar navigation tasks, the results suggest that designing an effective curriculum and providing appropriate hints improve performance the most.

ICRA 2018 · Conference Paper

Navigating Occluded Intersections with Autonomous Vehicles Using Deep Reinforcement Learning

  • David Isele
  • Reza Rahimi
  • Akansel Cosgun
  • Kaushik Subramanian
  • Kikuo Fujimura

Providing an efficient strategy to navigate safely through unsignaled intersections is a difficult task that requires determining the intent of other drivers. We explore the effectiveness of Deep Reinforcement Learning to handle intersection problems. Using recent advances in Deep RL, we are able to learn policies that surpass the performance of a commonly-used heuristic approach in several metrics, including task completion time and goal success rate, but have limited ability to generalize. We then explore a system's ability to learn active sensing behaviors to enable navigating safely in the case of occlusions. Our analysis provides insight into the intersection handling problem; the solutions learned by the network point out several shortcomings of current rule-based methods, and the failures of our current deep reinforcement learning system point to future research directions.

IROS 2018 · Conference Paper

Safe Reinforcement Learning on Autonomous Vehicles

  • David Isele
  • Alireza Nakhaei
  • Kikuo Fujimura

There have been numerous advances in reinforcement learning, but the typically unconstrained exploration of the learning process prevents the adoption of these methods in many safety-critical applications. Recent work in safe reinforcement learning uses idealized models to achieve their guarantees, but these models do not easily accommodate the stochasticity or high-dimensionality of real-world systems. We investigate how prediction provides a general and intuitive framework to constrain exploration, and show how it can be used to safely learn intersection handling behaviors on an autonomous vehicle.
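
A toy version of the idea, with made-up kinematics and margins: a constant-velocity prediction of the other agent screens the learner's action set before exploration, so only actions whose predicted time-to-collision clears a safety margin can ever be sampled.

```python
import numpy as np

def time_to_collision(ego_speed, accel, gap, other_speed, horizon=5.0, dt=0.1):
    """Roll a constant-velocity prediction of the other agent forward and
    return the first time the predicted gap closes (inf if it never does)."""
    s_ego, s_other, v = 0.0, gap, ego_speed
    for t in np.arange(0.0, horizon, dt):
        v = max(0.0, v + accel * dt)
        s_ego += v * dt
        s_other += other_speed * dt
        if s_ego >= s_other:
            return t
    return np.inf

def safe_actions(actions, ego_speed, gap, other_speed, min_ttc=3.0):
    """Mask the action set before exploration: only actions whose predicted
    time-to-collision clears the margin may be sampled by the learner."""
    return [a for a in actions
            if time_to_collision(ego_speed, a, gap, other_speed) > min_ttc]

# Accelerations of -2, 0, +2 m/s^2; +2 closes the 13 m gap too fast.
print(safe_actions([-2.0, 0.0, 2.0], ego_speed=8.0, gap=13.0, other_speed=5.0))
```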

AAAI 2018 · Conference Paper

Selective Experience Replay for Lifelong Learning

  • David Isele
  • Akansel Cosgun

Deep reinforcement learning has emerged as a powerful tool for a variety of learning tasks; however, deep networks typically exhibit forgetting when learning multiple tasks in sequence. To mitigate forgetting, we propose an experience replay process that augments the standard FIFO buffer and selectively stores experiences in a long-term memory. We explore four strategies for selecting which experiences to store: favoring surprise, favoring reward, matching the global training distribution, and maximizing coverage of the state space. We show that distribution matching successfully prevents catastrophic forgetting and is consistently the best approach across all domains tested. While distribution matching has better and more consistent performance, we identify one case in which coverage maximization is beneficial: when tasks that receive less training are more important. Overall, our results show that selective experience replay, when suitable selection algorithms are employed, can prevent catastrophic forgetting.
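
As one concrete way to realize the distribution-matching strategy (reservoir sampling keeps the long-term memory an approximately uniform sample of everything seen; the class layout is an assumption, not the paper's code):

```python
import random

class SelectiveReplay:
    """Long-term episodic memory kept alongside the usual FIFO buffer.
    Reservoir sampling makes the stored set an (approximately) uniform
    sample of all experiences seen so far, matching the global training
    distribution."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.memory = []
        self.seen = 0

    def add(self, experience):
        self.seen += 1
        if len(self.memory) < self.capacity:
            self.memory.append(experience)
        else:
            j = random.randrange(self.seen)   # classic reservoir step
            if j < self.capacity:
                self.memory[j] = experience   # evict uniformly at random

    def sample(self, k):
        return random.sample(self.memory, min(k, len(self.memory)))

buf = SelectiveReplay(capacity=100)
for step in range(10_000):
    buf.add(("state", "action", float(step % 7), "next_state"))
print(len(buf.memory), buf.sample(2))
```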

AAAI 2017 · Short Paper

Representations for Continuous Learning

  • David Isele

Systems deployed in unstructured environments must be able to adapt to novel situations. This requires the ability to perform in domains that may be vastly different from training domains. My dissertation focuses on the representations used in lifelong learning and how these representations enable predictions and knowledge sharing over time, allowing an agent to continuously learn and adapt in changing environments. Specifically, my contributions will enable lifelong learning systems to efficiently accumulate data, use prior knowledge to predict models for novel tasks, and alter existing models to account for changes in the environment.

IROS 2016 · Conference Paper

Lifelong learning for disturbance rejection on mobile robots

  • David Isele
  • José-Marcio Luna
  • Eric Eaton
  • Gabriel Victor de la Cruz
  • James Irwin
  • Brandon Kallaher
  • Matthew E. Taylor

No two robots are exactly the same: even for a given model of robot, different units will require slightly different controllers. Furthermore, because robots change and degrade over time, a controller will need to change over time to remain optimal. This paper leverages lifelong learning in order to learn controllers for different robots. In particular, we show that by learning a set of control policies over robots with different (unknown) motion models, we can quickly adapt to changes in the robot, or learn a controller for a new robot with a unique set of disturbances. Furthermore, the approach is completely model-free, allowing us to apply this method to robots that have not been, or cannot be, fully modeled.

IJCAI 2016 · Conference Paper

Using Task Features for Zero-Shot Knowledge Transfer in Lifelong Learning

  • David Isele
  • Mohammad Rostami
  • Eric Eaton

Knowledge transfer between tasks can improve the performance of learned models, but requires an accurate estimate of the inter-task relationships to identify the relevant knowledge to transfer. These inter-task relationships are typically estimated based on training data for each task, which is inefficient in lifelong learning settings where the goal is to learn each consecutive task rapidly from as little data as possible. To reduce this burden, we develop a lifelong reinforcement learning method based on coupled dictionary learning that incorporates high-level task descriptors to model the inter-task relationships. We show that using task descriptors improves the performance of the learned task policies, providing both theoretical justification for the benefit and empirical demonstration of the improvement across a variety of dynamical control problems. Given only the descriptor for a new task, the lifelong learner is also able to accurately predict the task policy through zero-shot learning using the coupled dictionary, eliminating the need to pause to gather training data before addressing the task.