Author name cluster

David Hsu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

57 papers

2 author rows

AAAI Conference 2026 Conference Paper

10 Open Challenges Steering the Future of Vision-Language-Action Models

Soujanya Poria
Navonil Majumder
Chia-Yu Hung
Amir Ali Bagherzadeh
Chuan Li
Kenneth Kwok
Ziwei Wang
Cheston Tan

Due to their ability of follow natural language instructions, vision-language-action (VLA) models are increasingly preva- lent in the embodied AI arena, following the widespread suc- cess of their precursors—LLMs and VLMs. In this paper, we discuss 10 principal milestones in the ongoing develop- ment of VLA models—multimodality, reasoning, data, eval- uation, cross-robkot action generalization, efficiency, whole- body coordination, safety, agents, and coordination with hu- mans. Furthermore, we discuss the emerging trends of us- ing spatial understanding, modeling world dynamics, post training, and data synthesis—all aiming to reach these mile- stones. Through these discussions, we hope to bring attention to the research avenues that may accelerate the development of VLA models into wider acceptability.

PDF Details DOI

ICRA Conference 2025 Conference Paper

General-Purpose Clothes Manipulation with Semantic Keypoints

Yuhong Deng
David Hsu

Clothes manipulation is a critical capability for household robots; yet, existing methods are often confined to specific tasks, such as folding or flattening, due to the complex high-dimensional geometry of deformable fabric. This paper presents CLothes mAnipulation with Semantic keyPoints (CLASP) for general-purpose clothes manipulation, which enables the robot to perform diverse manipulation tasks over different types of clothes. The key idea of CLASP is semantic keypoints-e. g. , “right shoulder”, “left sleeve”, etc. -a sparse spatial-semantic representation that is salient for both perception and action. Semantic keypoints of clothes can be effectively extracted from depth images and are sufficient to represent a broad range of clothes manipulation policies. CLASP leverages semantic keypoints to bridge LLM-powered task planning and low-level action execution in a two-level hierarchy. Extensive simulation experiments show that CLASP outperforms baseline methods across diverse clothes types in both seen and unseen tasks. Further, experiments with a Kinova dual-arm system on four distinct tasks-folding, flattening, hanging, and placing-confirm CLASP's performance on a real robot.

ICLR Conference 2025 Conference Paper

Neuralized Markov Random Field for Interaction-Aware Stochastic Human Trajectory Prediction

Zilin Fang
David Hsu
Gim Hee Lee

Interactive human motions and the continuously changing nature of intentions pose significant challenges for human trajectory prediction. In this paper, we present a neuralized Markov random field (MRF)-based motion evolution method for probabilistic interaction-aware human trajectory prediction. We use MRF to model each agent's motion and the resulting crowd interactions over time, hence is robust against noisy observations and enables group reasoning. We approximate the modeled distribution using two conditional variational autoencoders (CVAEs) for efficient learning and inference. Our proposed method achieves state-of-the-art performance on ADE/FDE metrics across two dataset categories: overhead datasets ETH/UCY, SDD, and NBA, and ego-centric JRDB. Furthermore, our approach allows for real-time stochastic inference in bustling environments, making it well-suited for a 30FPS video setting. We open-source our codes at: https://github.com/AdaCompNUS/NMRF_TrajectoryPrediction.git

AAMAS Conference 2025 Conference Paper

On the Effective Horizon of Inverse Reinforcement Learning

Yiqing Xu
Finale Doshi-Velez
David Hsu

Inverse reinforcement learning (IRL) algorithms often rely on (forward) reinforcement learning or planning, over a given time horizon, to compute an approximately optimal policy for a hypothesized reward function; they then match this policy with expert demonstrations. The time horizon plays a critical role in determining both the accuracy of reward estimates and the computational e"ciency of IRL algorithms. Interestingly, an e! ective time horizon shorter than the ground-truth value often produces better results faster. This work formally analyzes this phenomenon and provides an explanation: the time horizon controls the complexity of an induced policy class and mitigates over! tting with limited data. This analysis provides a guide for the principled choice of the e#ective horizon for IRL. It also prompts us to re-examine the classic IRL formulation: it is more natural to learn jointly the reward and the e#ective horizon rather than the reward alone with a given horizon. To validate our! ndings, we implement a cross-validation extension and the experimental results support the theoretical analysis. The project page and code are publicly available.

ICRA Conference 2025 Conference Paper

Robi Butler: Multimodal Remote Interaction with a Household Robot Assistant

Anxing Xiao
Nuwan Janaka
Tianrun Hu
Anshul Gupta
Kaixin Li
Cunjun Yu
David Hsu

Imagine a future when we can Zoom-call a robot to manage household chores remotely. This work takes one step in this direction. Robi Butler is a new household robot assistant that enables seamless multimodal remote interaction. It allows the human user to monitor its environment from a first-person view, issue voice or text commands, and specify target objects through hand-pointing gestures. At its core, a high-level behavior module, powered by Large Language Models (LLMs), interprets multimodal instructions to generate multistep action plans. Each plan consists of open-vocabulary primitives supported by vision-language models, enabling the robot to process both textual and gestural inputs. Zoom provides a convenient interface to implement remote interactions between the human and the robot. The integration of these components allows Robi Butler to ground remote multimodal instructions in real-world home environments in a zero-shot manner. We evaluated the system on various household tasks, demonstrating its ability to execute complex user commands with multimodal inputs. We also conducted a user study to examine how multimodal interaction influences user experiences in remote human-robot interaction. These results suggest that with the advances in robot foundation models, we are moving closer to the reality of remote household robot assistants.

ICRA Conference 2024 Conference Paper

Scene Action Maps: Behavioural Maps for Navigation without Metric Information

Joel Loo
David Hsu

Humans are remarkable in their ability to navigate without metric information. We can read abstract 2D maps, such as floor-plans or hand-drawn sketches, and use them to navigate in unseen rich 3D environments, without requiring prior traversals to map out these scenes in detail. We posit that this is enabled by the ability to represent the environment abstractly as interconnected navigational behaviours, e. g. , "follow the corridor" or "turn right", while avoiding detailed, accurate spatial information at the metric level. We introduce the Scene Action Map (SAM), a behavioural topological graph, and propose a learnable map-reading method, which parses a variety of 2D maps into SAMs. Map-reading extracts salient information about navigational behaviours from the overlooked wealth of pre-existing, abstract and inaccurate maps, ranging from floor-plans to sketches. We evaluate the performance of SAMs for navigation, by building and deploying a behavioural navigation stack on a quadrupedal robot. Videos and more information is available at: https://scene-action-maps.github.io.

ICLR Conference 2023 Conference Paper

DaxBench: Benchmarking Deformable Object Manipulation with Differentiable Physics

Siwei Chen 0003
Yiqing Xu
Cunjun Yu
Linfeng Li 0001
Xiao Ma 0006
Zhongwen Xu
David Hsu

Deformable object manipulation (DOM) is a long-standing challenge in robotics and has attracted significant interest recently. This paper presents DaXBench, a differentiable simulation framework for DOM. While existing work often focuses on a specific type of deformable objects, DaXBench supports fluid, rope, cloth ...; it provides a general-purpose benchmark to evaluate widely different DOM methods, including planning, imitation learning, and reinforcement learning. DaXBench combines recent advances in deformable object simulation with JAX, a high-performance computational framework. All DOM tasks in DaXBench are wrapped with the OpenAI Gym API for easy integration with DOM algorithms. We hope that DaXBench provides to the research community a comprehensive, standardized benchmark and a valuable tool to support the development and evaluation of new DOM methods. The code and video are available online.

ICRA Conference 2023 Conference Paper

Differentiable Parsing and Visual Grounding of Natural Language Instructions for Object Placement

Zirui Zhao
Wee Sun Lee
David Hsu

We present a new method, PARsing And visual GrOuNding (PARAGON), for grounding natural language in object placement tasks. Natural language generally describes objects and spatial relations with compositionality and ambiguity, two major obstacles to effective language grounding. For compositionality, Paragon parses a language instruction into an object-centric graph representation to ground objects individually. For ambiguity, Paragon uses a novel particle-based graph neural network to reason about object placements with uncertainty. Essentially, Paragon integrates a parsing algorithm into a probabilistic, data-driven learning framework. It is fully differentiable and trained end-to-end from data for robustness against complex, ambiguous language input.

NeurIPS Conference 2023 Conference Paper

Large Language Models as Commonsense Knowledge for Large-Scale Task Planning

Zirui Zhao
Wee Sun Lee
David Hsu

Large-scale task planning is a major challenge. Recent work exploits large language models (LLMs) directly as a policy and shows surprisingly interesting results. This paper shows that LLMs provide a commonsense model of the world in addition to a policy that acts on it. The world model and the policy can be combined in a search algorithm, such as Monte Carlo Tree Search (MCTS), to scale up task planning. In our new LLM-MCTS algorithm, the LLM-induced world model provides a commonsense prior belief for MCTS to achieve effective reasoning; the LLM-induced policy acts as a heuristic to guide the search, vastly improving search efficiency. Experiments show that LLM-MCTS outperforms both MCTS alone and policies induced by LLMs (GPT2 and GPT3. 5) by a wide margin, for complex, novel tasks. Further experiments and analyses on multiple tasks -- multiplication, travel planning, object rearrangement -- suggest minimum description length (MDL) as a general guiding principle: if the description length of the world model is substantially smaller than that of the policy, using LLM as a world model for model-based planning is likely better than using LLM solely as a policy.

NeurIPS Conference 2023 Conference Paper

What Truly Matters in Trajectory Prediction for Autonomous Driving?

Tran Phong
Haoran Wu
Cunjun Yu
Panpan Cai
Sifa Zheng
David Hsu

Trajectory prediction plays a vital role in the performance of autonomous driving systems, and prediction accuracy, such as average displacement error (ADE) or final displacement error (FDE), is widely used as a performance metric. However, a significant disparity exists between the accuracy of predictors on fixed datasets and driving performance when the predictors are used downstream for vehicle control, because of a dynamics gap. In the real world, the prediction algorithm influences the behavior of the ego vehicle, which, in turn, influences the behaviors of other vehicles nearby. This interaction results in predictor-specific dynamics that directly impacts prediction results. In fixed datasets, since other vehicles' responses are predetermined, this interaction effect is lost, leading to a significant dynamics gap. This paper studies the overlooked significance of this dynamics gap. We also examine several other factors contributing to the disparity between prediction performance and driving performance. The findings highlight the trade-off between the predictor's computational efficiency and prediction accuracy in determining real-world driving performance. In summary, an interactive, task-driven evaluation protocol for trajectory prediction is crucial to capture its effectiveness for autonomous driving. Source code along with experimental settings is available online (https: //whatmatters23. github. io/).

ICRA Conference 2022 Conference Paper

Deep Visual Navigation under Partial Observability

Bo Ai 0004
Wei Gao
Vinay
David Hsu

How can a robot navigate successfully in rich and diverse environments, indoors or outdoors, along office corridors or trails on the grassland, on the flat ground or the staircase? To this end, this work aims to address three challenges: (i) complex visual observations, (ii) partial observability of local visual sensing, and (iii) multimodal robot behaviors conditioned on both the local environment and the global navigation objective. We propose to train a neural network (NN) controller for local navigation via imitation learning. To tackle complex visual observations, we extract multi-scale spatial representations through CNNs. To tackle partial observability, we aggregate multi-scale spatial information over time and encode it in LSTMs. To learn multimodal behaviors, we use a separate memory module for each behavior mode. Importantly, we integrate the multiple neural network modules into a unified controller that achieves robust performance for visual navigation in complex, partially observable environments. We implemented the controller on the quadrupedal Spot robot and evaluated it on three challenging tasks: adversarial pedestrian avoidance, blind-spot obstacle avoidance, and elevator riding. The experiments show that the proposed NN architecture significantly improves navigation performance.

ICRA Conference 2022 Conference Paper

Learning Latent Graph Dynamics for Visual Manipulation of Deformable Objects

Xiao Ma 0006
David Hsu
Wee Sun Lee

Manipulating deformable objects, such as ropes and clothing, is a long-standing challenge in robotics, because of their large degrees of freedom, complex non-linear dynamics, and self-occlusion in visual perception. The key difficulty is a suitable representation, rich enough to capture the object shape, dynamics for manipulation and yet simple enough to be estimated reliably from visual observations. This work aims to learn latent Graph dynamics for DefOrmable Object Manipulation (G-DOOM). G-DOOM approximates a deformable object as a sparse set of interacting keypoints, which are extracted automatically from images via unsupervised learning. It learns a graph neural network that captures abstractly the geometry and the interaction dynamics of the keypoints. To handle object self-occlusion, G-DOOM uses a recurrent neural network to track the keypoints over time and condition their interactions on the history. We then train the resulting recurrent graph dynamics model through contrastive learning in a high-fidelity simulator. For manipulation planning, G-DOOM reasons explicitly about the learned dynamics model through model-predictive control applied at each keypoint. Preliminary experiments of G-DOOM on a set of challenging rope and cloth manipulation tasks indicate strong performance, compared with state-of-the-art methods. Although trained in a simulator, G-DOOM transfers directly to a real robot for both rope and cloth manipulation 1 1 Demo video available online at https://youtu.be/oCfbNMx2sQI.

NeurIPS Conference 2022 Conference Paper

Receding Horizon Inverse Reinforcement Learning

Yiqing Xu
Wei Gao
David Hsu

Inverse reinforcement learning (IRL) seeks to infer a cost function that explains the underlying goals and preferences of expert demonstrations. This paper presents Receding Horizon Inverse Reinforcement Learning (RHIRL), a new IRL algorithm for high-dimensional, noisy, continuous systems with black-box dynamic models. RHIRL addresses two key challenges of IRL: scalability and robustness. To handle high-dimensional continuous systems, RHIRL matches the induced optimal trajectories with expert demonstrations locally in a receding horizon manner and stitches'' together the local solutions to learn the cost; it thereby avoids the curse of dimensionality''. This contrasts sharply with earlier algorithms that match with expert demonstrations globally over the entire high-dimensional state space. To be robust against imperfect expert demonstrations and control noise, RHIRL learns a state-dependent cost function ``disentangled'' from system dynamics under mild conditions. Experiments on benchmark tasks show that RHIRL outperforms several leading IRL algorithms in most instances. We also prove that the cumulative error of RHIRL grows linearly with the task duration.

IJCAI Conference 2021 Conference Paper

Hindsight Trust Region Policy Optimization

Hanbo Zhang
Site Bai
Xuguang Lan
David Hsu
Nanning Zheng

Reinforcement Learning (RL) with sparse rewards is a major challenge. We pro- pose Hindsight Trust Region Policy Optimization (HTRPO), a new RL algorithm that extends the highly successful TRPO algorithm with hindsight to tackle the challenge of sparse rewards. Hindsight refers to the algorithm’s ability to learn from information across goals, including past goals not intended for the current task. We derive the hindsight form of TRPO, together with QKL, a quadratic approximation to the KL divergence constraint on the trust region. QKL reduces variance in KL divergence estimation and improves stability in policy updates. We show that HTRPO has similar convergence property as TRPO. We also present Hindsight Goal Filtering (HGF), which further improves the learning performance for suitable tasks. HTRPO has been evaluated on various sparse-reward tasks, including Atari games and simulated robot control. Experimental results show that HTRPO consistently outperforms TRPO, as well as HPG, a state-of-the-art policy 14 gradient algorithm for RL with sparse rewards.

PDF Details DOI

ICRA Conference 2021 Conference Paper

Interactive Planning for Autonomous Urban Driving in Adversarial Scenarios

Yuanfu Luo
Malika Meghjani
Qi Heng Ho
David Hsu
Daniela Rus

Autonomous urban driving among human-driven cars requires a holistic understanding of road rules, driver intents and driving styles. This is challenging as a short-term, single instance, driver intent of lane change may not correspond to their driving styles for a longer duration. This paper presents an interactive behavior planner which accounts for road context, short-term driver intent, and long-term driving style to infer beliefs over the latent states of surrounding vehicles. We use a specialized Partially Observable Markov Decision Process to provide risk-averse decisions. Specifically, we consider adversarial driving scenarios caused by irrational drivers to validate the robustness of our proposed interactive behavior planner in simulation as well as on a full-size self-driving car. Our experimental results show that our algorithm enables safer and more travel time-efficient autonomous driving compared to baselines even in adversarial scenarios.

ICLR Conference 2020 Conference Paper

Discriminative Particle Filter Reinforcement Learning for Complex Partial observations

Xiao Ma 0006
Péter Karkus
David Hsu
Wee Sun Lee
Nan Ye

Deep reinforcement learning is successful in decision making for sophisticated games, such as Atari, Go, etc. However, real-world decision making often requires reasoning with partial information extracted from complex visual observations. This paper presents Discriminative Particle Filter Reinforcement Learning (DPFRL), a new reinforcement learning framework for complex partial observations. DPFRL encodes a differentiable particle filter in the neural network policy for explicit reasoning with partial observations over time. The particle filter maintains a belief using learned discriminative update, which is trained end-to-end for decision making. We show that using the discriminative update instead of standard generative models results in significantly improved performance, especially for tasks with complex visual observations, because they circumvent the difficulty of modeling complex observations that are irrelevant to decision making. In addition, to extract features from the particle belief, we propose a new type of belief feature based on the moment generating function. DPFRL outperforms state-of-the-art POMDP RL models in Flickering Atari Games, an existing POMDP RL benchmark, and in Natural Flickering Atari Games, a new, more challenging POMDP RL benchmark introduced in this paper. Further, DPFRL performs well for visual navigation with real-world data in the Habitat environment.

AAAI Conference 2020 Conference Paper

Particle Filter Recurrent Neural Networks

Xiao Ma
Peter Karkus
David Hsu
Wee Sun Lee

Recurrent neural networks (RNNs) have been extraordinarily successful for prediction with sequential data. To tackle highly variable and multi-modal real-world data, we introduce Particle Filter Recurrent Neural Networks (PF-RNNs), a new RNN family that explicitly models uncertainty in its internal structure: while an RNN relies on a long, deterministic latent state vector, a PF-RNN maintains a latent state distribution, approximated as a set of particles. For effective learning, we provide a fully differentiable particle ﬁlter algorithm that updates the PF-RNN latent state distribution according to the Bayes rule. Experiments demonstrate that the proposed PF- RNNs outperform the corresponding standard gated RNNs on a synthetic robot localization dataset and 10 real-world sequence prediction datasets for text classiﬁcation, stock price prediction, etc.

ICRA Conference 2020 Conference Paper

SUMMIT: A Simulator for Urban Driving in Massive Mixed Traffic

Panpan Cai
Yiyuan Lee
Yuanfu Luo
David Hsu

Autonomous driving in an unregulated urban crowd is an outstanding challenge, especially, in the presence of many aggressive, high-speed traffic participants. This paper presents SUMMIT, a high-fidelity simulator that facilitates the development and testing of crowd-driving algorithms. By leveraging the open-source OpenStreetMap map database and a heterogeneous multi-agent motion prediction model developed in our earlier work, SUMMIT simulates dense, unregulated urban traffic for heterogeneous agents at any worldwide locations that OpenStreetMap supports. SUMMIT is built as an extension of CARLA and inherits from it the physics and visual realism for autonomous driving simulation. SUMMIT supports a wide range of applications, including perception, vehicle control and planning, and end-to-end learning. We provide a context-aware planner together with benchmark scenarios and show that SUMMIT generates complex, realistic traffic behaviors in challenging crowd-driving settings.

IROS Conference 2019 Conference Paper

Context and Intention Aware Planning for Urban Driving

Malika Meghjani
Yuanfu Luo
Qi Heng Ho
Panpan Cai
Shashwat Verma
Daniela Rus
David Hsu

We present a novel autonomous driving system which uses the road contextual information and intentions of other road users for urban driving. Unlike highways, urban environments require the drivers to follow traffic signs and signals while using their best judgment for anomalous situations. In such scenarios, a self-driving car needs to understand and take into account the uncertainties in the environment to plan and decide its action accordingly. Our planner models the intentions of the surrounding vehicles leveraging a neural network, and integrates the road contextual information to reduce its environment uncertainties and also speed up the decision making process. We validate our planner in simulation and in a real urban environment. Our experimental results show that integrating intention inference and road contextual information for prediction, planning and decision making help improve safety and efficiency of our autonomous driving system.

ICRA Conference 2019 Conference Paper

Factored Contextual Policy Search with Bayesian optimization

Robert Pinsler
Péter Karkus
Andras G. Kupcsik
David Hsu
Wee Sun Lee

Scarce data is a major challenge to scaling robot learning to truly complex tasks, as we need to generalize locally learned policies over different task contexts. Contextual policy search offers data-efficient learning and generalization by explicitly conditioning the policy on a parametric context space. In this paper, we further structure the contextual policy representation. We propose to factor contexts into two components: target contexts that describe the task objectives, e. g. target position for throwing a ball; and environment contexts that characterize the environment, e. g. initial position or mass of the ball. Our key observation is that experience can be directly generalized over target contexts. We show that this can be easily exploited in contextual policy search algorithms. In particular, we apply factorization to a Bayesian optimization approach to contextual policy search both in sampling-based and active learning settings. Our simulation results show faster learning and better generalization in various robotic domains. See our supplementary video: https://youtu.be/IIJTbBAOufDY.

ICRA Conference 2019 Conference Paper

Learning To Grasp Under Uncertainty Using POMDPs

Neha P. Garg
David Hsu
Wee Sun Lee

Robust object grasping under uncertainty is an essential capability of service robots. Many existing approaches rely on far-field sensors, such as cameras, to compute a grasp pose and perform open-loop grasp after placing gripper under the pose. This often fails as a result of sensing or environment uncertainty. This paper presents a principled, general and efficient approach to adaptive grasping, using both tactile and visual sensing as feedback. We first model adaptive grasping as a partially observable Markov decision process (POMDP), which handles uncertainty naturally. We solve the POMDP for sampled objects from a set, in order to generate data for learning. Finally, we train a grasp policy, represented as a deep recurrent neural network (RNN), in simulation through imitation learning. By combining model-based POMDP planning and imitation learning, the proposed approach achieves robustness under uncertainty, generalization over many objects, and fast execution. In particular, we show that modeling only a small sample of objects enables us to learn a robust strategy to grasp previously unseen objects of varying shapes and recover from failure over multiple steps. Experiments on the G3DB object dataset in simulation and a smaller object set with a real robot indicate promising results.

IJCAI Conference 2019 Conference Paper

Trust Dynamics and Transfer across Human-Robot Interaction Tasks: Bayesian and Neural Computational Models

Harold Soh
Shu Pan
Min Chen
David Hsu

This work contributes both experimental findings and novel computational human-robot trust models for multi-task settings. We describe Bayesian non-parametric and neural models, and compare their performance on data collected from real-world human-subjects study. Our study spans two distinct task domains: household tasks performed by a Fetch robot, and a virtual reality driving simulation of an autonomous vehicle performing a variety of maneuvers. We find that human trust changes and transfers across tasks in a structured manner based on perceived task characteristics. Our results suggest that task-dependent functional trust models capture human trust in robot capabilities more accurately, and trust transfer across tasks can be inferred to a good degree. We believe these models are key for enabling trust-based robot decision-making for natural human-robot interaction.

YNICL Journal 2018 Journal Article

Psychomotor slowing is associated with anomalies in baseline and prospective large scale neural networks in youth with epilepsy

Camille Garcia-Ramos
Kevin Dabbs
Elizabeth Meyerand
Vivek Prabhakaran
David Hsu
Jana Jones
Michael Seidenberg
Bruce Hermann

Purpose: Psychomotor slowing is a common but understudied cognitive impairment in epilepsy. Here we test the hypothesis that psychomotor slowing is associated with alterations in brain status reflected through analysis of large scale structural networks. We test the hypothesis that children with epilepsy with cognitive slowing at diagnosis will exhibit a cross-sectional and prospective pattern of altered brain development. Methods: A total of 78 children (age 8-18) with new/recent onset idiopathic epilepsies underwent 1.5 T MRI with network analysis of cortical, subcortical and cerebellar volumes. Children with epilepsy were divided into slow and fast psychomotor speed groups (adjusted for age, intelligence and epilepsy syndrome). Results: At baseline, slow-speed performers (SSP) presented lower modularity, lower global efficiency, higher transitivity, and lower number of hubs than fast-speed performers (FSP). Community structure in SSP exhibited poor association between cortical regions and both subcortical structures and the cerebellum while FSP presented well-defined communities. Prospectively, SSP displayed lower modularity but higher global efficiency and transitivity compared to FSP. Modules in FSP showed higher integration between and within themselves compared to SSP. SSP showed hubs mainly from frontal and temporal regions while in FSP were spread among frontal, temporal, parietal, subcortical areas and the left cerebellum. Implications: Results suggest the presence of widespread alterations in large scale networks between fast- and slow-speed children with recent onset epilepsies both at baseline and 2 years later. Slower processing speed appears to be a marker of abnormal brain development antecedent to epilepsy onset as well as brain development over the 2 years following diagnosis.

JAIR Journal 2017 Journal Article

DESPOT: Online POMDP Planning with Regularization

Nan Ye
Adhiraj Somani
David Hsu
Wee Sun Lee

The partially observable Markov decision process (POMDP) provides a principled general framework for planning under uncertainty, but solving POMDPs optimally is computationally intractable, due to the "curse of dimensionality" and the "curse of history". To overcome these challenges, we introduce the Determinized Sparse Partially Observable Tree (DESPOT), a sparse approximation of the standard belief tree, for online planning under uncertainty. A DESPOT focuses online planning on a set of randomly sampled scenarios and compactly captures the "execution" of all policies under these scenarios. We show that the best policy obtained from a DESPOT is near-optimal, with a regret bound that depends on the representation size of the optimal policy. Leveraging this result, we give an anytime online planning algorithm, which searches a DESPOT for a policy that optimizes a regularized objective function. Regularization balances the estimated value of a policy under the sampled scenarios and the policy size, thus avoiding overfitting. The algorithm demonstrates strong experimental results, compared with some of the best online POMDP algorithms available. It has also been incorporated into an autonomous driving system for real-time vehicle control. The source code for the algorithm is available online.

PDF Details DOI

NeurIPS Conference 2017 Conference Paper

QMDP-Net: Deep Learning for Planning under Partial Observability

Peter Karkus
David Hsu
Wee Sun Lee

This paper introduces the QMDP-net, a neural network architecture for planning under partial observability. The QMDP-net combines the strengths of model-free learning and model-based planning. It is a recurrent policy network, but it represents a policy for a parameterized set of tasks by connecting a model with a planning algorithm that solves the model, thus embedding the solution structure of planning in a network learning architecture. The QMDP-net is fully differentiable and allows for end-to-end training. We train a QMDP-net on different tasks so that it can generalize to new ones in the parameterized task set and “transfer” to other similar tasks beyond the set. In preliminary experiments, QMDP-net showed strong performance on several robotic tasks in simulation. Interestingly, while QMDP-net encodes the QMDP algorithm, it sometimes outperforms the QMDP algorithm in the experiments, as a result of end-to-end learning.

UAI Conference 2017 Conference Paper

Shortest Path under Uncertainty: Exploration versus Exploitation

Zhan Wei Lim
David Hsu
Wee Sun Lee

In the Canadian Traveler Problem (CTP), a traveler seeks a shortest path to a destination through a road network, but unknown to the traveler, some roads may be blocked. This paper studies the Bayesian CTP (BCTP), in which road states are correlated with known prior probabilities and the traveler can infer the states of an unseen road from past observations of other correlated roads. As generalized shortest-path problems, CTP and BCTP have important practical applications. We show that BCTP is NP-complete and give a polynomial-time approximation algorithm, Hedged Shortest Path under Determinization (HSPD), which approximates an optimal solution with a polylogarithmic factor. Preliminary experiments show promising results. HSPD outperforms a widely used greedy algorithm and a state-of-the-art UCT-based search algorithm for CTP, especially when significant exploration is required.

IROS Conference 2016 Conference Paper

Act to See and See to Act: POMDP planning for objects search in clutter

Jue Kun Li
David Hsu
Wee Sun Lee

We study the problem of objects search in clutter. In cluttered environments, partial occlusion among objects prevents vision systems from correctly recognizing objects. Hence, the agent needs to move objects around to gather information, which helps reduce uncertainty in perception. At the same time, the agent needs to minimize the efforts of moving objects to reduce the time required to complete the task. We model the problem as a Partially Observable Markov Decision Process (POMDP), formulating it as a problem of optimal decision making under uncertainty. By exploiting spatial constraints, we are able to adapt online POMDP planners to handle objects search problems with large state space and action space. Experiments show that the POMDP solution outperforms greedy approaches, especially in cases where multi-step manipulation is required.

ICRA Conference 2016 Conference Paper

POMDP-lite for robust robot planning under uncertainty

Min Chen 0018
Emilio Frazzoli
David Hsu
Wee Sun Lee

The partially observable Markov decision process (POMDP) provides a principled general model for planning under uncertainty. However, solving a general POMDP is computationally intractable in the worst case. This paper introduces POMDP-lite, a subclass of POMDPs in which the hidden state variables are constant or only change deterministically. We show that a POMDP-lite is equivalent to a set of fully observable Markov decision processes indexed by a hidden parameter and is useful for modeling a variety of interesting robotic tasks. We develop a simple model-based Bayesian reinforcement learning algorithm to solve POMDP-lite models. The algorithm performs well on large-scale POMDP-lite models with up to 1020 states and outperforms the state-of-the-art general-purpose POMDP algorithms. We further show that the algorithm is near-Bayesian-optimal under suitable conditions.

NeurIPS Conference 2015 Conference Paper

Adaptive Stochastic Optimization: From Sets to Paths

Zhan Wei Lim
David Hsu
Wee Sun Lee

Adaptive stochastic optimization optimizes an objective function adaptively under uncertainty. Adaptive stochastic optimization plays a crucial role in planning and learning under uncertainty, but is, unfortunately, computationally intractable in general. This paper introduces two conditions on the objective function, the marginal likelihood rate bound and the marginal likelihood bound, which enable efficient approximate solution of adaptive stochastic optimization. Several interesting classes of functions satisfy these conditions naturally, e. g. , the version space reduction function for hypothesis learning. We describe Recursive Adaptive Coverage (RAC), a new adaptive stochastic optimization algorithm that exploits these conditions, and apply it to two planning tasks under uncertainty. In constrast to the earlier submodular optimization approach, our algorithm applies to adaptive stochastic optimization algorithm over both sets and paths.

ICRA Conference 2015 Conference Paper

Intention-aware online POMDP planning for autonomous driving in a crowd

Haoyu Bai
Shaojun Cai
Nan Ye
David Hsu
Wee Sun Lee

This paper presents an intention-aware online planning approach for autonomous driving amid many pedestrians. To drive near pedestrians safely, efficiently, and smoothly, autonomous vehicles must estimate unknown pedestrian intentions and hedge against the uncertainty in intention estimates in order to choose actions that are effective and robust. A key feature of our approach is to use the partially observable Markov decision process (POMDP) for systematic, robust decision making under uncertainty. Although there are concerns about the potentially high computational complexity of POMDP planning, experiments show that our POMDP-based planner runs in near real time, at 3 Hz, on a robot golf cart in a complex, dynamic environment. This indicates that POMDP planning is improving fast in computational efficiency and becoming increasingly practical as a tool for robot planning under uncertainty.

ICAPS Conference 2015 Conference Paper

PLEASE: Palm Leaf Search for POMDPs with Large Observation Spaces

Zongzhang Zhang
David Hsu
Wee Sun Lee
Zhan Wei Lim
Aijun Bai

Trial-based asynchronous value iteration algorithms for large Partially Observable Markov Decision Processes (POMDPs), such as HSVI2, FSVI and SARSOP, have made impressive progress in the past decade. In the forward exploration phase of these algorithms, only the outcome that has the highest potential impact is searched. This paper provides a novel approach, called Palm LEAf SEarch (PLEASE), which allows the selection of more than one outcome when their potential impacts are close to the highest one. Compared with existing trial-based algorithms, PLEASE can save considerable time to propagate the bound improvements of beliefs in deep levels of the search tree to the root belief because of fewer point-based value backups. Experiments show that PLEASE scales up SARSOP, one of the fastest algorithms, by orders of magnitude on some POMDP tasks with large observation spaces.

SoCS Conference 2015 Conference Paper

PLEASE: Palm Leaf Search for POMDPs with Large Observation Spaces

Zongzhang Zhang
David Hsu
Wee Sun Lee
Zhan Wei Lim
Aijun Bai

This paper provides a novel POMDP planning method, called Palm LEAf SEarch (PLEASE), which allows the selection of more than one outcome when their potential impacts are close to the highest one during its forward exploration. Compared with existing trial-based algorithms, PLEASE can save considerable time to propagate the bound improvements of beliefs in deep levels of the search tree to the root belief because of fewer backup operations. Experiments showed that PLEASE scales up SARSOP, one of the fastest algorithms, by orders of magnitude on some POMDP tasks with large observation spaces.

IROS Conference 2015 Conference Paper

POMDP to the Rescue: Boosting Performance for Robocup Rescue

Kegui Wu
Wee Sun Lee
David Hsu

Disaster response is one of the most critical social issues and introduces quite a few research themes for the AI planning area. Robocup Rescue provides a platform to simulate the rescue process in a city when an earthquake happens. Existing methods consist of multi-agent methods that use greedy heuristics. These methods scale to large maps but suffer from volatile performance under different scenarios. In this work, we propose a planning framework to boost the performance on Robocup Rescue given several policies from the competition to be used as components. More specifically, we use an online POMDP algorithm with macro-actions and restrict it to plan within the space of tasks performed by the agents in the component policies at each time instance. Since the action space contains macro-actions of the component policies, the method is guaranteed to perform at least as well as the best component policy, and possibly better, if sufficient computation is provided. On the other hand, the restriction of the tasks to those suggested by component policies reduces the computational complexity of planning and allows the planning method to be practically applied. Experiment results show that our planner generates better performance than the best component policy for some scenarios and gives performance comparable to the best component policy for the rest.

IROS Conference 2015 Conference Paper

Towards autonomous navigation of unsignalized intersections under uncertainty of human driver intent

Volkan Sezer
Tirthankar Bandyopadhyay
Daniela Rus
Emilio Frazzoli
David Hsu

In a mixed environment of autonomous driverless vehicles and human driven vehicles operating on the same road, identifying intentions of human drivers and interacting with them in a compliant and responsible manner becomes a challenging problem for the driverless vehicles. In this paper, the problem of vehicle interaction at an intersection merging scenario is formulated as an Intention-Aware motion planning problem using the tools from Mixed Observability Markov Decision Process (MOMDP). We utilize the tools from recent intention aware planning framework to demonstrate a merging behavior in the presence of human drivers by trying to infer and act according to the intentions of the human drivers. A driver behavior model for T-junction intersections is developed in order to calculate the probabilistic state transition functions of the MOMDP model. With proposed solution, it is demonstrated that using intention aware planning improves performance in comparison to present time to merge approach by lowering accident probability and intersection navigation duration. The proposed method is tested on a real autonomous vehicle (AV) in the presence of human driven vehicles to validate our approach.

ICML Conference 2014 Conference Paper

Covering Number for Efficient Heuristic-based POMDP Planning

Zongzhang Zhang
David Hsu
Wee Sun Lee

The difficulty of POMDP planning depends on the size of the search space involved. Heuristics are often used to reduce the search space size and improve computational efficiency; however, there are few theoretical bounds on their effectiveness. In this paper, we use the covering number to characterize the size of the search space reachable under heuristics and connect the complexity of POMDP planning to the effectiveness of heuristics. With insights from the theoretical analysis, we have developed a practical POMDP algorithm, Packing-Guided Value Iteration (PGVI). Empirically, PGVI is competitive with the state-of-the-art point-based POMDP algorithms on 65 small benchmark problems and outperforms them on 4 larger problems.

NeurIPS Conference 2013 Conference Paper

DESPOT: Online POMDP Planning with Regularization

Adhiraj Somani
Nan Ye
David Hsu
Wee Sun Lee

POMDPs provide a principled framework for planning under uncertainty, but are computationally intractable, due to the “curse of dimensionality” and the “curse of history”. This paper presents an online lookahead search algorithm that alleviates these difficulties by limiting the search to a set of sampled scenarios. The execution of all policies on the sampled scenarios is summarized using a Determinized Sparse Partially Observable Tree (DESPOT), which is a sparsely sampled belief tree. Our algorithm, named Regularized DESPOT (R-DESPOT), searches the DESPOT for a policy that optimally balances the size of the policy and the accuracy on its value estimate obtained through sampling. We give an output-sensitive performance bound for all policies derived from the DESPOT, and show that R-DESPOT works well if a small optimal policy exists. We also give an anytime approximation to R-DESPOT. Experiments show strong results, compared with two of the fastest online POMDP algorithms.

ICRA Conference 2013 Conference Paper

Planning how to learn

Haoyu Bai
David Hsu
Wee Sun Lee

When a robot uses an imperfect system model to plan its actions, a key challenge is the exploration-exploitation trade-off between two sometimes conflicting objectives: (i) learning and improving the model, and (ii) immediate progress towards the goal, according to the current model. To address model uncertainty systematically, we propose to use Bayesian reinforcement learning and cast it as a partially observable Markov decision process (POMDP). We present a simple algorithm for offline POMDP planning in the continuous state space. Offline planning produces a POMDP policy, which can be executed efficiently online as a finite-state controller. This approach seamlessly integrates planning and learning: it incorporates learning objectives in the computed plan, which then enables the robot to learn nearly optimally online and reach the goal. We evaluated the approach in simulations on two distinct tasks, acrobot swing-up and autonomous vehicle navigation amidst pedestrians, and obtained interesting preliminary results.

IROS Conference 2012 Conference Paper

Autonomy for mobility on demand

Zhuang Jie Chong
Baoxing Qin
Tirthankar Bandyopadhyay
Tichakorn Wongpiromsarn
Brice Rebsamen
P. Dai
S. Kim
Marcelo H. Ang

We present an autonomous vehicle providing mobility-on-demand service in a crowded urban environment. The focus in developing the vehicle has been to attain autonomous driving with minimal sensing and low cost, off-the-shelf sensors to ensure the system's economic viability. The autonomous vehicle has successfully completed over 50 km handling numerous mobility requests during the course of multiple demonstrations. The video provides an overview of our approach, with special comments on our localization and perception modules showcasing one such request being serviced.

ICML Conference 2012 Conference Paper

Monte Carlo Bayesian Reinforcement Learning

Yi Wang 0006
Kok Sung Won
David Hsu
Wee Sun Lee

TCS Journal 2011 Journal Article

Component-based construction of bio-pathway models: The parameter estimation problem

Geoffrey Koh
David Hsu
P.S. Thiagarajan

Constructing and analyzing large biological pathway models is a significant challenge. We propose a general approach that exploits the structure of a pathway to identify pathway components, constructs the component models, and finally assembles the component models into a global pathway model. Specifically, we apply this approach to pathway parameter estimation, a main step in pathway model construction. A large biological pathway often involves many unknown parameters and the resulting high-dimensional search space poses a major computational difficulty. By exploiting the structure of a pathway and the distribution of available experimental data over the pathway, we decompose a pathway into components and perform parameter estimation for each component. However, some parameters may belong to multiple components. Independent parameter estimates from different components may be in conflict for such parameters. To reconcile these conflicts, we represent each component as a factor graph, a standard probabilistic graphical model. We then combine the resulting factor graphs and use a probabilistic inference technique called belief propagation to obtain the maximally likely parameter values that are globally consistent. We validate our approach on a synthetic pathway model based on the Akt-MAPK signaling pathways. The results indicate that the approach can potentially scale up to large pathway models.

NeurIPS Conference 2011 Conference Paper

Monte Carlo Value Iteration with Macro-Actions

Zhan Lim
Lee Sun
David Hsu

POMDP planning faces two major computational challenges: large state spaces and long planning horizons. The recently introduced Monte Carlo Value Iteration (MCVI) can tackle POMDPs with very large discrete state spaces or continuous state spaces, but its performance degrades when faced with long planning horizons. This paper presents Macro-MCVI, which extends MCVI by exploiting macro-actions for temporal abstraction. We provide sufficient conditions for Macro-MCVI to inherit the good theoretical properties of MCVI. Macro-MCVI does not require explicit construction of probabilistic models for macro-actions and is thus easy to apply in practice. Experiments show that Macro-MCVI substantially improves the performance of MCVI with suitable macro-actions.

TCS Journal 2011 Journal Article

Probabilistic approximations of ODEs based bio-pathway dynamics

Bing Liu
David Hsu
P.S. Thiagarajan

Bio-chemical networks are often modeled as systems of ordinary differential equations (ODEs). Such systems will not admit closed form solutions and hence numerical simulations will have to be used to perform analyses. However, the number of simulations required to carry out tasks such as parameter estimation can become very large. To get around this, we propose a discrete probabilistic approximation of the ODEs dynamics. We do so by discretizing the value and the time domain and assuming a distribution of initial states w. r. t. the discretization. Then we sample a representative set of initial states according to the assumed initial distribution and generate a corresponding set of trajectories through numerical simulations. Finally, using the structure of the signaling pathway we encode these trajectories compactly as a dynamic Bayesian network. This approximation of the signaling pathway dynamics has several advantages. First, the discretized nature of the approximation helps to bridge the gap between the accuracy of the results obtained by ODE simulation and the limited precision of experimental data used for model construction and verification. Second and more importantly, many interesting pathway properties can be analyzed efficiently through standard Bayesian inference techniques instead of resorting to a large number of ODE simulations. We have tested our method on ODE models of the EGF-NGF signaling pathway [1] and the segmentation clock pathway [2]. The results are very promising in terms of accuracy and efficiency.

AAAI Conference 2010 Conference Paper

Structured Parameter Elicitation

Li Ling Ko
David Hsu
Wee Sun Lee
Sylvie Ong

The behavior of a complex system often depends on parameters whose values are unknown in advance. To operate effectively, an autonomous agent must actively gather information on the parameter values while progressing towards its goal. We call this problem parameter elicitation. Partially observable Markov decision processes (POMDPs) provide a principled framework for such uncertainty planning tasks, but they suffer from high computational complexity. However, POMDPs for parameter elicitation often possess special structural properties, specifically, factorization and symmetry. This work identifies these properties and exploits them for efficient solution through a factored belief representation. The experimental results show that our new POMDP solvers outperform SARSOP and MOMDP, two of the fastest general-purpose POMDP solvers available, and can handle significantly larger problems.

ICRA Conference 2008 Conference Paper

A point-based POMDP planner for target tracking

David Hsu
Wee Sun Lee
Nan Rong

Target tracking has two variants that are often studied independently with different approaches: target searching requires a robot to find a target initially not visible, and target following requires a robot to maintain visibility on a target initially visible. In this work, we use a partially observable Markov decision process (POMDP) to build a single model that unifies target searching and target following. The POMDP solution exhibits interesting tracking behaviors, such as anticipatory moves that exploit target dynamics, informationgathering moves that reduce target position uncertainty, and energy-conserving actions that allow the target to get out of sight, but do not compromise long-term tracking performance. To overcome the high computational complexity of solving POMDPs, we have developed SARSOP, a new point-based POMDP algorithm based on successively approximating the space reachable under optimal policies. Experimental results show that SARSOP is competitive with the fastest existing pointbased algorithm on many standard test problems and faster by many times on some.

NeurIPS Conference 2007 Conference Paper

What makes some POMDP problems easy to approximate?

Wee Lee
Nan Rong
David Hsu

Point-based algorithms have been surprisingly successful in computing approx- imately optimal solutions for partially observable Markov decision processes (POMDPs) in high dimensional belief spaces. In this work, we seek to understand the belief-space properties that allow some POMDP problems to be approximated efﬁciently and thus help to explain the point-based algorithms’ success often ob- served in the experiments. We show that an approximately optimal POMDP so- lution can be computed in time polynomial in the covering number of a reachable belief space, which is the subset of the belief space reachable from a given belief point. We also show that under the weaker condition of having a small covering number for an optimal reachable space, which is the subset of the belief space reachable under an optimal policy, computing an approximately optimal solution is NP-hard. However, given a suitable set of points that “cover” an optimal reach- able space well, an approximate solution can be computed in polynomial time. The covering number highlights several interesting properties that reduce the com- plexity of POMDP planning in practice, e. g. , fully observed state variables, beliefs with sparse support, smooth beliefs, and circulant state-transition matrices.

ICRA Conference 2006 Conference Paper

A Greedy Strategy for Tracking a Locally Predictable Target among Obstacles

Tirthankar Bandyopadhyay
Yuanping Li
Marcelo H. Ang
David Hsu

Target tracking among obstacles is an interesting class of motion planning problems that combine the usual motion constraints with robot sensors' visibility constraints. In this paper, we introduce the notion of vantage time and use it to formulate a risk function that evaluates the robot's advantage in maintaining the visibility constraint against the target. Local minimization of the risk function leads to a greedy tracking strategy. We also use simple velocity prediction on the target to further improve tracking performance. We compared our new strategy with earlier work in extensive simulation experiments and obtained much improved results

ICRA Conference 2006 Conference Paper

Multi-level Free-space Dilation for Sampling narrow Passages in PRM Planning

David Hsu
Gildardo Sánchez-Ante
Ho-Lun Cheng
Jean-Claude Latombe

Free-space dilation is an effective approach for narrow passage sampling, a well-recognized difficulty in probabilistic roadmap (PRM) planning. Key to this approach are methods for dilating the free space and for determining the amount of dilation needed. This paper presents a new method of dilation by shrinking the geometric models of robots and obstacles. Compared with existing work, the new method is more efficient in both running time and memory usage. It is also integrated with collision checking, a key operation in PRM planning. The efficiency of the dilation method enables a new PRM planner which quickly constructs a series of dilated free spaces and automatically determine the amount of dilation needed. Experiments show that both the dilation method and the planner work well in complex geometric environments. In particular, the planner reliably solved the most difficult version of the alpha puzzle, a benchmark test for PRM planners

ICRA Conference 2005 Conference Paper

Hybrid PRM Sampling with a Cost-Sensitive Adaptive Strategy

David Hsu
Gildardo Sánchez-Ante
Zheng Sun

A number of advanced sampling strategies have been proposed in recent years to address the narrow passage problem for probabilistic roadmap (PRM) planning. These sampling strategies all have unique strengths, but none of them solves the problem completely. In this paper, we present a general and systematic approach for adaptively combining multiple sampling strategies so that their individual strengths are preserved. We have performed experiments with this approach on robots with up to 12 degrees of freedom in complex 3-D environments. Experiments show that although the performance of individual sampling strategies varies across different environments, the adaptive hybrid sampling strategies constructed with this approach perform consistently well in all environments. Further, we show that, under reasonable assumptions, the adaptive strategies are provably competitive against all individual strategies used.

IROS Conference 2004 Conference Paper

Workspace importance sampling for probabilistic roadmap planning

Hanna Kurniawati
David Hsu

Probabilistic roadmap (PRM) planners have been successful in path planning of robots with many degrees of freedom, but they behave poorly when a robot's configuration space contains narrow passages. This paper presents workspace importance sampling (WIS), a new sampling strategy for PRM planning. Our main idea is to use geometric information from a robot's workspace as "importance" values to guide sampling in the corresponding configuration space. By doing so, WIS increases the sampling density in narrow passages and decreases the sampling density in wide-open regions. We tested the new planner on rigid-body and articulated robots in 2-D and 3-D environments. Experimental results show that WIS improves the planner's performance for path planning problems with narrow passages.

ICRA Conference 2003 Conference Paper

The bridge test for sampling narrow passages with probabilistic roadmap planners

David Hsu
Tingting Jiang 0001
John H. Reif
Zheng Sun 0002

Probabilistic roadmap (PRM) planners have been successful in path planning of robots with many degrees of freedom, but narrow passages in a robot's configuration space create significant difficulty for PRM planners. This paper presents a hybrid sampling strategy in the PRM framework for finding paths through narrow passages. A key ingredient of the new strategy is the bridge test, which boosts the sampling density inside narrow passages. The bridge test relies on simple tests of local geometry and can be implemented efficiently in high-dimensional configuration spaces. The strengths of the bridge test and uniform sampling complement each other naturally and are combined to generate the final hybrid sampling strategy. Our planner was tested on point robots and articulated robots in planar workspaces. Preliminary experiments show that the hybrid sampling strategy enables relatively small roadmaps to reliably capture the connectivity of configuration spaces with difficult narrow passages.

NeurIPS Conference 2002 Conference Paper

Adaptive Quantization and Density Estimation in Silicon

David Hsu
Seth Bridges
Miguel Figueroa
Chris Diorio

We present the bump mixture model, a statistical model for analog data where the probabilistic semantics, inference, and learning rules derive from low-level transistor behavior. The bump mixture model relies on translinear circuits to perform probabilistic infer- ence, and floating-gate devices to perform adaptation. This system is low power, asynchronous, and fully parallel, and supports vari- ous on-chip learning algorithms. In addition, the mixture model can perform several tasks such as probability estimation, vector quanti- zation, classification, and clustering. We tested a fabricated system on clustering, quantization, and classification of handwritten digits and show performance comparable to the E-M algorithm on mix- tures of Gaussians.

NeurIPS Conference 2002 Conference Paper

Field-Programmable Learning Arrays

Seth Bridges
Miguel Figueroa
Chris Diorio
David Hsu

This paper introduces the Field-Programmable Learning Array, a new paradigm for rapid prototyping of learning primitives and machine- learning algorithms in silicon. The FPLA is a mixed-signal counterpart to the all-digital Field-Programmable Gate Array in that it enables rapid prototyping of algorithms in hardware. Unlike the FPGA, the FPLA is targeted directly for machine learning by providing local, parallel, on- line analog learning using ﬂoating-gate MOS synapse transistors. We present a prototype FPLA chip comprising an array of reconﬁgurable computational blocks and local interconnect. We demonstrate the via- bility of this architecture by mapping several learning circuits onto the prototype chip.

ICRA Conference 2001 Conference Paper

Disconnection Proofs for Motion Planning

Julien Basch
Leonidas J. Guibas
David Hsu
An Thai Nguyen

Probabilistic road-map (PRM) planners have shown great promise in attacking previously infeasible motion planning problems with many degrees of freedom. Yet when such a planner fails to find a path, it is not clear that no path exists, or that the planner simply did not sample adequately or intelligently the free part of the configuration space. We propose to attack the motion planning problem from the other end, focusing on disconnection proofs, or proofs showing that there exists no solution to the posed motion planning problem. Just as PRM planners avoid generating a complete description of the configuration space, our disconnection provers search for certain special classes of proofs that are compact and easy to find when the motion planning problem is 'obviously impossible, " avoiding complex geometric and combinatorial calculations. We demonstrate such a prover in action for a simple, yet still realistic, motion planning problem. When it fails, the prover suggests key milestones, or configurations of the robot that can then be passed on and used by a PRM planner. Thus by hitting the motion planning problem from both ends, we hope to resolve the existence of a path, except in truly delicate border-line situations.

NeurIPS Conference 2001 Conference Paper

Learning Spike-Based Correlations and Conditional Probabilities in Silicon

Aaron Shon
David Hsu
Chris Diorio

Abstract Unavailable

NeurIPS Conference 2000 Conference Paper

A Silicon Primitive for Competitive Learning

David Hsu
Miguel Figueroa
Chris Diorio

Competitive learning is a technique for training classification and clustering networks. We have designed and fabricated an 11- transistor primitive, that we term an automaximizing bump circuit, that implements competitive learning dynamics. The circuit per(cid: 173) forms a similarity computation, affords nonvolatile storage, and implements simultaneous local adaptation and computation. We show that our primitive is suitable for implementing competitive learning in VLSI, and demonstrate its effectiveness in a standard clustering task.

ICRA Conference 2000 Conference Paper

Kinodynamic Motion Planning Amidst Moving Obstacles

Robert Kindel
David Hsu
Jean-Claude Latombe
Stephen M. Rock

This paper presents a randomized motion planner for kinodynamic asteroid avoidance problems, in which a robot must avoid collision with moving obstacles under kinematic, dynamic constraints and reach a specified goal state. Inspired by probabilistic-roadmap (PRM) techniques, the planner samples the state x time space of a robot by picking control inputs at random in order to compute a roadmap that captures the connectivity of the space. However, the planner does not precompute a roadmap as most PRM planners do. Instead, for each planning query, it generates, on the fly, a small roadmap that connects the given initial and goal state. In contrast to PRM planners, the roadmap computed by our algorithm is a directed graph oriented along the time axis of the space. To verify the planner's effectiveness in practice, we tested it both in simulated environments containing many moving obstacles and on a real robot under strict dynamic constraints. The efficiency of the planner makes it possible for a robot to respond to a changing environment without knowing the motion of moving obstacles well in advance.

ICRA Conference 1997 Conference Paper

Path planning in expansive configuration spaces

David Hsu
Jean-Claude Latombe
Rajeev Motwani 0001

We introduce the notion of expansiveness to characterize a family of robot configuration spaces whose connectivity can be effectively captured by a roadmap of randomly-sampled milestones. The analysis of expansive configuration spaces has inspired us to develop a new randomized planning algorithm. This algorithm tries to sample only the portion of the configuration space that is relevant to the current query, avoiding the cost of precomputing a roadmap for the entire configuration space. Thus, it is well-suited for problems where a single query is submitted for a given environment. The algorithm has been implemented and successfully applied to complex assembly maintainability problems from the automotive industry.