Author name cluster

Andrew Silva

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers

2 author rows

ICLR Conference 2025 Conference Paper

ReGen: Generative Robot Simulation via Inverse Design

Phat Nguyen
Tsun-Hsuan Wang
Zhang-Wei Hong
Erfan Aasi
Andrew Silva
Guy Rosman
Sertac Karaman
Daniela Rus

Simulation plays a key role in scaling robot learning and validating policies, but constructing simulations remains labor-intensive. In this paper, we introduce ReGen, a generative simulation framework that automates this process using inverse design. Given an agent's behavior (such as a motion trajectory or objective function) and its textual description, we infer the underlying scenarios and environments that could have caused the behavior. Our approach leverages large language models to construct and expand a graph that captures cause-and-effect relationships and relevant entities with properties in the environment, which is then processed to configure a robot simulation environment. Our approach supports (i) augmenting simulations based on ego-agent behaviors, (ii) controllable, counterfactual scenario generation, (iii) reasoning about agent cognition and mental states, and (iv) reasoning with distinct sensing modalities, such as braking due to faulty GPS signals. We demonstrate our method in autonomous driving and robot manipulation tasks, generating more diverse, complex simulated environments compared to existing simulations with high success rates, and enabling controllable generation for corner cases. This approach enhances the validation of robot policies and supports data or simulation augmentation, advancing scalable robot learning for improved generalization and robustness.

JAAMAS Journal 2022 Journal Article

Multi-UAV planning for cooperative wildfire coverage and tracking with quality-of-service guarantees

Esmaeil Seraj
Andrew Silva
Matthew Gombolay

In recent years, teams of robots and Unmanned Aerial Vehicles (UAVs) have been commissioned by researchers to enable accurate, online wildfire coverage and tracking. While the majority of prior work focuses on the coordination and control of such multi-robot systems, to date, these UAV teams have not been given the ability to reason about a fire’s track (i.e., location and propagation dynamics) to provide a performance guarantee over a time horizon. Motivated by the problem of aerial wildfire monitoring, we propose a predictive framework which enables cooperation in multi-UAV teams towards collaborative field coverage and fire tracking with a probabilistic performance guarantee. Our approach enables UAVs to infer the latent fire propagation dynamics for time-extended coordination in safety-critical conditions. We derive a set of novel, analytical temporal, and tracking-error bounds to enable the UAV team to distribute their limited resources and cover the entire fire area according to the case-specific estimated states and provide a probabilistic performance guarantee. Our results are not limited to the aerial wildfire monitoring case study and are generally applicable to problems such as search-and-rescue, target tracking, and border patrol. We evaluate our approach in simulation and provide demonstrations of the proposed framework on a physical multi-robot testbed to account for real robot dynamics and restrictions. Our quantitative evaluations validate the performance of our method, accumulating $7.5\times$ and $9.0\times$ smaller tracking error than state-of-the-art model-based and reinforcement learning benchmarks, respectively.

AAAI Conference 2021 Conference Paper

Encoding Human Domain Knowledge to Warm Start Reinforcement Learning

Andrew Silva
Matthew Gombolay

Deep reinforcement learning has been successful in a variety of tasks, such as game playing and robotic manipulation. However, attempting to learn tabula rasa disregards the logical structure of many domains as well as the wealth of readily available knowledge from domain experts that could help “warm start” the learning process. We present a novel reinforcement learning technique that allows for intelligent initialization of a neural network's weights and architecture. Our approach permits encoding domain knowledge directly into a neural decision tree, and improves upon that knowledge with policy gradient updates. We empirically validate our approach on two OpenAI Gym tasks and two modified StarCraft 2 tasks, showing that our novel architecture outperforms multilayer-perceptron and recurrent architectures. Our knowledge-based framework finds superior policies compared to imitation learning-based and prior knowledge-based approaches. Importantly, we demonstrate that our approach can be used by untrained humans to initially provide > 80% increase in expected reward relative to baselines prior to training (p 60% increase in expected reward after policy optimization (p = 0.011).

NeurIPS Conference 2020 Conference Paper

Interpretable and Personalized Apprenticeship Scheduling: Learning Interpretable Scheduling Policies from Heterogeneous User Demonstrations

Rohan Paleja
Andrew Silva
Letian Chen
Matthew Gombolay

Resource scheduling and coordination is an NP-hard optimization requiring an efficient allocation of agents to a set of tasks with upper- and lower bound temporal and resource constraints. Due to the large-scale and dynamic nature of resource coordination in hospitals and factories, human domain experts manually plan and adjust schedules on the fly. To perform this job, domain experts leverage heterogeneous strategies and rules-of-thumb honed over years of apprenticeship. What is critically needed is the ability to extract this domain knowledge in a heterogeneous and interpretable apprenticeship learning framework to scale beyond the power of a single human expert, a necessity in safety-critical domains. We propose a personalized and interpretable apprenticeship scheduling algorithm that infers an interpretable representation of all human task demonstrators by extracting decision-making criteria via an inferred, personalized embedding non-parametric in the number of demonstrator types. We achieve near-perfect LfD accuracy in synthetic domains and 88.22\% accuracy on a planning domain with real-world data, outperforming baselines. Finally, our user study showed our methodology produces more interpretable and easier-to-use models than neural networks ($p < 0.05$).

AAMAS Conference 2019 Conference Paper

Unsupervised Role Discovery Using Temporal Observations of Agents

Andrew Silva
Sonia Chernova

Agent-based modeling of multi-agent systems has enormous potential with applications in modeling social, economic, medical and other application domains containing temporal data. We propose an unsupervised approach to discovering common roles by observing agents over time, allowing us to construct a role-based representation of multi-agent systems that aids in understanding and interpreting the state of the system. We validate our approach on both a soccer and a StarCraft dataset, and show that unsupervised role discovery through observation can provide meaningful insight into the state of a multi-agent system, aiding or even replacing game state data for interpretation or understanding of the system.
