Author name cluster

Michael Lewis

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

10 papers

1 author row

NeurIPS Conference 2024 Conference Paper

Language Grounded Multi-agent Reinforcement Learning with Human-interpretable Communication

Huao Li
Hossein N. Mahjoub
Behdad Chalaki
Vaishnav Tadiparthi
Kwonjoon Lee
Ehsan Moradi-Pari
Michael Lewis
Katia Sycara

Multi-Agent Reinforcement Learning (MARL) methods have shown promise in enabling agents to learn a shared communication protocol from scratch and accomplish challenging team tasks. However, the learned language is usually not interpretable to humans or other agents not co-trained together, limiting its applicability in ad-hoc teamwork scenarios. In this work, we propose a novel computational pipeline that aligns the communication space between MARL agents with an embedding space of human natural language by grounding agent communications on synthetic data generated by embodied Large Language Models (LLMs) in interactive teamwork scenarios. Our results demonstrate that introducing language grounding not only maintains task performance but also accelerates the emergence of communication. Furthermore, the learned communication protocols exhibit zero-shot generalization capabilities in ad-hoc teamwork scenarios with unseen teammates and novel task states. This work presents a significant step toward enabling effective communication and collaboration between artificial agents and humans in real-world teamwork settings.

JBHI Journal 2021 Journal Article

A Vaccination Simulator for COVID-19: Effective and Sterilizing Immunization Cases

Aknur Karabay
Askat Kuzdeuov
Shyryn Ospanova
Michael Lewis
Huseyin Atakan Varol

In this work, we present a particle-based SEIR epidemic simulator as a tool to assess the impact of different vaccination strategies on viral propagation and to model sterilizing and effective immunization outcomes. The simulator includes modules to support contact tracing of the interactions amongst individuals and epidemiological testing of the general population. The particles are distinguished by age to represent more accurately the infection and mortality rates. The tool can be calibrated by region of interest and for different vaccination strategies to enable locality-sensitive virus mitigation policy measures and resource allocation. Moreover, the vaccination policy can be simulated based on the prioritization of certain age groups or on random vaccination of individuals across all age groups. The results based on the experience of the province of Lecco, Italy, indicate that the simulator can evaluate vaccination strategies in a way that incorporates local circumstances of viral propagation and demographic susceptibilities. Further, the simulator models the distinction between sterilizing immunization, where immunized people are no longer contagious, and effective immunization, where individuals can transmit the virus even after getting immunized. The parametric simulation results showed that the sterilizing-age-based vaccination scenario results in the least number of deaths. Furthermore, they revealed that older people should be vaccinated first to decrease the overall mortality rate. Also, the results showed that as the vaccination rate increases, the difference in mortality rate between the scenarios shrinks.

NeurIPS Conference 2021 Conference Paper

Emergent Discrete Communication in Semantic Spaces

Mycal Tucker
Huao Li
Siddharth Agrawal
Dana Hughes
Katia Sycara
Michael Lewis
Julie A Shah

Neural agents trained in reinforcement learning settings can learn to communicate among themselves via discrete tokens, accomplishing as a team what agents would be unable to do alone. However, the current standard of using one-hot vectors as discrete communication tokens prevents agents from acquiring more desirable aspects of communication such as zero-shot understanding. Inspired by word embedding techniques from natural language processing, we propose neural agent architectures that enable them to communicate via discrete tokens derived from a learned, continuous space. We show in a decision-theoretic framework that our technique optimizes communication over a wide range of scenarios, whereas one-hot tokens are only optimal under restrictive assumptions. In self-play experiments, we validate that our trained agents learn to cluster tokens in semantically meaningful ways, allowing them to communicate in noisy environments where other techniques fail. Lastly, we demonstrate both that agents using our method can effectively respond to novel human communication and that humans can understand unlabeled emergent agent communication, outperforming the use of one-hot communication.

JBHI Journal 2020 Journal Article

A Network-Based Stochastic Epidemic Simulator: Controlling COVID-19 With Region-Specific Policies

Askat Kuzdeuov
Daulet Baimukashev
Aknur Karabay
Bauyrzhan Ibragimov
Almas Mirzakhmetov
Mukhamet Nurpeiissov
Michael Lewis
Huseyin Atakan Varol

In this work, we present an open-source stochastic epidemic simulator calibrated with extant epidemic experience of COVID-19. The simulator models a country as a network representing each node as an administrative region. The transportation connections between the nodes are modeled as the edges of this network. Each node runs a Susceptible-Exposed-Infected-Recovered (SEIR) model and population transfer between the nodes is considered using the transportation networks which allows modeling of the geographic spread of the disease. The simulator incorporates information ranging from population demographics and mobility data to health care resource capacity, by region, with interactive controls of system variables to allow dynamic and interactive modeling of events. The single-node simulator was validated using the thoroughly reported data from Lombardy, Italy. Then, the epidemic situation in Kazakhstan as of 31 May 2020 was accurately recreated. Afterward, we simulated a number of scenarios for Kazakhstan with different sets of policies. We also demonstrate the effects of region-based policies such as transportation limitations between administrative units and the application of different policies for different regions based on the epidemic intensity and geographic location. The results show that the simulator can be used to estimate outcomes of policy options to inform deliberations on governmental interdiction policies.
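
The structure described in this abstract (one SEIR model per region, with population moving along transportation edges) can be sketched roughly as follows. This is a simplified, deterministic illustration and not the authors' stochastic simulator; the function names, the rates `beta`, `sigma`, `gamma`, and the per-edge travel fractions are all assumptions made for the example.

```python
def seir_step(state, beta, sigma, gamma):
    """Advance one region's (S, E, I, R) counts by one day (deterministic sketch)."""
    s, e, i, r = state
    n = s + e + i + r
    new_exposed = beta * s * i / n if n > 0 else 0.0  # S -> E
    new_infected = sigma * e                          # E -> I
    new_recovered = gamma * i                         # I -> R
    return (s - new_exposed,
            e + new_exposed - new_infected,
            i + new_infected - new_recovered,
            r + new_recovered)


def migrate(states, edges):
    """Move a fixed fraction of every compartment along each directed edge.

    `edges` maps (src, dst) to the assumed daily fraction of src's
    population that travels to dst.
    """
    moved = {node: list(state) for node, state in states.items()}
    for (src, dst), frac in edges.items():
        for k in range(4):
            flow = frac * states[src][k]
            moved[src][k] -= flow
            moved[dst][k] += flow
    return {node: tuple(values) for node, values in moved.items()}


def simulate(states, edges, days, beta=0.3, sigma=0.2, gamma=0.1):
    """Alternate a local SEIR update and a migration step for `days` days."""
    for _ in range(days):
        states = {node: seir_step(st, beta, sigma, gamma)
                  for node, st in states.items()}
        states = migrate(states, edges)
    return states


if __name__ == "__main__":
    # Two hypothetical regions; only region A starts with infections.
    start = {"A": (990.0, 0.0, 10.0, 0.0), "B": (1000.0, 0.0, 0.0, 0.0)}
    links = {("A", "B"): 0.01, ("B", "A"): 0.01}
    for region, state in simulate(start, links, days=30).items():
        print(region, [round(x, 1) for x in state])
```

A region-specific policy of the kind the abstract mentions could be approximated in this sketch by lowering `beta` for one region or by shrinking the travel fraction on selected edges.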

AAMAS Conference 2019 Conference Paper

Trust-Aware Behavior Reflection for Robot Swarm Self-Healing

Rui Liu
Fan Jia
Wenhao Luo
Meghan Chandarana
Changjoo Nam
Michael Lewis
Katia Sycara

The deployment of robot swarms is influenced by real-world factors, such as motor issues, sensor failure, and wind disturbances. These factors cause the appearance of faulty robots. In a decentralized swarm, sharing incorrect information from faulty robots will lead to undesired swarm behaviors, such as swarm disconnection and incorrect heading directions. We envision a system where a human operator is exerting supervisory control over a remote swarm by indicating changes in trust to the swarm via a "trust-signal". By correcting faulty behaviors, trust between the human and the swarm is maintained to facilitate human-swarm cooperation. In this research, a trust-aware behavior reflection method – Trust-R – is designed based on a weighted mean subsequence reduced algorithm (WMSR). By using Trust-R, detected faulty behaviors are automatically corrected by the swarm in a decentralized fashion by referring to the motion status of their trusted neighbors and isolating failed robots from the others. Based on real-world scenarios, three types of robot faults – degraded performance caused by motor wear, abnormal motion caused by system uncertainty and motion deviation caused by an external disturbance such as wind – were simulated to test the effectiveness of Trust-R. Results show that Trust-R is effective in correcting swarm behaviors for swarm self-healing.
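
For readers unfamiliar with the WMSR family of updates that the abstract builds on, the following is a minimal sketch of one such step for a single robot. It is not the Trust-R method itself: the fault bound `f`, the uniform weights, and the use of a plain scalar state (for example a heading, ignoring angle wraparound) are assumptions made for illustration.

```python
def wmsr_update(own, neighbor_values, f):
    """One mean-subsequence-reduced step with uniform weights.

    Among neighbor values strictly larger than `own`, the f largest are
    discarded (all of them if there are fewer than f); the same is done for
    values strictly smaller than `own`. The new state is the average of the
    robot's own value and the values that remain.
    """
    larger = sorted(v for v in neighbor_values if v > own)
    smaller = sorted(v for v in neighbor_values if v < own)
    equal = [v for v in neighbor_values if v == own]

    kept_larger = larger[:-f] if f > 0 else larger  # drop the f largest
    kept_smaller = smaller[f:]                      # drop the f smallest

    kept = [own] + equal + kept_larger + kept_smaller
    return sum(kept) / len(kept)


if __name__ == "__main__":
    # A hypothetical robot with state 10.0 and one wildly wrong neighbor (90.0).
    print(wmsr_update(10.0, [9.0, 11.0, 10.5, 90.0], f=1))
```

With `f=1` the outlying value 90.0 is ignored, so a single faulty neighbor cannot drag the update arbitrarily far; with `f=0` the step reduces to an ordinary average over all neighbors.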

AAMAS Conference 2012 Conference Paper

A cognitive architecture for emergency response

Felipe Meneguzzi
Siddharth Mehrotra
James Tittle
Jean Oh
Nilanjan Chakraborty
Katia Sycara
Michael Lewis

Plan recognition, cognitive workload estimation and human assistance have been extensively studied in the AI and human factors communities, but have seldom been integrated and evaluated as complete systems. In this paper, we develop an assistant agent architecture integrating plan recognition, current and future user information needs, workload estimation and adaptive information presentation to aid an emergency response manager in making high-quality decisions under time stress, while avoiding cognitive overload. We describe its main components as well as results for an experiment simulating various possible executions of the emergency response plans used in the real world, comparing reaction time of an assisted versus an unassisted human.

IS Journal 2009 Journal Article

Scaling Up Wide-Area-Search-Munition Teams

Michael Lewis
Katia Sycara
Paul Scerri

Wide area search munitions (WASMs) are a cross between an unmanned aerial vehicle (UAV) and a munition. The first of these high-concept munitions, the Low-Cost Autonomous Attack System (LoCAAS), was a miniature, autonomous WASM capable of broad-area search, identification, and destruction of a range of mobile ground targets. The LoCAAS used a small turbojet engine capable of powering the vehicle for up to 30 minutes and laser radar (ladar) with automatic target recognition to identify potential targets. The original LoCAAS was a fire-and-forget munition designed to operate independently. It flew preprogrammed search patterns until it located a target or ran out of fuel.

AAMAS Conference 2008 Conference Paper

An Approach to Online Optimization of Heuristic Coordination Algorithms

Jumpol Polvichai
Paul Scerri
Michael Lewis

Due to computational intractability, large-scale coordination algorithms are necessarily heuristic and hence require tuning for particular environments. In domains where characteristics of the environment vary dramatically from scenario to scenario, it is desirable to have automated techniques for appropriately configuring the coordination. This paper presents an approach that takes performance data from a simulator to train a stochastic neural network that concisely models the complex, probabilistic relationship between configurations, environments and performance metrics. The stochastic neural network is used as the core of a tool that allows rapid online or offline configuration of coordination algorithms to particular scenarios and user preferences. The overall system allows rapid adaptation of coordination, leading to better performance in new scenarios.

AAMAS Conference 2006 Conference Paper

Comparing Market and Token-Based Coordination

Yang Xu
Paul Scerri
Katia Sycara
Michael Lewis

AAMAS Conference 2006 Conference Paper

Scalable and Reliable Data Delivery in Mobile Ad Hoc Sensor Networks

Bin Yu
Paul Scerri
Katia Sycara
Yang Xu
Michael Lewis