Arrow Research search

Author name cluster

Adam Eck

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

13 papers
2 author rows

Possible papers


UAI Conference 2025 Conference Paper

MOHITO: Multi-Agent Reinforcement Learning using Hypergraphs for Task-Open Systems

  • Gayathri Anil
  • Prashant Doshi
  • Daniel Redder
  • Adam Eck
  • Leen-Kiat Soh

Open agent systems are prevalent in the real world, where the sets of agents and tasks change over time. In this paper, we focus on task-open multi-agent systems, exemplified by applications such as ridesharing, where passengers (tasks) appear spontaneously over time and disappear if not attended to promptly. Task-open settings challenge us with an action space that changes dynamically, which renders existing reinforcement learning (RL) methods, intended for fixed state and action spaces, inapplicable. Whereas multi-task learning approaches learn policies generalized to multiple known and related tasks, they struggle to adapt to previously unseen tasks. Conversely, lifelong learning adapts to new tasks over time, but generally assumes that tasks come sequentially from a static and known distribution rather than simultaneously and unpredictably. We introduce a novel category of RL for addressing task openness, modeled using a task-open Markov game. Our approach, MOHITO, is a multi-agent actor-critic schema that represents knowledge about the relationships among agents and changing tasks and actions as dynamically evolving 3-uniform hypergraphs. As popular multi-agent RL testbeds do not exhibit task openness, we evaluate MOHITO on two realistic and naturally task-open domains to establish its efficacy and provide a benchmark for future work in this setting.
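The abstract does not detail MOHITO's actual hyperedge structure; as a rough, hypothetical sketch, a 3-uniform hypergraph over agents, tasks, and actions (with tasks arriving and expiring, as in ridesharing) might be represented like this. All names and structure here are illustrative assumptions, not the paper's implementation:

```python
from dataclasses import dataclass, field

# Hypothetical sketch: a 3-uniform hypergraph whose hyperedges each connect
# one agent node, one task node, and one action node.
@dataclass
class TaskOpenHypergraph:
    edges: set = field(default_factory=set)  # set of (agent, task, action) triples

    def add_task(self, task, agents, actions):
        # A newly arrived task adds hyperedges linking it to each agent/action pair.
        for agent in agents:
            for action in actions:
                self.edges.add((agent, task, action))

    def remove_task(self, task):
        # An expired task (e.g., an unserved ride request) removes every
        # hyperedge it participates in.
        self.edges = {e for e in self.edges if e[1] != task}

    def neighbors(self, agent):
        # Tasks currently reachable by an agent.
        return {t for (a, t, _) in self.edges if a == agent}

g = TaskOpenHypergraph()
g.add_task("ride_7", agents=["car_1", "car_2"], actions=["accept", "ignore"])
```

The point of the structure is that openness becomes a graph edit (adding or deleting hyperedges) rather than a change to a fixed action space.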

AAAI Conference 2022 Conference Paper

An Algorithmic Introduction to Savings Circles

  • Rediet Abebe
  • Adam Eck
  • Christian Ikeokwu
  • Sam Taggart

Rotating savings and credit associations (roscas) are informal financial organizations common in settings where communities have reduced access to formal financial institutions. In a rosca, a fixed group of participants regularly contribute sums of money to a pot. This pot is then allocated periodically using lottery, aftermarket, or auction mechanisms. Roscas are empirically well-studied in economics. They are, however, challenging to study theoretically due to their dynamic nature. Typical economic analyses of roscas stop at coarse ordinal welfare comparisons to other credit allocation mechanisms, leaving much of roscas’ ubiquity unexplained. In this work, we take an algorithmic perspective on the study of roscas. Building on techniques from the price of anarchy literature, we present worst-case welfare approximation guarantees. We further experimentally compare the welfare of outcomes as key features of the environment vary. These cardinal welfare analyses further rationalize the prevalence of roscas. We conclude by discussing several other promising avenues.
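The mechanisms above are described abstractly; a minimal, hypothetical simulation of the lottery variant (function names and parameters are illustrative, not from the paper) could look like:

```python
import random

def simulate_rosca(participants, contribution, rounds=None, seed=0):
    """Minimal sketch of a lottery rosca: each round every participant pays
    a fixed contribution into the pot, and a uniformly random participant
    who has not yet won receives the whole pot. Illustrative only."""
    rng = random.Random(seed)
    rounds = rounds if rounds is not None else len(participants)
    pending = list(participants)   # participants who have not yet received the pot
    allocation = {}
    for _ in range(rounds):
        pot = contribution * len(participants)
        winner = rng.choice(pending)
        pending.remove(winner)
        allocation[winner] = pot
    return allocation

alloc = simulate_rosca(["ann", "ben", "cal"], contribution=10)
```

Over a full cycle each participant pays in exactly what they receive; what the lottery changes is *when* each participant gets access to the lump sum, which is where the welfare analysis has bite.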

UAI Conference 2022 Conference Paper

Decision-theoretic planning with communication in open multiagent systems

  • Anirudh Kakarlapudi
  • Gayathri Anil
  • Adam Eck
  • Prashant Doshi
  • Leen-Kiat Soh

In open multiagent systems, the set of agents operating in the environment changes over time and in ways that are nontrivial to predict. For example, if collaborative robots were tasked with fighting wildfires, they may run out of suppressants and be temporarily unavailable to assist their peers. Because an agent’s optimal action depends on the actions of others, each agent must not only predict the actions of its peers, but, before that, reason whether they are even present to perform an action. Addressing openness thus requires agents to model each other’s presence, which can be enhanced through agents communicating about their presence in the environment. At the same time, communicative acts can also incur costs (e.g., consuming limited bandwidth), and thus an agent must trade off the benefits of enhanced coordination with the costs of communication. We present a new principled, decision-theoretic method in the context provided by the recent communicative interactive POMDP framework for planning in open agent settings that balances this tradeoff. Simulations of multiagent wildfire suppression problems demonstrate how communication can improve planning in open agent environments, as well as how agents trade off the benefits and costs of communication under different scenarios.

AAAI Conference 2020 Conference Paper

Scalable Decision-Theoretic Planning in Open and Typed Multiagent Systems

  • Adam Eck
  • Maulik Shah
  • Prashant Doshi
  • Leen-Kiat Soh

In open agent systems, the set of agents that are cooperating or competing changes over time and in ways that are nontrivial to predict. For example, if collaborative robots were tasked with fighting wildfires, they may run out of suppressants and be temporarily unavailable to assist their peers. We consider the problem of planning in these contexts with the additional challenges that the agents are unable to communicate with each other and that there are many of them. Because an agent’s optimal action depends on the actions of others, each agent must not only predict the actions of its peers, but, before that, reason whether they are even present to perform an action. Addressing openness thus requires agents to model each other’s presence, which becomes computationally intractable with high numbers of agents. We present a novel, principled, and scalable method in this context that enables an agent to reason about others’ presence in its shared environment and their actions. Our method extrapolates models of a few peers to the overall behavior of the many-agent system, and combines this with a generalization of Monte Carlo tree search to perform individual agent reasoning in many-agent open environments. Theoretical analyses establish the number of agents to model in order to achieve acceptable worst-case bounds on extrapolation error, as well as regret bounds on the agent’s utility from modeling only some neighbors. Simulations of multiagent wildfire suppression problems demonstrate our approach’s efficacy compared with alternative baselines.

AAMAS Conference 2016 Conference Paper

Collaborative Human Task Assignment for Open Systems (Extended Abstract)

  • Bin Chen
  • Adam Eck
  • Leen-Kiat Soh

Through gathering information, acting autonomously, learning, and behaving socially, intelligent agents provide useful interfaces between complex systems and human users. For example, agents can interact with people to discover their preferences, skills, and expertise, then find suitable tasks that exploit the users’ abilities. We describe modeling environmental openness and human learning in a multiagent system for a human collaborative task assignment problem.

UAI Conference 2016 Conference Paper

Individual Planning in Open and Typed Agent Systems

  • Muthukumaran Chandrasekaran
  • Adam Eck
  • Prashant Doshi
  • Leen-Kiat Soh

Open agent systems are multiagent systems in which one or more agents may leave the system at any time, possibly resuming after some interval, and in which new agents may also join. Planning in such systems becomes challenging in the absence of inter-agent communication because agents must predict whether others have left the system or new agents are now present in order to decide whether to choose a different line of action. In this paper, we prioritize open systems where agents of differing types may leave and possibly reenter but new agents do not join. With the help of a realistic domain, wildfire suppression, we motivate the need for individual planning in open environments and present a first approach for robust decision-theoretic planning in such multiagent systems. Evaluations in domain simulations clearly demonstrate the improved performance compared to previous methods that disregard the openness.

JAAMAS Journal 2015 Journal Article

Potential-based reward shaping for finite horizon online POMDP planning

  • Adam Eck
  • Leen-Kiat Soh
  • Daniel Kudenko

In this paper, we address the problem of suboptimal behavior during online partially observable Markov decision process (POMDP) planning caused by time constraints on planning. Taking inspiration from the related field of reinforcement learning (RL), our solution is to shape the agent’s reward function in order to lead the agent to large future rewards without having to spend as much time explicitly estimating cumulative future rewards, enabling the agent to save time to improve the breadth of planning and build higher quality plans. Specifically, we extend potential-based reward shaping (PBRS) from RL to online POMDP planning. In our extension, information about belief states is added to the function optimized by the agent during planning. This information provides hints of where the agent might find high future rewards beyond its planning horizon, and thus achieve greater cumulative rewards. We develop novel potential functions measuring information useful to agent metareasoning in POMDPs (reflecting on agent knowledge and/or histories of experience with the environment), theoretically prove several important properties and benefits of using PBRS for online POMDP planning, and empirically demonstrate these results in a range of classic benchmark POMDP planning problems.
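The paper's potential functions are only summarized above. As an illustrative sketch of the general shaping mechanism it builds on, here is the standard potential-based shaping term applied over belief states, using a hypothetical negative-entropy potential (so more certain beliefs carry higher potential); the choice of potential is an assumption for illustration, not necessarily one of the paper's:

```python
import math

def potential(belief):
    # Hypothetical potential: negative Shannon entropy of the belief state,
    # so more certain (lower-entropy) beliefs have higher potential.
    return sum(p * math.log(p) for p in belief if p > 0)

def shaped_reward(reward, belief, next_belief, gamma=0.95):
    # Standard potential-based shaping: add F = gamma * Phi(b') - Phi(b)
    # to the reward optimized during planning.
    return reward + gamma * potential(next_belief) - potential(belief)

# A transition that sharpens the belief earns a positive shaping bonus,
# hinting at informative branches without deepening the planning horizon.
bonus = shaped_reward(0.0, [0.5, 0.5], [0.9, 0.1])
```

Because the shaping term telescopes along any trajectory, it biases which branches the planner explores without changing which policies are optimal.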

AAMAS Conference 2013 Conference Paper

Active Sensing in Complex Multiagent Environments

  • Adam Eck

In this research, we focus on active sensing solutions to address challenging properties in complex environments, such as uncertainty, partial observability, non-stationarity, and limited resources. We describe our ongoing contributions, focusing on sensing for both individual agents and cooperating teams. We also outline how we are applying our research to two real-world applications: personal assistants and intelligent survey systems.

AAMAS Conference 2013 Conference Paper

Dynamic Facts in Large Team Information Sharing (Extended Abstract)

  • Adam Eck
  • Leen-Kiat Soh

In this paper, we extend the large team information sharing problem to consider dynamic facts, where the value of facts about the environment being observed can change over time. Dynamic facts are challenging because the team must repeatedly converge to consistent, accurate beliefs over time, without necessarily knowing if or when the fact changes values. We discover an interesting, emergent phenomenon: institutional memory, where the team as a whole becomes stuck remembering outdated beliefs. We demonstrate that controlling the trust placed in new information from neighboring agents does not adequately control belief convergence with dynamic facts, which previously was shown to benefit the team when working with static facts.

AAMAS Conference 2013 Conference Paper

Potential-Based Reward Shaping for POMDPs

  • Adam Eck
  • Leen-Kiat Soh
  • Sam Devlin
  • Daniel Kudenko

We address the problem of suboptimal behavior caused by short horizons during online POMDP planning. Our solution extends potential-based reward shaping from the related field of reinforcement learning to online POMDP planning in order to improve planning without increasing the planning horizon. In our extension, information about the quality of belief states is added to the function optimized by the agent during planning. This information provides hints of where the agent might find high future rewards, and thus achieve greater cumulative rewards.

AAMAS Conference 2012 Conference Paper

Evaluating POMDP Rewards for Active Perception

  • Adam Eck
  • Leen-Kiat Soh

One popular approach to active perception is using POMDPs to maximize rewards received for sensing actions towards task accomplishment and/or continually refining the agent’s knowledge. Multiple types of reward functions have been proposed to achieve these goals: (1) state-based rewards which minimize sensing costs and maximize task rewards, (2) belief-based rewards which maximize belief state improvement, and (3) hybrid rewards combining the other two types. However, little attention has been paid to understanding the differences between these function types and their impact on agent sensing and task performance. In this paper, we begin to address this deficiency by providing (1) an intuitive comparison of the strengths and weaknesses of the various function types, and (2) an empirical evaluation of our comparison in a simulated active perception environment.

JAAMAS Journal 2012 Journal Article

Observer effect from stateful resources in agent sensing

  • Adam Eck
  • Leen-Kiat Soh

In many real-world applications of multi-agent systems, agent reasoning suffers from bounded rationality caused by both limited resources and limited knowledge. When agent sensing to overcome its knowledge limitations also requires resource use, the agent’s knowledge refinement is affected due to its inability to always sense when and as accurately as needed, further leading to poor decision making. In this paper, we consider what happens when sensing actions require the use of stateful resources, which we define as resources whose state-dependent behavior changes over time based on usage. Current literature addressing agent sensing with limited resources primarily investigates stateless resources, such as avoiding the use of too much time or energy during sensing. However, sensing itself can change the state of a resource, and thus its behavior, which affects both the information gathered and the resulting knowledge refinement. This produces a phenomenon where the sensing action can and will distort its own outcome (and potentially future outcomes), termed the Observer Effect (OE) after the similar phenomenon in the physical sciences. Under this effect, when deliberating about when and how to perform sensing that requires use of stateful resources, an agent faces a strategic tradeoff between satisfying the need for (1) knowledge refinement to support its reasoning, and (2) avoiding knowledge corruption due to distorted sensing outcomes. To address this tradeoff, we model sensing action selection as a partially observable Markov decision process where an agent optimizes knowledge refinement while considering the (possibly hidden) state of the resources used during sensing. In this model, the agent uses reinforcement learning to learn a controller for action selection, as well as how to predict expected knowledge refinement based on resource use during sensing.
Our approach differs from other bounded rationality and sensing research in that we consider how to make decisions about sensing with stateful resources that produce side effects such as the OE, as opposed to simply using stateless resources with no such side effect. We evaluate our approach in a fully and partially observable agent mining simulation. The results demonstrate that considering resource state and the OE during sensing action selection through our approach (1) yielded better knowledge refinement, (2) appropriately balanced current and future refinement to avoid knowledge corruption, and (3) exploited the relationship (i.e., a high, positive correlation) between sensing and task performance to boost task performance through improved sensing. Further, our methodology also achieved good knowledge refinement even when the OE is not present, indicating that it can improve sensing performance in a wide variety of environments. Finally, our results also provide insights into the types and configurations of learning algorithms useful for learning within our methodology.
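As a toy, hypothetical illustration of this tradeoff (not the paper's POMDP model or its agent mining simulation), tabular Q-learning over a degradable sensor resource can learn to rest the resource before its readings become distorted:

```python
import random

ACTIONS = ["sense", "rest"]

def step(resource, action):
    # Toy Observer Effect: sensing wears the resource down, and a worn
    # resource (<= 0.5) returns a distorted, harmful reading; resting
    # restores the resource but yields no information.
    if action == "sense":
        info = 1.0 if resource > 0.5 else -0.5
        return max(resource - 0.2, 0.0), info
    return min(resource + 0.4, 1.0), 0.0

def train(episodes=3000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    # Epsilon-greedy tabular Q-learning over the discretized resource state.
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        resource = 1.0
        for _ in range(10):
            s = round(resource, 1)
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q.get((s, x), 0.0))
            resource, r = step(resource, a)
            s2 = round(resource, 1)
            target = r + gamma * max(q.get((s2, x), 0.0) for x in ACTIONS)
            q[(s, a)] = q.get((s, a), 0.0) + alpha * (target - q.get((s, a), 0.0))
    return q

q = train()
```

The learned values should favor sensing while the resource is fresh and resting as it wears down, which is the knowledge-refinement versus knowledge-corruption tradeoff in miniature.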

AAMAS Conference 2011 Conference Paper

Agent Sensing with Stateful Resources

  • Adam Eck
  • Leen-Kiat Soh

In the application of multi-agent systems to real-world problems, agents often suffer from bounded rationality where agent reasoning is limited by 1) a lack of knowledge about choices, and 2) a lack of resources required for reasoning. To overcome the former, the agent uses sensing to refine its knowledge. However, sensing can also require limited resources, leading to inaccurate environment modeling and poor decision making. In this paper, we consider a novel and difficult class of this problem where agents must use stateful resources during sensing, which we define as resources whose state-dependent behavior changes over time based on usage. Specifically, such sensing changes the state of a resource, and thus its behavior, producing a phenomenon where the sensing activity can and will distort its own outcome. We term this the Observer Effect after the similar phenomenon in the physical sciences. Given this effect, the agent faces a strategic tradeoff between satisfying the need for 1) knowledge refinement, and 2) avoiding corruption of knowledge due to distorted sensing outcomes. To address this tradeoff, we use active perception to select sensing activities and model activity selection as a Markov decision process (MDP) solved through reinforcement learning where an agent optimizes knowledge refinement while considering the state of the resource used during sensing.