
ICML 2025

Agent-Centric Actor-Critic for Asynchronous Multi-Agent Reinforcement Learning

Conference Paper Accept (poster) Artificial Intelligence · Machine Learning

Abstract

Multi-Agent Reinforcement Learning (MARL) struggles with coordination in sparse reward environments. Macro-actions, sequences of actions executed as single decisions, facilitate long-term planning but introduce asynchrony, complicating Centralized Training with Decentralized Execution (CTDE). Existing CTDE methods use padding to handle asynchrony, risking misaligned asynchronous experiences and spurious correlations. We propose the Agent-Centric Actor-Critic (ACAC) algorithm to manage asynchrony without padding. ACAC uses agent-centric encoders for independent trajectory processing, with an attention-based aggregation module integrating these histories into a centralized critic for improved temporal abstractions. The proposed structure is trained via a PPO-based algorithm with a modified Generalized Advantage Estimation for asynchronous environments. Experiments show ACAC accelerates convergence and enhances performance over baselines in complex MARL tasks.
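To make the two core ideas in the abstract concrete, the sketch below illustrates (a) attention-based aggregation of per-agent history encodings into a single centralized-critic input, and (b) a GAE variant where each decision step spans a variable number of primitive steps, so discounting uses the macro-action's duration. This is a minimal illustrative sketch, not the paper's implementation: the function names, the single-query attention form, and the duration-exponent discounting are all assumptions made for clarity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def aggregate_agent_histories(agent_encodings, W_k, W_v, query):
    """Attention-based aggregation (hypothetical form): each agent's
    trajectory encoding is projected to a key and a value, and a shared
    query weighs the agents into one pooled critic input.

    agent_encodings: (n_agents, d) per-agent history embeddings
    query:           (d,) shared query vector
    """
    keys = agent_encodings @ W_k                     # (n_agents, d)
    values = agent_encodings @ W_v                   # (n_agents, d)
    scores = keys @ query / np.sqrt(query.shape[-1]) # (n_agents,)
    weights = softmax(scores)                        # sum to 1
    return weights @ values, weights                 # pooled: (d,)

def async_gae(rewards, values, durations, gamma=0.99, lam=0.95):
    """GAE over macro-action decision points (assumed variant): decision
    step t spans durations[t] primitive steps, so both the bootstrap and
    the advantage recursion discount by gamma ** durations[t].

    rewards:   (T,) discounted return accumulated within each macro-action
    values:    (T+1,) critic values at decision points (last = bootstrap)
    durations: (T,) primitive-step length of each macro-action
    """
    T = len(rewards)
    adv = np.zeros(T)
    gae = 0.0
    for t in reversed(range(T)):
        g = gamma ** durations[t]
        delta = rewards[t] + g * values[t + 1] - values[t]
        gae = delta + g * lam * gae
        adv[t] = gae
    return adv
```

With all durations equal to 1 the second function reduces to standard GAE, which is one way to sanity-check an asynchronous advantage estimator.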

Keywords

  • Multi-Agent Reinforcement Learning
  • Asynchronous Multi-Agent Reinforcement Learning
  • MacDec-POMDP

Context

Venue
International Conference on Machine Learning
Archive span
1993-2025
Indexed papers
16471
Paper id
717752269396758942