Arrow Research search

Author name cluster

Weize Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

11 papers
2 author rows

Possible papers

11

JBHI Journal 2025 Journal Article

An Online Adaptation Framework for Enhancing Calibration-Free SSVEP-Based BCI Performance

  • Weize Chen
  • Jie Mei
  • Xiaolin Xiao
  • Ang Li
  • Lingling Tao
  • Kun Wang
  • Minpeng Xu
  • Dong Ming

Accomplishing a plug-and-play steady-state visual evoked potential (SSVEP)-based brain-computer interface (BCI) remains a critical challenge due to the unsatisfactory performance of calibration-free decoding algorithms. A current method, online adaptive canonical correlation analysis (OACCA), has proven effective at enhancing calibration-free performance through self-adaptation using only online data. However, OACCA adapts only the spatial filters and omits other useful adaptive procedures such as individual template estimation, preventing the decoding model from being fully exploited and adapted. This study proposes a new online adaptation framework, termed online adaptive extended correlation analysis (OAECA), to augment the calibration-free online adaptation loop. OAECA first recalls and cleans the online trials for reliable data learning, then tunes individual templates and spatial filters for complete model updating, and finally adopts extended feature matching to improve target recognition. Simulation results on two public SSVEP datasets revealed that OAECA significantly outperformed OACCA for almost all 105 subjects, and both offline and online experiments further confirmed its effectiveness. In particular, OAECA achieved the highest average information transfer rate (ITR) of 202.17 bits/min in the online experiment, significantly exceeding the state-of-the-art OACCA at 177.02 bits/min. This study enhances calibration-free performance through comprehensive online adaptation, advancing SSVEP-based BCIs toward practical plug-and-play real-world applications.
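The adaptive loop described above maintains, at its simplest, a per-stimulus individual template that is refined with incoming online trials. The sketch below illustrates only that template-update idea, using a running mean plus a correlation-based reliability gate; the class, the gate threshold, and the Pearson-based cleaning rule are hypothetical simplifications for illustration, not the paper's actual OAECA algorithm (which also adapts spatial filters and uses extended feature matching):

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length signals."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = math.sqrt(sum((a - mx) ** 2 for a in x))
    vy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (vx * vy)

class OnlineTemplate:
    """Running-mean individual template for one stimulus class,
    updated only with online trials that pass a reliability gate."""

    def __init__(self, gate=0.3):
        self.template = None
        self.count = 0
        self.gate = gate

    def update(self, trial):
        if self.template is None:
            self.template = list(trial)
            self.count = 1
            return True
        # "trial cleaning": discard trials that correlate poorly
        # with the current template estimate
        if pearson(trial, self.template) < self.gate:
            return False
        self.count += 1
        w = 1.0 / self.count
        self.template = [(1 - w) * t + w * x
                         for t, x in zip(self.template, trial)]
        return True
```

In a full decoder, one such template per stimulus frequency would feed the correlation-based feature matching stage after every accepted trial.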

ICLR Conference 2025 Conference Paper

Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence

  • Weize Chen
  • Ziming You
  • Ran Li
  • Yitong Guan
  • Chen Qian
  • Chenyang Zhao
  • Cheng Yang 0002
  • Ruobing Xie

The rapid advancement of large language models (LLMs) has paved the way for the development of highly capable autonomous agents. However, existing multi-agent frameworks often struggle with integrating diverse capable third-party agents due to reliance on agents defined within their own ecosystems. They also face challenges in simulating distributed environments, as most frameworks are limited to single-device setups. Furthermore, these frameworks often rely on hard-coded communication pipelines, limiting their adaptability to dynamic task requirements. Inspired by the concept of the Internet, we propose the Internet of Agents (IoA), a novel framework that addresses these limitations by providing a flexible and scalable platform for LLM-based multi-agent collaboration. IoA introduces an agent integration protocol, an instant-messaging-like architecture design, and dynamic mechanisms for agent teaming and conversation flow control. Through extensive experiments on general assistant tasks, embodied AI tasks, and retrieval-augmented generation benchmarks, we demonstrate that IoA consistently outperforms state-of-the-art baselines, showcasing its ability to facilitate effective collaboration among heterogeneous agents. IoA represents a step towards linking diverse agents in an Internet-like environment, where agents can seamlessly collaborate to achieve greater intelligence and capabilities. We will release our code to facilitate further research.

NeurIPS Conference 2025 Conference Paper

Multi-Agent Collaboration via Evolving Orchestration

  • Yufan Dang
  • Chen Qian
  • Xueheng Luo
  • Jingru Fan
  • Zihao Xie
  • Ruijie Shi
  • Weize Chen
  • Cheng Yang

Large language models (LLMs) have achieved remarkable results across diverse downstream tasks, but their monolithic nature restricts scalability and efficiency in complex problem-solving. While recent research explores multi-agent collaboration among LLMs, most approaches rely on static organizational structures that struggle to adapt as task complexity and agent numbers grow, resulting in coordination overhead and inefficiencies. To this end, we propose a puppeteer-style paradigm for LLM-based multi-agent collaboration, where a centralized orchestrator ("puppeteer") dynamically directs agents ("puppets") in response to evolving task states. This orchestrator is trained via reinforcement learning to adaptively sequence and prioritize agents, enabling flexible and evolvable collective reasoning. Experiments on closed- and open-domain scenarios show that this method achieves superior performance with reduced computational costs. Analyses further reveal that the key improvements consistently stem from the emergence of more compact, cyclic reasoning structures under the orchestrator's evolution. Our code is available at https://github.com/OpenBMB/ChatDev/tree/puppeteer.
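The puppeteer-style control loop (a central policy repeatedly choosing which agent acts next on the evolving task state) can be sketched as below. In the paper the orchestrator is trained with reinforcement learning; here `policy` is an untrained stand-in callable, and all names are hypothetical illustrations rather than the released implementation:

```python
def puppeteer_loop(agents, state, policy, max_steps=6):
    """Centralized orchestration sketch: at each step a policy picks the
    next agent given the current state, the agent transforms the state,
    and the loop stops once the state is marked done."""
    trace = []
    for _ in range(max_steps):
        name = policy(state, list(agents))
        state = agents[name](state)
        trace.append(name)
        if state.get("done"):
            break
    return state, trace
```

An RL-trained orchestrator would replace the hand-written `policy` with a learned distribution over agents, optimized against task reward and cost.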

ICLR Conference 2025 Conference Paper

Scaling Large Language Model-based Multi-Agent Collaboration

  • Chen Qian
  • Zihao Xie
  • Yifei Wang
  • Wei Liu 0161
  • Kunlun Zhu
  • Hanchen Xia
  • Yufan Dang
  • Zhuoyun Du

Recent breakthroughs in large language model-driven autonomous agents have revealed that multi-agent collaboration often surpasses each individual through collective reasoning. Inspired by the neural scaling law (increasing neurons enhances performance), this study explores whether continuously adding collaborative agents can yield similar benefits. Technically, we utilize directed acyclic graphs to organize agents into a multi-agent collaboration network (MacNet), upon which their interactive reasoning is topologically orchestrated for autonomous task solving. Extensive evaluations reveal that it effectively supports collaboration among over a thousand agents, with irregular topologies outperforming regular ones. We also identify a collaborative scaling law: the overall performance follows a logistic growth pattern as agents scale, with collaborative emergence occurring earlier than traditional neural emergence. We speculate this may be because scaling agents catalyzes their multidimensional considerations during interactive reflection and refinement, thereby producing more comprehensive artifacts. The code is available at https://github.com/OpenBMB/ChatDev/tree/macnet.
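The topological orchestration the abstract describes (agents arranged in a directed acyclic graph, each refining the artifacts produced by its predecessors) can be sketched with a plain topological traversal. This is an illustrative reconstruction, not MacNet's implementation; `orchestrate`, the agent callables, and the seed-passing convention for root nodes are all hypothetical:

```python
from collections import deque

def orchestrate(edges, agents, seed):
    """Run agents over a DAG in topological order: each node consumes its
    predecessors' outputs (or the seed task, for roots) and emits a
    refined artifact. Agent callables stand in for LLM calls."""
    indeg = {n: 0 for n in agents}
    succ = {n: [] for n in agents}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    inputs = {n: [] for n in agents}
    out = {}
    ready = deque(n for n in agents if indeg[n] == 0)
    while ready:
        n = ready.popleft()
        out[n] = agents[n](inputs[n] or [seed])
        for m in succ[n]:
            inputs[m].append(out[n])
            indeg[m] -= 1
            if indeg[m] == 0:
                ready.append(m)
    return out
```

Changing only the `edges` list swaps in a chain, tree, mesh, or irregular topology, which is what makes topology-versus-performance comparisons like the paper's possible.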

NeurIPS Conference 2025 Conference Paper

The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training

  • Weize Chen
  • Jiarui Yuan
  • Jin Tailin
  • Ning Ding
  • Huimin Chen
  • Zhiyuan Liu
  • Maosong Sun

Recent large language models (LLMs) exhibit impressive reasoning but often overthink, generating excessively long responses that hinder efficiency. We introduce DIET (DIfficulty-AwarE Training), a framework that systematically cuts these "token calories" by integrating on-the-fly problem difficulty into the reinforcement learning (RL) process. DIET dynamically adapts token compression strategies by modulating token penalty strength and conditioning target lengths on estimated task difficulty, optimizing the performance-efficiency trade-off. We also theoretically analyze the pitfalls of naive reward weighting in group-normalized RL algorithms such as GRPO, and propose an Advantage Weighting technique that enables stable and effective implementation of these difficulty-aware objectives. Experimental results demonstrate that DIET significantly reduces token counts while simultaneously improving reasoning performance. Beyond raw token reduction, we show two crucial benefits largely overlooked by prior work: (1) DIET leads to superior inference scaling. By maintaining high per-sample quality with fewer tokens, it enables better scaling performance via majority voting under fixed computational budgets, an area where other methods falter. (2) DIET enhances the natural positive correlation between response length and problem difficulty, ensuring verbosity is appropriately allocated, unlike many existing compression methods that disrupt this relationship. Our analyses provide a principled and effective framework for developing more efficient, practical, and high-performing LLMs.
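To illustrate the kind of objective the abstract alludes to, the sketch below combines a group-normalized task advantage with a separately normalized, difficulty-weighted length penalty in advantage space, rather than folding the penalty into the raw reward before normalization. The function names, the linear difficulty scaling, and the additive combination are hypothetical simplifications, not DIET's actual formulation:

```python
def group_normalize(xs):
    """GRPO-style group normalization: subtract the group mean and
    divide by the group standard deviation."""
    n = len(xs)
    mu = sum(xs) / n
    sd = (sum((x - mu) ** 2 for x in xs) / n) ** 0.5 or 1.0
    return [(x - mu) / sd for x in xs]

def difficulty_aware_advantages(task_rewards, lengths, difficulty, alpha=0.1):
    """Sketch of advantage-space weighting over one sampled group:
    task reward and (negated) length are normalized separately, then
    combined, with the compression pressure weakened on problems
    estimated to be hard (difficulty in [0, 1])."""
    a_task = group_normalize(task_rewards)
    a_len = group_normalize([-l for l in lengths])  # shorter is better
    w = alpha * (1.0 - difficulty)  # easy problems get stronger penalty
    return [t + w * p for t, p in zip(a_task, a_len)]
```

Normalizing the two signals separately keeps the length penalty from distorting the correctness signal's group statistics, which is one plausible reading of why naive reward weighting is unstable in group-normalized RL.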

ICLR Conference 2024 Conference Paper

AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors

  • Weize Chen
  • Yusheng Su
  • Jingwei Zuo
  • Cheng Yang 0002
  • Chenfei Yuan
  • Chi-Min Chan
  • Heyang Yu
  • Yaxi Lu

Autonomous agents empowered by Large Language Models (LLMs) have undergone significant improvements, enabling them to generalize across a broad spectrum of tasks. However, in real-world scenarios, cooperation among individuals is often required to enhance the efficiency and effectiveness of task accomplishment. Hence, inspired by human group dynamics, we propose a multi-agent framework AgentVerse that can effectively orchestrate a collaborative group of expert agents as a greater-than-the-sum-of-its-parts system. Our experiments demonstrate that AgentVerse can proficiently deploy multi-agent groups that outperform a single agent. Extensive experiments on text understanding, reasoning, coding, tool utilization, and embodied AI confirm the effectiveness of AgentVerse. Moreover, our analysis of agent interactions within AgentVerse reveals the emergence of specific collaborative behaviors, contributing to heightened group efficiency. We will release our codebase, AgentVerse, to further facilitate multi-agent research.

NeurIPS Conference 2024 Conference Paper

Autonomous Agents for Collaborative Task under Information Asymmetry

  • Wei Liu
  • Chenxi Wang
  • Yifei Wang
  • Zihao Xie
  • Rennai Qiu
  • Yufan Dang
  • Zhuoyun Du
  • Weize Chen

Large Language Model Multi-Agent Systems (LLM-MAS) have made great progress in solving complex tasks. Agents within such a system communicate with one another to solve tasks collaboratively, under the premise of shared information. However, when agent collaboration is leveraged to perform multi-person tasks, a new challenge arises from information asymmetry, since each agent can only access the information of its own human user. Previous MAS struggle to complete tasks under this condition. To address this, we propose a new MAS paradigm termed iAgents, which denotes Informative Multi-Agent Systems. In iAgents, the human social network is mirrored in the agent network, where agents proactively exchange the human information necessary for task resolution, thereby overcoming information asymmetry. iAgents employs a novel agent reasoning mechanism, InfoNav, to navigate agents' communication toward effective information exchange. Together with InfoNav, iAgents organizes human information in a mixed memory to provide agents with accurate and comprehensive information for exchange. Additionally, we introduce InformativeBench, the first benchmark tailored for evaluating LLM agents' task-solving ability under information asymmetry. Experimental results show that iAgents can collaborate within a social network of 140 individuals and 588 relationships, autonomously communicate over 30 turns, and retrieve information from nearly 70,000 messages to complete tasks within 3 minutes.

NeurIPS Conference 2024 Conference Paper

Can Large Language Models Analyze Graphs like Professionals? A Benchmark, Datasets and Models

  • Xin Li
  • Weize Chen
  • Qizhi Chu
  • Haopeng Li
  • Zhaojun Sun
  • Ran Li
  • Chen Qian
  • Yiwei Wei

The need to analyze graphs is ubiquitous across various fields, from social networks to biological research and recommendation systems. Therefore, enabling large language models (LLMs) to process graphs is an important step toward more advanced general intelligence. However, current LLM benchmarks on graph analysis require models to reason directly over prompts describing graph topology, and are thus limited to small graphs with only a few dozen nodes. In contrast, human experts typically write programs based on popular libraries for task solving, and can thus handle graphs of different scales. To this end, a question naturally arises: can LLMs analyze graphs like professionals? In this paper, we introduce ProGraph, a manually crafted benchmark containing 3 categories of graph tasks. The benchmark expects solutions based on programming instead of direct reasoning over raw inputs. Our findings reveal that the performance of current LLMs is unsatisfactory, with the best model achieving only 36% accuracy. To bridge this gap, we propose the LLM4Graph datasets, which include crawled documents and auto-generated code based on 6 widely used graph libraries. By augmenting closed-source LLMs with document retrieval and fine-tuning open-source ones on the code, we show 11-32% absolute improvements in their accuracies. Our results underscore that the capabilities of LLMs in handling structured data are still under-explored, and show the effectiveness of LLM4Graph in enhancing LLMs' proficiency in graph analysis. The benchmark, datasets, and enhanced open-source models are available at https://github.com/BUPT-GAMMA/ProGraph.
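The contrast the benchmark draws (writing a program versus reasoning over a raw edge list in the prompt) looks like this in practice. The abstract says reference solutions build on graph libraries such as NetworkX; this dependency-free BFS is a hypothetical stand-in for the kind of program an LLM is expected to emit, scaling to graphs far larger than what fits in direct prompt reasoning:

```python
from collections import deque

def shortest_path_length(edges, src, dst):
    """Unweighted shortest-path length via breadth-first search over an
    undirected edge list; returns -1 when dst is unreachable."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        if u == dst:
            return dist[u]
        for w in adj.get(u, []):
            if w not in dist:
                dist[w] = dist[u] + 1
                q.append(w)
    return -1
```

With a library the same task is a one-liner (e.g. `nx.shortest_path_length(G, src, dst)` in NetworkX), which is exactly the leverage the benchmark expects models to exploit.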

ICLR Conference 2024 Conference Paper

ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate

  • Chi-Min Chan
  • Weize Chen
  • Yusheng Su
  • Jianxuan Yu
  • Wei Xue 0002
  • Shanghang Zhang
  • Jie Fu 0001
  • Zhiyuan Liu 0001

Text evaluation has historically posed significant challenges, often demanding substantial labor and time costs. With the emergence of large language models (LLMs), researchers have explored LLMs' potential as alternatives to human evaluation. While these single-agent-based approaches show promise, experimental results suggest that further advancements are needed to bridge the gap between their current effectiveness and human-level evaluation quality. Recognizing that best practices in human evaluation often involve multiple annotators collaborating, we resort to a multi-agent debate framework, moving beyond single-agent prompting strategies. In this paper, we construct a multi-agent referee team called ChatEval to autonomously discuss and evaluate the quality of different texts. Our experiments on two benchmarks illustrate that ChatEval delivers superior accuracy and correlation in alignment with human assessment. Furthermore, we find that diverse role prompts (different personas) are essential in the multi-agent debate process; that is, using the same role description in the prompts can lead to a degradation in performance. Our qualitative analysis also shows that ChatEval transcends mere textual scoring, offering a human-mimicking evaluation process for reliable assessments.
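A single debate round of the kind described (persona reviewers seeing the candidate texts plus the discussion so far, with verdicts then aggregated) might be sketched as follows. The reviewer callables stand in for persona-prompted LLM calls, and the majority-vote aggregation is a hypothetical simplification rather than ChatEval's actual communication strategy:

```python
def debate_round(reviewers, text_a, text_b, history=None):
    """One multi-agent debate round sketch: each persona reviewer sees
    both texts and the running discussion, appends its verdict to the
    shared history, and verdicts are aggregated by majority vote."""
    history = history or []
    votes = []
    for name, review in reviewers.items():
        verdict = review(text_a, text_b, history)  # "A" or "B"
        history.append((name, verdict))
        votes.append(verdict)
    winner = "A" if votes.count("A") >= votes.count("B") else "B"
    return winner, history
```

The shared `history` is what distinguishes debate from independent scoring: later reviewers can react to earlier verdicts, and repeated rounds let positions shift before the final aggregation.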

ICML Conference 2022 Conference Paper

GACT: Activation Compressed Training for Generic Network Architectures

  • Xiaoxuan Liu
  • Lianmin Zheng
  • Dequan Wang
  • Yukuo Cen
  • Weize Chen
  • Xu Han 0007
  • Jianfei Chen 0001
  • Zhiyuan Liu 0001

Training large neural network (NN) models requires extensive memory resources, and Activation Compression Training (ACT) is a promising approach to reduce training memory footprint. This paper presents GACT, an ACT framework to support a broad range of machine learning tasks for generic NN architectures with limited domain knowledge. By analyzing a linearized version of ACT's approximate gradient, we prove the convergence of GACT without prior knowledge on operator type or model architecture. To make training stable, we propose an algorithm that decides the compression ratio for each tensor by estimating its impact on the gradient at run time. We implement GACT as a PyTorch library that readily applies to any NN architecture. GACT reduces the activation memory for convolutional NNs, transformers, and graph NNs by up to 8.1x, enabling training with a 4.2x to 24.7x larger batch size, with negligible accuracy loss.
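The core ACT idea (compress activations saved in the forward pass, decompress them when the backward pass needs them) can be illustrated with plain uniform quantization. GACT additionally chooses a per-tensor compression ratio from the estimated gradient impact at run time; this fixed-bit sketch with hypothetical helper names shows only the compress/decompress round trip:

```python
def compress(acts, bits):
    """Uniformly quantize an activation tensor (flat list of floats) to
    `bits`-bit integer codes, keeping (min, scale) for reconstruction.
    This is what would be stored instead of the full-precision tensor."""
    lo, hi = min(acts), max(acts)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((a - lo) / scale) for a in acts]
    return codes, lo, scale

def decompress(codes, lo, scale):
    """Reconstruct approximate activations for the backward pass."""
    return [lo + c * scale for c in codes]
```

Storing 2-4-bit codes instead of 32-bit floats is where the memory saving comes from; the quantization error per element is bounded by half a quantization step, and GACT's convergence analysis covers the effect of such approximate activations on the gradient.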