Arrow Research search

Author name cluster

Chengpei Tang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers
1 author row

Possible papers

4

AAAI Conference 2026 Conference Paper

Cost-Effective Communication: An Auction-based Method for Language Agent Interaction

  • Yijia Fan
  • Jusheng Zhang
  • Kaitong Cai
  • Jing Yang
  • Chengpei Tang
  • Jian Wang
  • Keze Wang

Multi-agent systems (MAS) built on large language models (LLMs) often suffer from inefficient "free-for-all" communication, leading to exponential token costs and low signal-to-noise ratios that hinder their practical deployment. We challenge the notion that more communication is always beneficial, hypothesizing instead that the core issue is the absence of resource rationality. We argue that "free" communication, by ignoring the principle of scarcity, inherently breeds inefficiency and unnecessary expenses. To address this, we introduce the Dynamic Auction-based Language Agent (DALA), a novel framework that treats communication bandwidth as a scarce and tradable resource. Specifically, our DALA regards inter-agent communication as a centralized auction, where agents learn to bid for the opportunity to speak based on the predicted value density of their messages. Thus, our DALA intrinsically encourages agents to produce concise, informative messages while filtering out low-value communication. Extensive experiments demonstrate that our economically driven DALA achieves new state-of-the-art performance across seven challenging reasoning benchmarks, including 84.32% on MMLU and a 91.21% pass@1 rate on HumanEval. Note that this is accomplished with remarkable efficiency, i.e., our DALA uses only 6.25 million tokens, a fraction of the resources consumed by current state-of-the-art methods on GSM8K. Further analysis reveals that our DALA cultivates the emergent skill of strategic silence, effectively adapting its communication strategies from verbosity to silence in a dynamic manner via resource constraints.
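The auction mechanism described in the abstract can be illustrated with a minimal sketch: each agent attaches a predicted value to its candidate message, bids are ranked by value density (value per token), and a central auctioneer admits messages greedily under a token budget. The scoring rule, the greedy budgeted selection, and all names below are illustrative assumptions, not DALA's actual implementation.

```python
# Illustrative sketch only: a minimal auction round for message gating.
# The value estimate, budget, and field names are assumptions.
from dataclasses import dataclass


@dataclass
class Message:
    agent_id: str
    text: str
    predicted_value: float  # in the paper this would be learned; here it is just a field

    @property
    def value_density(self) -> float:
        # Value per token: rewards concise, informative messages.
        n_tokens = max(len(self.text.split()), 1)
        return self.predicted_value / n_tokens


def run_auction(messages: list[Message], token_budget: int) -> list[Message]:
    """Greedily admit the highest value-density messages under a token budget."""
    winners, spent = [], 0
    for msg in sorted(messages, key=lambda m: m.value_density, reverse=True):
        cost = len(msg.text.split())
        if spent + cost <= token_budget:
            winners.append(msg)
            spent += cost
    return winners  # agents not in `winners` stay silent this round
```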

AAAI Conference 2026 Conference Paper

HiVA: Self-organized Hierarchical Variable Agent via Goal-driven Semantic-Topological Evolution

  • Jinzhou Tang
  • Jusheng Zhang
  • Qinhan Lv
  • Sidi Liu
  • Jing Yang
  • Chengpei Tang
  • Keze Wang

Autonomous agents play a crucial role in advancing Artificial General Intelligence, enabling problem decomposition and tool orchestration through Large Language Models (LLMs). However, existing paradigms face a critical trade-off. On one hand, reusable fixed workflows require manual reconfiguration upon environmental changes; on the other hand, flexible reactive loops fail to distill reasoning progress into transferable structures. We introduce Hierarchical Variable Agent (HiVA), a novel framework modeling agentic workflows as self-organized graphs with the Semantic-Topological Evolution (STEV) algorithm, which optimizes hybrid semantic-topological spaces using textual gradients as discrete-domain surrogates for backpropagation. The iterative process comprises Multi-Armed Bandit-infused forward routing, diagnostic gradient generation from environmental feedback, and coordinated updates that co-evolve individual semantics and topology for collective optimization in unknown environments. Experiments on dialogue, coding, long-context Q&A, mathematical, and agentic benchmarks demonstrate improvements of 5-10% in task accuracy and enhanced resource efficiency over existing baselines, establishing HiVA's effectiveness in autonomous task execution.
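As a rough illustration of the bandit-infused forward routing the abstract mentions, the sketch below picks a successor node with UCB1 and updates its statistics from a scalar reward. HiVA's actual routing policy, reward signal (derived from textual gradients), and graph-update rules are not specified in the abstract, so everything here is an assumption.

```python
# Assumption-laden sketch of bandit-based forward routing over a node's successors.
# UCB1 is my choice of policy; the paper's routing and update rules may differ.
import math
import random
from collections import defaultdict


class BanditRouter:
    def __init__(self, successors: list[str]):
        self.successors = successors
        self.counts = defaultdict(int)
        self.rewards = defaultdict(float)
        self.total = 0

    def select(self) -> str:
        # Explore untried successors first, then balance value against uncertainty.
        untried = [s for s in self.successors if self.counts[s] == 0]
        if untried:
            return random.choice(untried)
        return max(
            self.successors,
            key=lambda s: self.rewards[s] / self.counts[s]
            + math.sqrt(2 * math.log(self.total) / self.counts[s]),
        )

    def update(self, successor: str, reward: float) -> None:
        # The reward could be distilled from environmental feedback / textual gradients.
        self.counts[successor] += 1
        self.rewards[successor] += reward
        self.total += 1
```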

JBHI Journal 2026 Journal Article

IMRadar: Bidirectional Velocity Mamba for Contactless Human Behavior Sensing

  • Ming Zhang
  • Jiong Liang
  • Jiayao Li
  • Shaolin Liao
  • Henry Soekmadji
  • Chengpei Tang

In recent years, intelligent human behavior sensing based on channel state information (CSI) has garnered significant attention from researchers, serving as a pivotal application of contactless health monitoring. However, the feature extraction networks used in existing perception schemes have significant limitations: they lack global context perception, incur high computational complexity, and consider features in only one direction. To address these issues, this article proposes a novel bidirectional velocity Mamba (BVMamba) model and constructs an intelligent behavior sensing system, named IMRadar. The system first extracts velocity information, which better characterizes the human motion state, from CSI data, and uses the BVMamba model to extract global deep behavioral features in both the forward and reverse directions. The BVMamba model comprises a forward velocity Mamba block (FVMamba), a reverse velocity Mamba block (RVMamba), and a bidirectional velocity feature fusion block (FUBlock), which together comprehensively capture the dynamic characteristics of complex behaviors. Experiments show that IMRadar achieves excellent recognition performance on both publicly available datasets (ARIL, Widar) and a self-built dataset (IM-HAR), with accuracy exceeding 98% on all datasets, providing an efficient and robust solution for contactless behavior perception.
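A minimal PyTorch sketch of the bidirectional-extraction-and-fusion idea: one sequence block processes the velocity features forward in time, another processes the time-reversed sequence, and a fusion layer combines the two. GRUs stand in for the FVMamba/RVMamba blocks and a small linear layer for FUBlock; all layer choices and sizes are assumptions, not the paper's architecture.

```python
# Illustrative sketch with GRU stand-ins for the Mamba blocks; not the paper's model.
import torch
import torch.nn as nn


class BidirectionalVelocityNet(nn.Module):
    def __init__(self, feat_dim: int = 64, hidden: int = 128, n_classes: int = 6):
        super().__init__()
        self.forward_block = nn.GRU(feat_dim, hidden, batch_first=True)   # stand-in for FVMamba
        self.reverse_block = nn.GRU(feat_dim, hidden, batch_first=True)   # stand-in for RVMamba
        self.fuse = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU())  # stand-in for FUBlock
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, velocity_seq: torch.Tensor) -> torch.Tensor:
        # velocity_seq: (batch, time, feat_dim), e.g. velocity features derived from CSI.
        _, h_fwd = self.forward_block(velocity_seq)
        _, h_rev = self.reverse_block(torch.flip(velocity_seq, dims=[1]))  # reversed time axis
        fused = self.fuse(torch.cat([h_fwd[-1], h_rev[-1]], dim=-1))
        return self.head(fused)  # behavior class logits
```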

AAAI Conference 2026 Conference Paper

Top-Down Semantic Refinement for Image Captioning

  • Jusheng Zhang
  • Kaitong Cai
  • Jing Yang
  • Jian Wang
  • Chengpei Tang
  • Keze Wang

Large Vision-Language Models (VLMs) face an inherent contradiction in image captioning: their powerful single-step generation capabilities often lead to a myopic decision-making process. This makes it difficult to maintain global narrative coherence while capturing rich details, a limitation that is particularly pronounced in tasks that require multi-step and complex scene description. To overcome this fundamental challenge, we redefine image captioning as a goal-oriented hierarchical refinement planning problem, and further propose a novel framework, named Top-Down Semantic Refinement (TDSR), which models the generation process as a Markov Decision Process (MDP). However, planning within the vast state space of a VLM presents a significant computational hurdle. Our core contribution, therefore, is the design of a highly efficient Monte Carlo Tree Search (MCTS) algorithm tailored for VLMs. By incorporating a visual-guided parallel expansion and a lightweight value network, our TDSR reduces the frequency of calls to the expensive VLM by an order of magnitude without sacrificing planning quality. Furthermore, an adaptive early stopping mechanism dynamically matches computational overhead to the image's complexity. Extensive experiments on multiple benchmarks, including DetailCaps, COMPOSITIONCAP, and POPE, demonstrate that our TDSR, as a plug-and-play module, can significantly enhance the performance of existing VLMs (e.g., LLaVA-1.5, Qwen2.5-VL) by achieving state-of-the-art or highly competitive results in fine-grained description, compositional generalization, and hallucination suppression.
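The MCTS loop the abstract describes can be sketched generically: select a leaf by UCT, expand it with several candidate refinements from one expensive proposer call, score the candidates with a cheap value function, and backpropagate. The proposer and value function below are placeholders for the VLM and the lightweight value network; the adaptive early stopping and visual-guided expansion are omitted, so this is an assumption-level sketch rather than TDSR itself.

```python
# Generic MCTS sketch for iterative caption refinement; placeholders throughout.
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    caption: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    value_sum: float = 0.0

    def uct(self, c: float = 1.4) -> float:
        if self.visits == 0:
            return float("inf")
        return self.value_sum / self.visits + c * math.sqrt(
            math.log(self.parent.visits) / self.visits
        )


def mcts_refine(
    root_caption: str,
    propose: Callable[[str], List[str]],   # stands in for the expensive VLM proposing refinements
    value_fn: Callable[[str], float],      # stands in for the lightweight value network
    iterations: int = 32,
) -> str:
    root = Node(root_caption)
    for _ in range(iterations):
        # Selection: descend by UCT until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=lambda n: n.uct())
        # Expansion: a single proposer call yields several candidates at once.
        for cand in propose(node.caption):
            node.children.append(Node(cand, parent=node))
        # Evaluation + backpropagation using the cheap value function.
        for child in node.children:
            v = value_fn(child.caption)
            n: Optional[Node] = child
            while n is not None:
                n.visits += 1
                n.value_sum += v
                n = n.parent
    best = max(root.children, key=lambda n: n.visits) if root.children else root
    return best.caption
```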