Author name cluster

Bin Guo

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

23 papers

1 author row

AAAI Conference 2026 Conference Paper

VIL2C: Value-of-Information Aware Low-Latency Communication for Multi-Agent Reinforcement Learning

Qian Zhang
Zhuo Sun
Yao Zhang
Zhiwen Yu
Bin Guo
Jun Zhang

Inter-agent communication serves as an effective mechanism for enhancing performance in collaborative multi-agent reinforcement learning (MARL) systems. However, the inherent communication latency in practical systems induces both action decision delays and outdated information sharing, impeding MARL performance gains, particularly in time-critical applications like autonomous driving. In this work, we propose a Value-of-Information aware Low-latency Communication (VIL2C) scheme that proactively adjusts the latency distribution to mitigate its effects in MARL systems. Specifically, we define a Value of Information (VoI) metric to quantify the importance of delayed messages on the recipient agent's decision. We then design a VoI aware resource allocation method that dynamically prioritizes message transmission based on each delayed message's importance. Moreover, we propose a progressive message reception mechanism to adaptively adjust the reception duration based on received messages. We derive the optimized VoI aware resource allocation and theoretically prove the performance advantage of the proposed VIL2C scheme. Extensive experiments demonstrate that VIL2C outperforms existing approaches under various communication conditions. These gains are attributed to the low-latency transmission of high-VoI messages via resource allocation and the elimination of unnecessary waiting periods via adaptive reception duration.

PDF Details DOI

JBHI Journal 2025 Journal Article

A Novel Recognition and Classification Approach for Motor Imagery Based on Spatio-Temporal Features

Renjie Lv
Wenwen Chang
Guanghui Yan
Wenchao Nie
Lei Zheng
Bin Guo
Muhammad Tariq Sadiq

Motor imagery, as a paradigm of brain-computer interface, holds vast potential in the field of medical rehabilitation. Addressing the challenges posed by the non-stationarity and low signal-to-noise ratio of EEG signals, the effective extraction of features from motor imagery signals for accurate recognition stands as a key focus in motor imagery brain-computer interface technology. This paper proposes a motor imagery EEG signal classification model that combines functional brain networks with graph convolutional networks. First, functional brain networks are constructed using different brain functional connectivity metrics, and graph theory features are calculated to deeply analyze the characteristics of brain networks under different motor tasks. Then, the constructed functional brain networks are combined with graph convolutional networks for the classification and recognition of motor imagery tasks. The analysis based on brain functional connectivity reveals that the functional connectivity strength during the both-fists task is significantly higher than that of other motor imagery tasks, and the functional connectivity strength during actual movement is generally superior to that of motor imagery tasks. In experiments conducted on the Physionet public dataset, the proposed model achieved a classification accuracy of 88.39% under multi-subject conditions, significantly outperforming traditional methods. Under single-subject conditions, the model effectively addressed the issue of individual variability, achieving an average classification accuracy of 99.31%. These results indicate that the proposed model not only exhibits excellent performance in the classification of motor imagery tasks but also provides new insights into the functional connectivity characteristics of different motor tasks and their corresponding brain regions.

Details DOI

IJCAI Conference 2025 Conference Paper

ActiveHAI: Active Collection Based Human-AI Diagnosis with Limited Expert Predictions

Xuehan Zhao
Jiaqi Liu
Xin Zhang
Zhiwen Yu
Bin Guo

Recent studies indicate that human-AI collaboration performs better than either alone, particularly in medical diagnosis. Beyond collaboration methods that focus on assigning tasks to humans or AI, like deferral, combining human and AI decisions with their confidence scores is emerging as a promising strategy. Due to high cognitive load, doctors often struggle to provide confidence assessments, necessitating explicit human uncertainty evaluation through a limited number of additional expert predictions. There are two challenges: (1) how to actively collect limited yet representative expert predictions, and (2) how to accurately evaluate human uncertainty with limited expert predictions. To address these challenges, we propose ActiveHAI, an active human-AI diagnosis method that reduces expert costs through a median-window sampling strategy that actively selects representative samples near the estimated median, and evaluates expert confidence through an evaluator module that integrates sample features and expert predictions, converting them into probability distributions. Experiments on three real-world datasets show that ActiveHAI surpasses doctors and other human-AI methods by 16.3% and 3.6% in accuracy, respectively. Furthermore, ActiveHAI reaches 97.2% relative accuracy, even with just eight expert predictions per class.

PDF Details DOI

TIST Journal 2025 Journal Article

Balancing Cooperation and Competition: Selfish Worker Coalition Formation in Spatial Crowdsourcing

Liang Wang
Shan Su
Rongchang Cheng
Dingqi Yang
Lianbo Ma
Fei Xiong
Bin Guo
Zhiwen Yu

Spatial Crowdsourcing (SC), which outsources location-dependent tasks to workers for physical completion, is gaining popularity. Recently, more complex tasks have emerged that require a group of workers collaborating in a coalition. Several pioneering studies have examined this issue using the server assigned tasks mode from an overall perspective, such as maximizing the total benefits of all workers. Unfortunately, maximizing the overall benefit does not necessarily align with maximizing individual benefits. In practice, crowd workers are often self-interested and autonomous, making decisions based on their personal perspectives. In this article, under the worker selected tasks mode, we investigate an important problem: the Selfish Workers Coalition Formation (SWCF) problem in SC. Here, selfish workers autonomously form coalitions to accomplish tasks to maximize their individual benefits. Achieving a stable coalition formation for the SWCF problem requires balancing cooperation and competition. First, we transform the SWCF problem into a hedonic coalition formation game using a devised exploited skills-based reward distribution model. Subsequently, we propose a distributed algorithm, HCFTA, and prove its Nash stability and performance bounds. Additionally, to enhance coalition formation efficiency, we propose a Markov blanket coloring parallel optimization algorithm, MCPHCF. Extensive experiments demonstrate the superiority of the proposed methods on both synthetic and real-world datasets.

Details DOI

YNIMG Journal 2025 Journal Article

Brain development during the lifespan of cynomolgus monkeys

Zhiqiang Tan
Binbin Nie
Huanhua Wu
Bang Li
Jingjie Shang
Tianhao Zhang
Zeyu Xiao
Chenchen Dong

[18F]FDG PET-MRI data from 228 healthy cynomolgus monkeys spanning the age range of 0.5-29.5 years to construct an age-specific multimodal image brain template toolset tailored to cynomolgus monkeys. Their brain volume and glucose metabolism were quantitatively analyzed by utilizing an individualized spatial segmentation algorithm. Our findings encapsulated the growth and development trends, sex differences, and asymmetrical variations in brain volume and glucose metabolism in cynomolgus monkeys, and analyzed the correlation between the brain volume and glucose metabolism. This endeavor enhances our capacity to leverage the cynomolgus monkey model in neuroscience research by providing a valuable resource for researchers. The age-specific brain template toolset and associated data offer a robust foundation for future investigations, facilitating a nuanced understanding of brain development in this primate species and, consequently, informing and advancing neuroscience research employing cynomolgus monkeys.

Details DOI

AAAI Conference 2025 Conference Paper

CollageNoter: Real-Time and Adaptive Collage Layout Design for Screenshot-Based E-Note-Taking

Qiuyun Zhang
Bin Guo
Lina Yao
Xiaotian Qiao
Ying Zhang
Zhiwen Yu

To enhance the processing of complex multi-modal documents (e.g., e-books and long web pages), an efficient approach is for users to take digital screenshots of key parts and reorganize them into a new collage E-Note. Existing methods for assisting collage layout design primarily employ a semantic relevance-first strategy, arranging related content together. Though capable, this strategy cannot ensure the visual readability of screenshots and may conflict with natural human reading patterns. In this paper, we introduce CollageNoter for real-time collage layout design that adapts to various devices (e.g., laptops, tablets, and phones), offering users visually and cognitively well-organized screenshot-based E-Notes. Specifically, we construct a novel two-stage pipeline for collage design, including 1) readability-first layout generation and 2) cognitive-driven layout adjustment. In addition, to achieve real-time response and adaptive model training, we propose a cascade transformer-based layout generator named CollageFormer and a size-aware collage layout builder for automatic dataset construction. Extensive experimental results have confirmed the effectiveness of our CollageNoter.

PDF Details DOI

IJCAI Conference 2025 Conference Paper

Tree-of-AdEditor: Heuristic Tree Reasoning for Automated Video Advertisement Editing with Large Language Model

Yuqi Zhang
Bin Guo
Nuo Li
Ying Zhang
Shijie Wang
Zhiwen Yu
Qing Li

Video advertising has become a popular marketing strategy on e-commerce platforms, requiring high-level semantic reasoning such as selling point discovery and narrative organization. Previous rule-based methods struggle with these complex tasks, and learning-based approaches demand large datasets and high training costs. Recently, Large Language Models have opened incredible opportunities for advancing intelligent video advertisement editing. However, Input-output (IO) prompting and Chain-of-Thought (CoT) struggle to adapt to the nonlinear thinking hierarchy of video editing, where editors iteratively select shots or revert them to explore potential editing solutions. While Tree-of-Thought (ToT) offers a conceptual structure that mirrors this hierarchy, it falls short in aligning with effective video advertising strategies and lacks robust fact-checking mechanisms. To address these issues, we propose a novel framework, Tree-of-AdEditor (ToAE), which constructs a reasoning tree to mimic human editors, and incorporates domain-specific theories and heuristic fact-checking to identify optimal editing solutions. Specifically, motivated by effective advertisement principles, we develop a "local-global" mechanism to guide the LLM in both shot-level and sequence-level decision-making. We introduce a visual incoherence pruning module to provide external heuristic fact-checking, ensuring visual attractiveness and reducing computation costs. Quantitative experiments and expert evaluation demonstrate the superiority of our method compared to baselines.

PDF Details DOI

NeurIPS Conference 2024 Conference Paper

HAWK: Learning to Understand Open-World Video Anomalies

Jiaqi Tang
Hao Lu
Ruizheng Wu
Xiaogang Xu
Ke Ma
Cheng Fang
Bin Guo
Jiangbo Lu

Video Anomaly Detection (VAD) systems can autonomously monitor and identify disturbances, reducing the need for manual labor and associated costs. However, current VAD systems are often limited by their superficial semantic understanding of scenes and minimal user interaction. Additionally, the prevalent data scarcity in existing datasets restricts their applicability in open-world scenarios. In this paper, we introduce HAWK, a novel framework that leverages interactive large Visual Language Models (VLM) to interpret video anomalies precisely. Recognizing the difference in motion information between abnormal and normal videos, HAWK explicitly integrates motion modality to enhance anomaly identification. To reinforce motion attention, we construct an auxiliary consistency loss within the motion and video space, guiding the video branch to focus on the motion modality. Moreover, to improve the interpretation of motion-to-language, we establish a clear supervisory relationship between motion and its linguistic representation. Furthermore, we have annotated over 8,000 anomaly videos with language descriptions, enabling effective training across diverse open-world scenarios, and also created 8,000 question-answering pairs for users' open-world questions. The final results demonstrate that HAWK achieves SOTA performance, surpassing existing baselines in both video description generation and question-answering. Our codes/dataset/demo will be released at https://github.com/jqtangust/hawk.

PDF Details DOI

TIST Journal 2024 Journal Article

Learning Cross-modality Interaction for Robust Depth Perception of Autonomous Driving

Yunji Liang
Nengzhen Chen
Zhiwen Yu
Lei Tang
Hongkai Yu
Bin Guo
Daniel Dajun Zeng

As one of the fundamental tasks of autonomous driving, depth perception aims to perceive physical objects in three dimensions and to judge their distances away from the ego vehicle. Although great efforts have been made for depth perception, LiDAR-based and camera-based solutions have limitations with low accuracy and poor robustness for noise input. With the integration of monocular cameras and LiDAR sensors in autonomous vehicles, in this article, we introduce a two-stream architecture to learn the modality interaction representation under the guidance of an image reconstruction task to compensate for the deficiencies of each modality in a parallel manner. Specifically, in the two-stream architecture, the multi-scale cross-modality interactions are preserved via a cascading interaction network under the guidance of the reconstruction task. Next, the shared representation of modality interaction is integrated to infer the dense depth map due to the complementarity and heterogeneity of the two modalities. We evaluated the proposed solution on the KITTI dataset and CALAR synthetic dataset. Our experimental results show that learning the coupled interaction of modalities under the guidance of an auxiliary task can lead to significant performance improvements. Furthermore, our approach is competitive against the state-of-the-art models and robust against the noisy input. The source code is available at https://github.com/tonyFengye/Code/tree/master.

Details DOI

AAMAS Conference 2023 Conference Paper

Learning to Self-Reconfigure for Freeform Modular Robots via Altruism Multi-Agent Reinforcement Learning

Lei Wu
Bin Guo
Qiuyun Zhang
Zhuo Sun
Jieyi Zhang
Zhiwen Yu

Modular robots can change between different configurations to adapt to complex and dynamic environments. Therefore, performing accurate and efficient changes to a modular robot system, known as the self-reconfiguration problem, is essential. Existing reconfiguration algorithms are based on discrete motion primitives. However, freeform modular robots are connected without alignment and their motion space is continuous, making existing reconfiguration methods infeasible. In this work, we design a parallel distributed self-reconfiguration algorithm based on multi-agent reinforcement learning for freeform modular robots. We introduce a collaboration mechanism into the reinforcement learning to avoid conflicts in continuous action spaces. Simulations show that our algorithm reduces conflicts and improves effectiveness compared to the baselines.

PDF

IJCAI Conference 2023 Conference Paper

Learning to Self-Reconfigure for Freeform Modular Robots via Altruism Proximal Policy Optimization

Lei Wu
Bin Guo
Qiuyun Zhang
Zhuo Sun
Jieyi Zhang
Zhiwen Yu

The advantages of modular robot systems stem from their ability to change between different configurations, enabling them to adapt to complex and dynamic real-world environments. How to change the configuration of a modular robot system accurately and efficiently, i.e., the self-reconfiguration problem, is therefore essential. Existing reconfiguration algorithms are based on discrete motion primitives and are suitable for lattice-type modular robots. The modules of freeform modular robots are connected without alignment, and the motion space is continuous, which renders existing reconfiguration methods infeasible. In this paper, we design a parallel distributed self-reconfiguration algorithm for freeform modular robots based on multi-agent reinforcement learning to realize the automatic design of conflict-free reconfiguration controllers in continuous action spaces. To avoid conflicts, we incorporate a collaborative mechanism into reinforcement learning. Furthermore, we design the distributed termination criteria to achieve timely termination in the presence of limited communication and local observability. When compared to the baselines, simulations show that the proposed method improves efficiency and congruence, and module movement demonstrates altruism.

PDF Details DOI

TIST Journal 2023 Journal Article

Modeling Within-Basket Auxiliary Item Recommendation with Matchability and Ubiquity

En Xu
Zhiwen Yu
Zhuo Sun
Bin Guo
Lina Yao

Within-basket recommendation is to recommend suitable items for the current basket with some already known items. The within-basket auxiliary item recommendation (WBAIR) is to recommend auxiliary items based on the primary items in the basket. Such a task exists in many real-life scenarios. Unlike the associations between items that can be transmitted in both directions, primary and auxiliary relationships are unidirectional. Then, the suitable matching patterns between primary and auxiliary items cannot be explored by traditional directionless methods. Therefore, we design the Matc4Rec algorithm to integrate the primary and auxiliary factors, and finally recommend items that not only match the interests of users but also satisfy the primary and auxiliary relationships between items. Specifically, we capture the pattern from three aspects: matchability within-basket, matchability between baskets, and ubiquity. By exploiting this pattern, the designed algorithm not only achieves good results on real-world datasets but also improves the interpretability of recommendations. As a result, we can know which commodities are suitable as auxiliary items. The experiment results demonstrate that our algorithm can also alleviate the cold start problem.

Details DOI