Author name cluster

Ao Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

10 papers

2 author rows

AAAI Conference 2026 Conference Paper

AgentMental: An Interactive Multi-Agent Framework for Explainable and Adaptive Mental Health Assessment

Jinpeng Hu
Ao Wang
Qianqian Xie
Zhuo Li
Hui Ma
Dan Guo

Mental health assessment is crucial for early intervention and effective treatment, yet traditional clinician-based approaches are limited by the shortage of qualified professionals. Recent advances in artificial intelligence have sparked growing interest in automated psychological assessment, but most existing approaches are constrained by their reliance on static text analysis, limiting their ability to capture deeper and more informative insights that emerge through dynamic interaction and iterative questioning. Therefore, in this paper, we propose a multi-agent framework for mental health evaluation that simulates clinical doctor-patient dialogues, with specialized agents assigned to questioning, adequacy evaluation, scoring, and updating. In detail, we introduce an adaptive questioning mechanism in which an evaluation agent assesses the adequacy of user responses to determine the necessity of generating targeted follow-up queries to address ambiguity and missing information. Additionally, we employ a tree-structured memory in which the root node encodes the user's basic information, while child nodes (e.g., topic and statement) organize key information according to distinct symptom categories and interaction turns. This memory is dynamically updated throughout the interaction to reduce redundant questioning and enhance the information extraction and contextual tracking capabilities. Experimental results on the DAIC-WOZ dataset illustrate the effectiveness of our proposed method, which achieves better performance than existing approaches. Our code is released at https://github.com/MindIntLab-HFUT/AgentMental.
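
As a rough illustration of the tree-structured memory described in this abstract, the Python sketch below keeps the user's basic information at the root, one topic node per symptom category, and one statement node per interaction turn. The names used here (MemoryNode, TreeMemory, add_statement, covered_topics) are hypothetical and are not taken from the AgentMental repository; this is a sketch under those assumptions, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MemoryNode:
    """One node of the tree: 'root', 'topic', or 'statement' (hypothetical layout)."""
    kind: str
    content: str
    turn: Optional[int] = None
    children: List["MemoryNode"] = field(default_factory=list)


class TreeMemory:
    """Root holds basic user info; topics hang off the root; statements hang off topics."""

    def __init__(self, basic_info: str) -> None:
        self.root = MemoryNode(kind="root", content=basic_info)

    def _topic(self, name: str) -> MemoryNode:
        # Reuse the topic node if it exists, otherwise create it under the root.
        for child in self.root.children:
            if child.content == name:
                return child
        node = MemoryNode(kind="topic", content=name)
        self.root.children.append(node)
        return node

    def add_statement(self, topic: str, statement: str, turn: int) -> None:
        # Record what the user said about a symptom category at a given turn.
        self._topic(topic).children.append(
            MemoryNode(kind="statement", content=statement, turn=turn)
        )

    def covered_topics(self) -> List[str]:
        # Topics that already have at least one statement; a questioning agent
        # could consult this list to avoid asking about them again.
        return [t.content for t in self.root.children if t.children]


memory = TreeMemory("age 30, student")
memory.add_statement("sleep", "wakes up at 3am most nights", turn=1)
memory.add_statement("appetite", "eats less than usual", turn=2)
print(memory.covered_topics())
```

Keeping statements grouped under topics makes the "has this category been covered yet?" check a simple lookup, which is one plausible way such a memory could reduce redundant questioning.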

IROS Conference 2025 Conference Paper

High-Precision Parallel Manipulation of Multi-Particle System Using Optoelectronic Tweezers

Shunxiao Huang
Jiawei Zhao
Chunyuan Gan
Zijin Zeng
Hongyi Xiong
Jingwen Ye
Wenyan Niu
Ao Wang

This paper presents a multi-particle parallel manipulation optoelectronic tweezers system integrated with computer vision technology, enabling the parallel and precise manipulation of dozens of particles. This system significantly enhances manipulation efficiency while maintaining high precision. By monitoring particle motion and light patterns in real time, the system can rapidly adjust and optimize its manipulation strategy, thereby improving the stability and reliability of multi-particle synchronization in complex environments. Extensive experimental results demonstrate the system’s outstanding performance. For instance, it can quickly arrange complex patterns and letter sequences, facilitate the coordinated assembly of organoids from particle groups, and efficiently perform the precise separation and arrangement of mixed particles. The core advantage of this system lies in its high parallelism and flexibility, enabling it to handle large-scale synchronous manipulation tasks with exceptional operating accuracy. With continuous technological advancements and the broadening of application scenarios, this system is expected to have a profound impact in fields such as cell sorting, micro-device assembly, and organoid construction, providing robust support for research and technological development in these areas.

IROS Conference 2025 Conference Paper

Inducing Desired Equilibria in Constrained Noncooperative Games via Nudging

Ao Wang
Min Meng 0003
Xiuxian Li

This paper explores nudge schemes for a central regulator aimed at incentivizing players in constrained noncooperative games to reach desired equilibria. Unlike traditional intervention mechanisms where players update their actions by blindly following signals from the regulator, the nudge mechanism integrates players’ rational judgment by incorporating trust variables into players’ models. This implies that players update their actions by evaluating the signals from the regulator and their own expectations of the incentive mechanisms. If the regulator’s signals significantly deviate from the players’ expectations, players decrease their trust in the regulator and rely more on their own expectations when updating their actions. Conversely, if the signals align closely with their expectations, players tend to increase their trust in the regulator and place greater emphasis on the regulator’s signals. It should be noted that each player does not have access to all other players’ actions, which means that it updates its action in a distributed manner by only observing the actions of its neighbors through a directed balanced graph. Furthermore, static and dynamic nudges are designed based on different information available to the regulator, and both are extended to an online case in which the desired equilibrium is time-varying. Finally, an application to robot formation control is shown to validate the obtained results.

IROS Conference 2025 Conference Paper

Optoelectronic Navigation-Based Microtruck: For Efficient Cargo Loading, Transport, and Unloading

Ao Wang
Wenyan Niu
Caiding Ni
Shunxiao Huang
Yingjian Guo
Lin Feng 0002

This study proposes an optoelectronic navigation strategy leveraging Ag-SiO₂ microspheres as a “microtruck” to overcome the limitations of traditional optoelectronic tweezers (OET) in manipulating negative dielectrophoresis (nDEP) particles. By dynamically adjusting electric field frequency and optical parameters, we regulate particle-induced dielectrophoretic forces (PiDEP) to achieve efficient adsorption, high-speed transport, and site-specific unloading of nDEP-responsive cargo. Experimental results demonstrate a sevenfold enhancement in manipulation velocity compared to conventional direct optical methods, along with the capability for simultaneous multi-particle transport. In addition, we utilized finite element simulations to analyze the optimal electric field frequency and optical parameters for the microtruck’s loading and unloading processes. A systematic analysis of critical velocities and failure modes under varying cargo loads further validates the robustness of this approach. Demonstrated within a labyrinthine microenvironment, this strategy enables programmable navigation, sequential cargo handling, and micrometer positional accuracy. This study provides an efficient solution for biomedical applications, including precise single-cell manipulation and targeted drug delivery.

NeurIPS Conference 2025 Conference Paper

PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation

Ao Wang
Hui Chen
Jianchao Tan
Kefeng Zhang
Xunliang Cai
Zijia Lin
Jungong Han
Guiguang Ding

Recently, large vision-language models (LVLMs) have rapidly gained popularity for their strong generation and reasoning capabilities given diverse multimodal inputs. However, these models incur significant computational and memory overhead during inference, which greatly hinders efficient deployment in practical scenarios. The extensive key-value (KV) cache, necessitated by the lengthy input and output sequences, notably contributes to the high inference cost. Based on this, recent works have investigated ways to reduce the KV cache size for higher efficiency. Although effective, they generally overlook the distinct importance distributions of KV vectors across layers and maintain the same cache size for each layer during next-token prediction. This results in significant loss of contextual information for certain layers, leading to a notable performance decline. To address this, we present PrefixKV. It reframes the challenge of determining KV cache sizes for all layers into the task of searching for the optimal global prefix configuration. With an adaptive layer-wise KV retention recipe based on binary search, the maximum contextual information can thus be preserved in each layer, facilitating the generation. Extensive experiments demonstrate that our method achieves state-of-the-art performance compared with others. It exhibits superior inference efficiency and generation quality trade-offs, showing promising potential for practical applications. Code is available at https://github.com/THU-MIG/PrefixKV.
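
To make the idea of a binary search over a global retention setting more concrete, here is a minimal Python sketch. It assumes, purely for illustration, that each layer exposes non-negative importance scores for its KV entries, that a layer keeps the shortest importance-sorted prefix whose share of that layer's total importance reaches a shared ratio, and that the total budget is at least one entry per layer. The function names (prefix_sizes, search_ratio) and the toy scores are hypothetical and are not taken from the PrefixKV code.

```python
from itertools import accumulate


def prefix_sizes(layer_scores, ratio):
    """Per layer: length of the shortest importance-sorted prefix whose
    cumulative importance reaches `ratio` of that layer's total."""
    sizes = []
    for scores in layer_scores:
        ranked = sorted(scores, reverse=True)
        target = ratio * sum(ranked)
        size = len(ranked)  # fall back to keeping everything
        for count, cumulative in enumerate(accumulate(ranked), start=1):
            if cumulative >= target:
                size = count
                break
        sizes.append(size)
    return sizes


def search_ratio(layer_scores, budget, iters=30):
    """Binary-search the largest shared ratio whose total kept entries fit `budget`.
    Assumes budget >= number of layers, so ratio 0 is always feasible."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(prefix_sizes(layer_scores, mid)) <= budget:
            lo = mid  # feasible: try to retain more information
        else:
            hi = mid  # over budget: retain less
    return lo, prefix_sizes(layer_scores, lo)


# Toy example: one layer with concentrated importance, one with flat importance.
layers = [[0.7, 0.1, 0.1, 0.1], [0.25, 0.25, 0.25, 0.25]]
ratio, sizes = search_ratio(layers, budget=4)
print(ratio, sizes)
```

In this toy case a uniform split would give each layer two entries, whereas the shared-ratio search gives the concentrated layer one entry and the flat layer three, which is the kind of layer-wise difference the abstract argues uniform cache sizes overlook.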

AAAI Conference 2025 Conference Paper

Promptable Anomaly Segmentation with SAM Through Self-Perception Tuning

Hui-Yue Yang
Hui Chen
Ao Wang
Kai Chen
Zijia Lin
Yongliang Tang
Pengcheng Gao
Yuming Quan

Segment Anything Model (SAM) has made great progress in anomaly segmentation tasks due to its impressive generalization ability. However, existing methods that directly apply SAM through prompting often overlook the domain shift issue, where SAM performs well on natural images but struggles in industrial scenarios. Parameter-Efficient Fine-Tuning (PEFT) offers a promising solution, but it may yield suboptimal performance by not adequately addressing the perception challenges during adaptation to anomaly images. In this paper, we propose a novel Self-Perception Tuning (SPT) method, aiming to enhance SAM's perception capability for anomaly segmentation. The SPT method incorporates a self-drafting tuning strategy, which generates an initial coarse draft of the anomaly mask, followed by a refinement process. Additionally, a visual-relation-aware adapter is introduced to improve the perception of discriminative relational information for mask generation. Extensive experimental results on several benchmark datasets demonstrate that our SPT method can significantly outperform baseline methods, validating its effectiveness.

IROS Conference 2025 Conference Paper

SRCNet: Super-resolution Networks for Capsule Endoscope Robots

Menglu Tan
Guangdong Zhan
Zijin Zeng
Ao Wang
Lin Feng 0002

In recent years, capsule robots have gained wide acceptance among doctors and patients for the examination of gastrointestinal diseases due to their non-invasive, safe, and painless advantages. However, the resolution of images captured by capsule robots is limited by space and power constraints, which hinders doctors' ability to accurately assess patients' stomach conditions and to control the capsule robot in real time. This paper proposes the design of two super-resolution networks for capsule robot videos. The first network, EndoVSR, is a high-performance offline video super-resolution network based on a generative adversarial network. It is designed to enhance the resolution of captured videos during offline processing. The second network, Bi-RUN, is a real-time video super-resolution network based on recurrent neural networks. It is designed to enhance the resolution of videos in real time, enabling doctors to have a clearer view of the stomach condition during the examination. Extensive training and verification of these networks have been conducted using different datasets, and both networks achieved leading results on all performance indicators. Furthermore, simulation experiments were carried out on pig stomachs in vitro to further validate the performance of the proposed networks in practical applications.

ICRA Conference 2024 Conference Paper

Dynamic Adaptive Imaging System on Optoelectronic Tweezers Platform

Ao Wang
Chunyuan Gan
Haocheng Han
Hongyi Xiong
Jiawei Zhao
Chutian Wang
Lin Feng 0002

Optoelectronic tweezers (OET) has shown great promise in various applications, especially in the precise manipulation of microparticles and microorganisms on a micron and nanometer scale. This technology significantly enhances the efficiency of single-cell sorting and the development of antibody-based drugs. However, conventional OET platforms are limited by issues such as low autofocusing accuracy, restricted imaging field of view, and uneven illumination. To overcome these limitations, we have innovatively developed a dynamic adaptive imaging system. By incorporating peak-finding and in situ Gaussian blur compensation algorithms, we achieved rapid automatic focusing and illumination shadow compensation across an expanded field of view. At the same time, the system can also dynamically adjust compensation parameters under different lighting conditions. Our system has successfully completed comprehensive scanning of the optoelectronic tweezers chip, achieving a 60% reduction in autofocus time and a 15.8% improvement in lighting uniformity. Moreover, this imaging system demonstrates robust versatility and can serve as a reference for other optical systems.

NeurIPS Conference 2024 Conference Paper

YOLOv10: Real-Time End-to-End Object Detection

Ao Wang
Hui Chen
Lihao Liu
Kai Chen
Zijia Lin
Jungong Han
Guiguang Ding

Over the past years, YOLOs have emerged as the predominant paradigm in the field of real-time object detection owing to their effective balance between computational cost and detection performance. Researchers have explored the architectural designs, optimization objectives, data augmentation strategies, and others for YOLOs, achieving notable progress. However, the reliance on the non-maximum suppression (NMS) for post-processing hampers the end-to-end deployment of YOLOs and adversely impacts the inference latency. Besides, the design of various components in YOLOs lacks the comprehensive and thorough inspection, resulting in noticeable computational redundancy and limiting the model's capability. It renders the suboptimal efficiency, along with considerable potential for performance improvements. In this work, we aim to further advance the performance-efficiency boundary of YOLOs from both the post-processing and the model architecture. To this end, we first present the consistent dual assignments for NMS-free training of YOLOs, which brings the competitive performance and low inference latency simultaneously. Moreover, we introduce the holistic efficiency-accuracy driven model design strategy for YOLOs. We comprehensively optimize various components of YOLOs from both the efficiency and accuracy perspectives, which greatly reduces the computational overhead and enhances the capability. The outcome of our effort is a new generation of YOLO series for real-time end-to-end object detection, dubbed YOLOv10. Extensive experiments show that YOLOv10 achieves the state-of-the-art performance and efficiency across various model scales. For example, our YOLOv10-S is 1.8× faster than RT-DETR-R18 under the similar AP on COCO, meanwhile enjoying 2.8× smaller number of parameters and FLOPs. Compared with YOLOv9-C, YOLOv10-B has 46% less latency and 25% fewer parameters for the same performance. Code and models are available at https://github.com/THU-MIG/yolov10.
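
For readers unfamiliar with the post-processing step this abstract aims to remove, the following Python sketch shows classic greedy non-maximum suppression on axis-aligned boxes. It is a generic textbook version written for illustration, with boxes assumed to be (x1, y1, x2, y2) tuples and a hypothetical default threshold of 0.5; it is not code from the YOLOv10 repository.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))
```

Because each candidate is compared against the boxes already kept, this step runs after the network's forward pass and its cost depends on the number of candidate boxes, which is why it adds to inference latency and complicates end-to-end deployment.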

IROS Conference 2023 Conference Paper

Parallel Cell Array Patterning and Target Cell Lysis on an Optoelectronic Micro-Well Device

Chunyuan Gan
Hongyi Xiong
Jiawei Zhao
Ao Wang
Chutian Wang
Shuzhang Liang
Jiaying Zhang
Lin Feng 0002

This work presents a novel electrical method, implemented in the form of a microfluidic device, for cell arraying and target cell lysis. The microfluidic device contains a micro-well array on the photoconductive layer based on the optoelectronic tweezers (OET) method, where parallel cell manipulation is performed. As cell suspension flows over the micro-wells, cells can be actively captured in the micro-wells by light-induced dielectrophoresis (DEP) forces, forming the designed pattern array in less than 120 s. The single-cell capture rate is over 83% in the patterned cell array, and about 94% of micro-wells are occupied by cells. Then, the target cell in the specific micro-well is illuminated and lysed by electroporation in 5 s. The micro-well barriers and DEP forces block the influence of the flow, and a relatively closed space is critical to preserve the cell lysates. Through experiments, light-induced DEP force cell capture and target cell electroporation can be modulated by changing the light patterns and the applied signal. This device, based on OET and dynamic electroporation, supports rapid cell capture and target lysis at the single-cell level and can enable single-cell-based studies, such as molecular diagnostics and disease detection.
