Author name cluster

Ziqi Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

13 papers

2 author rows

AAAI Conference 2026 Conference Paper

LoRA in LoRA: Towards Parameter-Efficient Architecture Expansion for Continual Visual Instruction Tuning

Chang Che
Ziqi Wang
Pengwan Yang
Cheems Wang
Hui Ma
Zenglin Shi

Continual Visual Instruction Tuning (CVIT) enables Multimodal Large Language Models (MLLMs) to incrementally learn new tasks over time. However, this process is challenged by catastrophic forgetting, where performance on previously learned tasks deteriorates as the model adapts to new ones. A common approach to mitigate forgetting is architecture expansion, which introduces task-specific modules to prevent interference. Yet, existing methods often expand entire layers for each task, leading to significant parameter overhead and poor scalability. To overcome these issues, we introduce LoRA in LoRA (LiLoRA), a highly efficient architecture expansion method tailored for CVIT in MLLMs. LiLoRA shares the LoRA matrix A across tasks to reduce redundancy, applies an additional low-rank decomposition to matrix B to minimize task-specific parameters, and incorporates a cosine-regularized stability loss to preserve consistency in shared representations over time. Extensive experiments on a diverse CVIT benchmark show that LiLoRA consistently achieves superior performance in sequential task learning while significantly improving parameter efficiency compared to existing approaches.

PDF Details DOI

NeurIPS Conference 2025 Conference Paper

Learning Intractable Multimodal Policies with Reparameterization and Diversity Regularization

Ziqi Wang
Jiashun Liu
Ling Pan

Traditional continuous deep reinforcement learning (RL) algorithms employ deterministic or unimodal Gaussian actors, which cannot express complex multimodal decision distributions. This limitation can hinder their performance in diversity-critical scenarios. There have been some attempts to design online multimodal RL algorithms based on diffusion or amortized actors. However, these actors are intractable, making existing methods struggle with balancing performance, decision diversity, and efficiency simultaneously. To overcome this challenge, we first reformulate existing intractable multimodal actors within a unified framework, and prove that they can be directly optimized by policy gradient via reparameterization. Then, we propose a distance-based diversity regularization that does not explicitly require decision probabilities. We identify two diversity-critical domains, namely multi-goal achieving and generative RL, to demonstrate the advantages of multimodal policies and our method, particularly in terms of few-shot robustness. In conventional MuJoCo benchmarks, our algorithm also shows competitive performance. Moreover, our experiments highlight that the amortized actor is a promising policy model class with strong multimodal expressivity and high performance. Our code is available at https://github.com/PneuC/DrAC

PDF Details

IROS Conference 2025 Conference Paper

Model-Free Catheter Delivery Strategy for Robotic Transcatheter Tricuspid Valve Replacement

Haichuan Lin
Yiping Xie
Ziqi Wang
Dong Chen
Longyue Tan
Weizhao Wang
Yuen Chiu Ng
Xilong Hou 0001

Transcatheter tricuspid valve replacement (TTVR) has emerged as a promising minimally invasive procedure for treating severe tricuspid regurgitation (TR). However, accurate catheter delivery remains a significant challenge, primarily due to the reliance on 2D vision feedback, complex catheter kinematics, and camera-to-robot pose calibration, which are difficult to generalize across patients. To address these issues, this paper presents a model-free robotic catheter delivery strategy for TTVR using Data-Enabled Predictive Control (DeePC). This approach leverages data-driven control to optimize catheter positioning without the need for prior knowledge of the system’s dynamics, eliminating the need for complex kinematic models or camera calibration. The proposed method incorporates environmental constraints to ensure the safety of the procedure, delivering the catheter to the desired location with high accuracy across varying catheters and camera poses. Experimental results demonstrate the effectiveness and versatility of the approach, suggesting its potential for broader applications in robotic-assisted surgeries. This work presents a new perspective for vision-based robotic TTVR, as well as other clinical interventions involving robotic catheter control.

Details

IROS Conference 2025 Conference Paper

Physical Human-Robot Collaboration-Assisted Acetabular Preparation for Total Hip Replacement Surgery

Ziqi Wang
Tiancheng Li 0003
Marc G. Carmichael
Shoudong Huang

When performing total hip replacement (THR) surgery, high-quality preparation of the acetabulum is critical as it contributes to the patient’s recovery speed and the consistency of bone ingrowth. Conventionally, surgeons prepare the acetabulum manually by reaming it with a handheld electric drill and a reamer. This not only increases the surgeon’s workload but, more importantly, makes it difficult to control the reaming depth and direction accurately. Utilizing an admittance-controlled (AC) collaborative robot (cobot) to enable physical human-robot collaboration (pHRC) offers a promising solution. For primitive AC, a compromise must be made between compliance and task accuracy. In this paper, we present a novel variable admittance control (VAC) design that considers the reactive force of bone while ensuring the passivity and stability of the system during pHRC-assisted acetabular preparation. The qualitative results show that users preferred VAC over the conventional manual reaming method. Compared to other pHRC controls, quantitative results on user energy consumption, reaming error, and smoothness showed the proposed VAC can achieve a balance between physical workload and acetabular quality. Compared to manual reaming, VAC reduced the reaming error by 67.47% and improved the final acetabulum surface smoothness by 18.30%.

Details

TMLR Journal 2025 Journal Article

Understanding Emergent In-Context Learning from a Kernel Regression Perspective

Chi Han
Ziqi Wang
Han Zhao
Heng Ji

Large language models (LLMs) have initiated a paradigm shift in transfer learning. In contrast to the classic pretraining-then-finetuning procedure, in order to use LLMs for downstream prediction tasks, one only needs to provide a few demonstrations, known as in-context examples, without adding more or updating existing model parameters. This in-context learning (ICL) capability of LLMs is intriguing, and it is not yet fully understood how pretrained LLMs acquire such capabilities. In this paper, we investigate the reason why a transformer-based language model can accomplish in-context learning after pre-training on a general language corpus by proposing a kernel-regression perspective of understanding LLMs' ICL behaviors when faced with in-context examples. More concretely, we first prove that Bayesian inference on in-context prompts can be asymptotically understood as kernel regression $\hat y = \sum_i y_i K(x, x_i)/\sum_i K(x, x_i)$ as the number of in-context demonstrations grows. Then, we empirically investigate the in-context behaviors of language models. We find that during ICL, the attention and hidden features in LLMs match the behaviors of a kernel regression. Finally, our theory provides insights into multiple phenomena observed in the ICL field: why retrieving demonstrative samples similar to test samples can help, why ICL performance is sensitive to the output formats, and why ICL accuracy benefits from selecting in-distribution and representative samples.

PDF Details

IROS Conference 2025 Conference Paper

Using Upper Limb Carrying Exoskeleton with Dual-Model Torque Control Strategy to Reduce Load Impact

Daming Liu
Ye Li
Junchen Liu
Ziqi Wang
Jie Zhao 0003
Yanhe Zhu

Exoskeleton technology holds significant promise within the human-centric paradigm of Industry 5.0 for mitigating work-related musculoskeletal disorders (WMSDs). However, existing systems often struggle with mismatched assistive torque and inefficient human-machine collaboration under dynamic loading conditions, largely due to insufficient motion intent recognition accuracy. This study proposes a dual-model-based multimodal fusion control strategy that integrates a bidirectional LSTM neural network (Bi-LSTM) with a transformer-based multi-task learning model (MTL) to enable real-time torque compensation and accurate prediction of dynamic load mass under varying conditions. The team developed a lightweight elbow joint exoskeleton prototype, leveraging multi-modal information to enhance assistive torque prediction accuracy. Experimental results show an 83.7% reduction in agonist muscle activation under a 3.5 kg load compared to conditions without the exoskeleton, underscoring its potential for industrial material handling scenarios.

Details

EAAI Journal 2024 Journal Article

CADLRA: A multi-charge prediction method based on the Criminal Act-Driven Law Retrieval Augmentation

Jianzhou Feng
Lazhi Zhao
Haonan Qin
Yiming Xu
Ziqi Wang

Legal Artificial Intelligence (Legal AI) has garnered significant attention in both academic and industrial domains in recent years. However, most legal judgment prediction (LJP) methods concentrate on single-charge prediction tasks, ignoring the practical scenario of “one person with multiple charges”. To mitigate this limitation, we propose a multi-charge prediction method based on the Criminal Act-Driven Law Retrieval Augmentation (CADLRA), which utilizes Large Language Models (LLMs) to convert the multi-charge classification task into a dynamic multi-charge generation task, achieving enhanced prediction accuracy. To address knowledge solidification and hallucination in LLMs and align with the legal process of sentencing based on criminal acts and relevant laws, we employ contrastive learning to train a retriever to aid LLMs in charge prediction by referencing prior law articles. Finally, experiments were conducted using the public dataset from the Legal AI Challenge, demonstrating that the CADLRA method has achieved state-of-the-art results in both multi-label classification algorithms and charge prediction.

Details DOI

ICRA Conference 2024 Conference Paper

Comparison of Rating Scale and Pairwise Comparison Methods for Measuring Human Co-worker Subjective Impression of Robot during Physical Human-Robot Collaboration

Qiao Wang
Ziqi Wang
Marc G. Carmichael
Dikai Liu
Chin-Teng Lin

The Rating Scale method has been long deemed the standard for measuring subjective perceptions. However, in the field of physical human-robot collaboration (pHRC), its aptness should be put under scrutiny due to inherent challenges such as response bias, between-subject variations, and the granularity nature. Individual variances can introduce significant bias in the rating scale results. A high granularity in the scale could overwhelm participants, leading to unclear and biased responses, while a low granularity may gloss over the fine nuances of human feelings. Additionally, there’s a notable risk of receiving careless responses, which compromise data reliability. Recognizing these challenges, this paper proposes the application of Pairwise Comparison (PC) in pHRC — an alternative survey technique that emphasizes direct comparisons between items on the defined criteria. By using the NASA Task Load Index (NASA-TLX) as a template, RS and PC questionnaires are designed and used in a series of pHRC experiments. Our preliminary findings suggest that PC is more precise and robust than the rating scale method. Compared to RS, PC fosters authentic participant interests in the experiment by intuitive question design and reducing the experimental duration. Besides, the accuracy and reliability of PC are also found to be consistent regardless of the variations in our experimental procedure design.

Details

ICRA Conference 2024 Conference Paper

Exploring the Effect of Base Compliance on Physical Human-Robot Collaboration

Ziqi Wang
Marc G. Carmichael

Mobile physical human-robot collaboration (pHRC) using collaborative robots (cobots) and mobile robots has attracted much research attention. Many researchers have focused on improving the control performance to comply with human intentions. However, a problem that generally exists with mobile pHRC but often gets neglected is the impact of non-rigid components, e.g., deformable tyres, suspension systems and uneven terrain, on human interaction experience and task performance. To fill this research gap, we carried out an investigation on the above-mentioned problem by altering a cobot’s base rigidity level (also referred to as base compliance level or BCL) during pHRC experiments. We explored how the task performance is affected by base compliance as well as human operator’s experience and cobot control parameters. Measurements include the human operator’s physical effort, task velocity, and task error. From the experimental results, it is discovered that base compliance has a significant impact on task accuracy as it can easily excite the system if an inadequate control strategy is deployed. Furthermore, through ANOVA, it is discovered that the influence of base compliance can be minimized and system excitation can be avoided by sufficient human operator training and the appropriate selection of cobot’s control parameters.

Details

ICRA Conference 2024 Conference Paper

Human-Exoskeleton Locomotion Interaction Experience Transfer: Speeding up and Improving the Performance of Preference-based Optimizations of Exoskeleton Assistance During Walking

Hongwu Li
Junchen Liu
Ziqi Wang
Haotian Ju
Tianjiao Zheng
Yongsheng Gao 0002
Jie Zhao 0003
Yanhe Zhu

Preference-based optimizing methods have shown their advantages and potential in exploring individual, comfortable, and effective control strategies and assistance parameters of exoskeletons during locomotion. Research indicates that compared with naive wearers, knowledgeable wearers with abundant exoskeleton assistance experience have obvious advantages in speeding up the parameters exploration process and improving the assistant performance. However, there is no existing method that could utilize the human-exoskeleton locomotion interaction experience (HELIE) to assist naive wearers during the exploration process. In this work, we propose a novel preference-based human-exoskeleton locomotion interaction experience transfer (LIET) framework, which could speed up the exploration of human-preferred parameters and acquire more satisfying results for naive wearers via the HELIE acquired from knowledgeable wearers. In addition, based on the proposed LIET framework, we establish the mathematical expression of the HELIE transfer during exoskeleton assistance. This will promote the research that concerns utilizing HELIE for exoskeleton control parameters optimizations in the future. Finally, experiments demonstrate the proposed LIET framework could speed up the exploration process and acquire more satisfying optimized results for naive wearers.

Details

IROS Conference 2024 Conference Paper

Using Hip Assisted Running Exoskeleton with Impact Isolation Mechanism to Improve Energy Efficiency

Ziqi Wang
Junchen Liu
Hongwu Li
Qinghua Zhang
Xianglong Li
Yi Huang
Haotian Ju
Tianjiao Zheng

Research has indicated that exoskeletons can assist human movement, but due to the influence of additional weight and challenges in control strategy design, only a few exoskeletons effectively reduce the wearers’ metabolic costs during running. This paper proposes an innovative and efficient hip-assisted running exoskeleton (HARE) designed to facilitate the flexion and extension movements of the joint along the sagittal plane. In the field of structural engineering, we propose implementing an active-passive combination constant force suspension system, hereinafter referred to as CFS, to effectively mitigate the impact of inertial forces during running. The decoupled transmission mechanism allows the CFS and assist mechanisms to operate independently, ensuring the tension of the cables. The flexible structural design can reduce the locomotion limitation on human bodies and reduce the additional energy burden on the body. In control strategy designing, the joint torque-generating strategy provides personalized assistance strategies for wearers to actively optimize the control parameters. Meanwhile, the safety control strategy based on abnormal gait recognition can ensure human safety. Experiments have shown that compared to not wearing exoskeletons, this device can reduce the energy consumption of the human body by 5.33% at a speed of 9 km/h. This demonstrates its potential in human motion assistance processes.

Details

JBHI Journal 2022 Journal Article

Assessment of Parkinson’s Disease Severity From Videos Using Deep Architectures

Zhao Yin
Victor J. Geraedts
Ziqi Wang
Maria Fiorella Contarino
Hamdi Dibeklioglu
Jan van Gemert

Parkinson’s disease (PD) diagnosis is based on clinical criteria, i.e., bradykinesia, rest tremor, rigidity, etc. Assessment of the severity of PD symptoms with clinical rating scales, however, is subject to inter-rater variability. In this paper, we propose a deep learning based automatic PD diagnosis method using videos to assist the diagnosis in clinical practices. We deploy a 3D Convolutional Neural Network (CNN) as the baseline approach for the PD severity classification and show the effectiveness. Due to the lack of data in clinical field, we explore the possibility of transfer learning from non-medical dataset and show that PD severity classification can benefit from it. To bridge the domain discrepancy between medical and non-medical datasets, we let the network focus more on the subtle temporal visual cues, i.e., the frequency of tremors, by designing a Temporal Self-Attention (TSA) mechanism. Seven tasks from the Movement Disorders Society - Unified PD rating scale (MDS-UPDRS) part III are investigated, which reveal the symptoms of bradykinesia and postural tremors. Furthermore, we propose a multi-domain learning method to predict the patient-level PD severity through task-assembling. We show the effectiveness of TSA and task-assembling method on our PD video dataset empirically. We achieve the best MCC of 0.55 on binary task-level and 0.39 on three-class patient-level classification.

Details DOI

ICLR Conference 2022 Conference Paper

FALCON: Fast Visual Concept Learning by Integrating Images, Linguistic descriptions, and Conceptual Relations

Lingjie Mei
Jiayuan Mao
Ziqi Wang
Chuang Gan 0001
Joshua B. Tenenbaum

We present a meta-learning framework for learning new visual concepts quickly, from just one or a few examples, guided by multiple naturally occurring data streams: simultaneously looking at images, reading sentences that describe the objects in the scene, and interpreting supplemental sentences that relate the novel concept with other concepts. The learned concepts support downstream applications, such as answering questions by reasoning about unseen images. Our model, namely FALCON, represents individual visual concepts, such as colors and shapes, as axis-aligned boxes in a high-dimensional space (the "box embedding space"). Given an input image and its paired sentence, our model first resolves the referential expression in the sentence and associates the novel concept with particular objects in the scene. Next, our model interprets supplemental sentences to relate the novel concept with other known concepts, such as "X has property Y" or "X is a kind of Y". Finally, it infers an optimal box embedding for the novel concept that jointly 1) maximizes the likelihood of the observed instances in the image, and 2) satisfies the relationships between the novel concepts and the known ones. We demonstrate the effectiveness of our model on both synthetic and real-world datasets.

Details