Author name cluster

Chi Harold Liu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

10 papers

2 author rows

AAAI Conference 2026 Conference Paper

Multi-Faceted Attack: Exposing Cross-Model Vulnerabilities in Defense-Equipped Vision-Language Models

Yijun Yang
Lichao Wang
Jianping Zhang
Chi Harold Liu
Lanqing Hong
Qiang Xu

The growing misuse of Vision-Language Models (VLMs) has led providers to deploy multiple safeguards—alignment tuning, system prompt, and content moderation. Yet the real-world robustness of these defenses against adversarial attack remains underexplored. We introduce Multi-Faceted Attack (MFA), a framework that systematically uncovers general safety vulnerabilities in leading defense-equipped VLMs, including GPT-4o, Gemini-Pro, and LLaMA 4, etc. Central to MFA is the Attention-Transfer Attack (ATA), which conceals harmful instructions inside a meta task with competing objectives. We offer a theoretical perspective grounded in reward-hacking to explain why such an attack can succeed. To maximize cross-model transfer, we introduce a lightweight transfer-enhancement algorithm combined with a simple repetition strategy that jointly evades both input- and output-level filters—without any model-specific fine-tuning. We empirically show that adversarial images optimized for one vision encoder transfer broadly to unseen VLMs, indicating that shared visual representations create a cross-model safety vulnerability. Combined, MFA reaches a 58.5% overall attack success rate, consistently outperforming existing methods. Notably, on state-of-the-art commercial models, MFA achieves a 52.8% success rate, outperforming the second-best attack by 34%. These findings challenge the perceived robustness of current defensive mechanisms, systematically expose general safety loopholes within defense-equipped VLMs, and offer a practical probe for diagnosing and evaluating the safety of VLMs.

PDF Details DOI

AAMAS Conference 2025 Conference Paper

Efficient Training of Generalizable Visuomotor Policies via Control-Aware Augmentation

Yinuo Zhao
Kun Wu
Tianjiao Yi
Zhiyuan Xu
Zhengping Che
Chi Harold Liu
Jian Tang

Improving generalization is a key challenge in Embodied AI, where obtaining large-scale datasets from diverse scenarios is costly. Visuomotor policies trained with weak augmentations provide only marginal improvements when applied to new environments. Strong augmentations, such as random overlay, can disrupt task-relevant information and degrade performance. To overcome these challenges, we introduce EAGLE—an Efficient trAining framework for GeneraLizablE visuomotor policies. EAGLE enhances generalization by applying augmentation only to control-related regions using a self-supervised, control-aware mask. It also boosts training efficiency and stability by transferring knowledge from an expert to a student policy, enabling deployment in new environments without further fine-tuning. Experiments on the DMControl Generalization Benchmark (DMC-GB) demonstrate the effectiveness of our approach. Project website at https: //vrl-eagle. github. io/

PDF

IROS Conference 2025 Conference Paper

HACTS: a Human-As-Copilot Teleoperation System for Robot Learning

Zhiyuan Xu
Yinuo Zhao
Kun Wu 0001
Ning Liu 0007
Junjie Ji
Zhengping Che
Chi Harold Liu
Jian Tang 0008

Teleoperation is essential for autonomous robot learning, especially in manipulation tasks that require human demonstrations or corrections. However, most existing systems only offer unilateral robot control and lack the ability to synchronize the robot’s status with the teleoperation hardware, preventing real-time, flexible intervention. In this work, we introduce HACTS (Human-As-Copilot Teleoperation System), a novel system that establishes bilateral, real-time joint synchronization between a robot arm and teleoperation hardware. This simple yet effective feedback mechanism, akin to a steering wheel in autonomous vehicles, enables the human copilot to intervene seamlessly while collecting action-correction data for future learning. Implemented using 3D-printed components and low-cost, off-the-shelf motors, HACTS is both accessible and scalable. Our experiments show that HACTS significantly enhances performance in imitation learning (IL) and reinforcement learning (RL) tasks, boosting IL recovery capabilities and data efficiency, and facilitating human-in-the-loop RL. HACTS paves the way for more effective and interactive human-robot collaboration and data-collection, advancing the capabilities of robot manipulation.

Details

ICLR Conference 2023 Conference Paper

Dirichlet-based Uncertainty Calibration for Active Domain Adaptation

Mixue Xie
Shuang Li 0008
Rui Zhang 0113
Chi Harold Liu

Active domain adaptation (DA) aims to maximally boost the model adaptation on a new target domain by actively selecting limited target data to annotate, whereas traditional active learning methods may be less effective since they do not consider the domain shift issue. Despite active DA methods address this by further proposing targetness to measure the representativeness of target domain characteristics, their predictive uncertainty is usually based on the prediction of deterministic models, which can easily be miscalibrated on data with distribution shift. Considering this, we propose a Dirichlet-based Uncertainty Calibration (DUC) approach for active DA, which simultaneously achieves the mitigation of miscalibration and the selection of informative target samples. Specifically, we place a Dirichlet prior on the prediction and interpret the prediction as a distribution on the probability simplex, rather than a point estimate like deterministic models. This manner enables us to consider all possible predictions, mitigating the miscalibration of unilateral prediction. Then a two-round selection strategy based on different uncertainty origins is designed to select target samples that are both representative of target domain and conducive to discriminability. Extensive experiments on cross-domain image classification and semantic segmentation validate the superiority of DUC.

Details

AAAI Conference 2023 Conference Paper

VBLC: Visibility Boosting and Logit-Constraint Learning for Domain Adaptive Semantic Segmentation under Adverse Conditions

Mingjia Li
Binhui Xie
Shuang Li
Chi Harold Liu
Xinjing Cheng

Generalizing models trained on normal visual conditions to target domains under adverse conditions is demanding in the practical systems. One prevalent solution is to bridge the domain gap between clear- and adverse-condition images to make satisfactory prediction on the target. However, previous methods often reckon on additional reference images of the same scenes taken from normal conditions, which are quite tough to collect in reality. Furthermore, most of them mainly focus on individual adverse condition such as nighttime or foggy, weakening the model versatility when encountering other adverse weathers. To overcome the above limitations, we propose a novel framework, Visibility Boosting and Logit-Constraint learning (VBLC), tailored for superior normal-toadverse adaptation. VBLC explores the potential of getting rid of reference images and resolving the mixture of adverse conditions simultaneously. In detail, we first propose the visibility boost module to dynamically improve target images via certain priors in the image level. Then, we figure out the overconfident drawback in the conventional cross-entropy loss for self-training method and devise the logit-constraint learning, which enforces a constraint on logit outputs during training to mitigate this pain point. To the best of our knowledge, this is a new perspective for tackling such a challenging task. Extensive experiments on two normal-to-adverse domain adaptation benchmarks, i.e., Cityscapes to ACDC and Cityscapes to FoggyCityscapes + RainCityscapes, verify the effectiveness of VBLC, where it establishes the new state of the art. Code is available at https://github.com/BIT-DA/VBLC.

PDF Details DOI

AAAI Conference 2022 Conference Paper

Active Learning for Domain Adaptation: An Energy-Based Approach

Binhui Xie
Longhui Yuan
Shuang Li
Chi Harold Liu
Xinjing Cheng
Guoren Wang

Unsupervised domain adaptation has recently emerged as an effective paradigm for generalizing deep neural networks to new target domains. However, there is still enormous potential to be tapped to reach the fully supervised performance. In this paper, we present a novel active learning strategy to assist knowledge transfer in the target domain, dubbed active domain adaptation. We start from an observation that energy-based models exhibit free energy biases when training (source) and test (target) data come from different distributions. Inspired by this inherent mechanism, we empirically reveal that a simple yet efficient energy-based sampling strategy sheds light on selecting the most valuable target samples than existing approaches requiring particular architectures or computation of the distances. Our algorithm, Energy-based Active Domain Adaptation (EADA), queries groups of target data that incorporate both domain characteristic and instance uncertainty into every selection round. Meanwhile, by aligning the free energy of target data compact around the source domain via a regularization term, domain gap can be implicitly diminished. Through extensive experiments, we show that EADA surpasses state-of-the-art methods on well-known challenging benchmarks with substantial improvements, making it a useful option in the open world. Code is available at https: //github. com/BIT-DA/EADA.

PDF Details

AAAI Conference 2022 Conference Paper

CADRE: A Cascade Deep Reinforcement Learning Framework for Vision-Based Autonomous Urban Driving

Yinuo Zhao
Kun Wu
Zhiyuan Xu
Zhengping Che
Qi Lu
Jian Tang
Chi Harold Liu

Vision-based autonomous urban driving in dense traffic is quite challenging due to the complicated urban environment and the dynamics of the driving behaviors. Widely-applied methods either heavily rely on hand-crafted rules or learn from limited human experience, which makes them hard to generalize to rare but critical scenarios. In this paper, we present a novel CAscade Deep REinforcement learning framework, CADRE, to achieve model-free vision-based autonomous urban driving. In CADRE, to derive representative latent features from raw observations, we first offline train a Co-attention Perception Module (CoPM) that leverages the co-attention mechanism to learn the inter-relationships between the visual and control information from a pre-collected driving dataset. Cascaded by the frozen CoPM, we then present an efficient distributed proximal policy optimization framework to online learn the driving policy under the guidance of particularly designed reward functions. We perform a comprehensive empirical study with the CARLA NoCrash benchmark as well as specific obstacle avoidance scenarios in autonomous urban driving tasks. The experimental results well justify the effectiveness of CADRE and its superiority over the state-of-the-art by a wide margin.

PDF Details

AAAI Conference 2021 Conference Paper

Bi-Classifier Determinacy Maximization for Unsupervised Domain Adaptation

Shuang Li
Fangrui Lv
Binhui Xie
Chi Harold Liu
Jian Liang
Chen Qin

Unsupervised domain adaptation challenges the problem of transferring knowledge from a well-labelled source domain to an unlabelled target domain. Recently, adversarial learning with bi-classifier has been proven effective in pushing crossdomain distributions close. Prior approaches typically leverage the disagreement between bi-classifier to learn transferable representations, however, they often neglect the classifier determinacy in the target domain, which could result in a lack of feature discriminability. In this paper, we present a simple yet effective method, namely Bi-Classifier Determinacy Maximization (BCDM), to tackle this problem. Motivated by the observation that target samples cannot always be separated distinctly by the decision boundary, here in the proposed BCDM, we design a novel classifier determinacy disparity (CDD) metric, which formulates classifier discrepancy as the class relevance of distinct target predictions and implicitly introduces constraint on the target feature discriminability. To this end, the BCDM can generate discriminative representations by encouraging target predictive outputs to be consistent and determined, meanwhile, preserve the diversity of predictions in an adversarial manner. Furthermore, the properties of CDD as well as the theoretical guarantees of BCDM’s generalization bound are both elaborated. Extensive experiments show that BCDM compares favorably against the existing state-of-the-art domain adaptation methods.

PDF Details

NeurIPS Conference 2021 Conference Paper

Pareto Domain Adaptation

Fangrui Lv
Jian Liang
Kaixiong Gong
Shuang Li
Chi Harold Liu
Han Li
Di Liu
Guoren Wang

Domain adaptation (DA) attempts to transfer the knowledge from a labeled source domain to an unlabeled target domain that follows different distribution from the source. To achieve this, DA methods include a source classification objective to extract the source knowledge and a domain alignment objective to diminish the domain shift, ensuring knowledge transfer. Typically, former DA methods adopt some weight hyper-parameters to linearly combine the training objectives to form an overall objective. However, the gradient directions of these objectives may conflict with each other due to domain shift. Under such circumstances, the linear optimization scheme might decrease the overall objective value at the expense of damaging one of the training objectives, leading to restricted solutions. In this paper, we rethink the optimization scheme for DA from a gradient-based perspective. We propose a Pareto Domain Adaptation (ParetoDA) approach to control the overall optimization direction, aiming to cooperatively optimize all training objectives. Specifically, to reach a desirable solution on the target domain, we design a surrogate loss mimicking target classification. To improve target-prediction accuracy to support the mimicking, we propose a target-prediction refining mechanism which exploits domain labels via Bayes’ theorem. On the other hand, since prior knowledge of weighting schemes for objectives is often unavailable to guide optimization to approach the optimal solution on the target domain, we propose a dynamic preference mechanism to dynamically guide our cooperative optimization by the gradient of the surrogate loss on a held-out unlabeled target dataset. Our theoretical analyses show that the held-out data can guide but will not be over-fitted by the optimization. Extensive experiments on image classification and semantic segmentation benchmarks demonstrate the effectiveness of ParetoDA

PDF Details

TIST Journal 2015 Journal Article

An Event-Driven QoI-Aware Participatory Sensing Framework with Energy and Budget Constraints

Bo Zhang
Zheng Song
Chi Harold Liu
Jian Ma
Wendong Wang

Participatory sensing systems can be used for concurrent event monitoring applications, like noise levels, fire, and pollutant concentrations. However, they are facing new challenges as to how to accurately detect the exact boundaries of these events, and further, to select the most appropriate participants to collect the sensing data. On the one hand, participants’ handheld smart devices are constrained with different energy conditions and sensing capabilities, and they move around with uncontrollable mobility patterns in their daily life. On the other hand, these sensing tasks are within time-varying quality-of-information (QoI) requirements and budget to afford the users’ incentive expectations. Toward this end, this article proposes an event-driven QoI-aware participatory sensing framework with energy and budget constraints. The main method of this framework is event boundary detection. For the former, a two-step heuristic solution is proposed where the coarse-grained detection step finds its approximation and the fine-grained detection step identifies the exact location. Participants are selected by explicitly considering their mobility pattern, required QoI of multiple tasks, and users’ incentive requirements, under the constraint of an aggregated task budget. Extensive experimental results, based on a real trace in Beijing, show the effectiveness and robustness of our approach, while comparing with existing schemes.

Details DOI