Author name cluster

Wei Hao

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

8 papers
2 author rows

Possible papers

ICLR Conference 2025 Conference Paper

I Can Hear You: Selective Robust Training for Deepfake Audio Detection

  • Zirui Zhang
  • Wei Hao
  • Aroon Sankoh
  • William Lin
  • Emanuel Mendiola-Ortiz
  • Junfeng Yang
  • Chengzhi Mao

Recent advances in AI-generated voices have intensified the challenge of detecting deepfake audio, posing risks for scams and the spread of disinformation. To tackle this issue, we establish the largest public voice dataset to date, named DeepFakeVox-HQ, comprising 1.3 million samples, including 270,000 high-quality deepfake samples from 14 diverse sources. Despite previously reported high accuracy, existing deepfake voice detectors struggle with our diversely collected dataset, and their detection success rates drop even further under realistic corruptions and adversarial attacks. We conduct a holistic investigation into factors that enhance model robustness and show that incorporating a diversified set of voice augmentations is beneficial. Moreover, we find that the best detection models often rely on high-frequency features, which are imperceptible to humans and can be easily manipulated by an attacker. To address this, we propose the F-SAT: Frequency-Selective Adversarial Training method focusing on high-frequency components. Empirical results demonstrate that using our training dataset boosts baseline model performance (without robust training) by 33%, and our robust training further improves accuracy by 7.7% on clean samples and by 29.3% on corrupted and attacked samples, over the state-of-the-art RawNet3 model.
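
The abstract does not describe how F-SAT selects frequencies, so the following is only a minimal sketch of one way a perturbation could be confined to high-frequency components: transform it, zero the low-frequency bins, and transform back. The function names and the `cutoff_bin` parameter are hypothetical, and a naive DFT is used to keep the example dependency-free; this is not the authors' implementation.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def high_pass_perturbation(delta, cutoff_bin):
    """Keep only the components of `delta` at or above `cutoff_bin` (hypothetical parameter).

    Bins k and n - k are mirror images for a real signal, so the effective
    frequency index of bin k is min(k, n - k).
    """
    n = len(delta)
    spectrum = dft(delta)
    masked = [c if min(k, n - k) >= cutoff_bin else 0j
              for k, c in enumerate(spectrum)]
    # The mask is symmetric, so the result is real up to rounding error.
    return [v.real for v in idft(masked)]

constant = [0.5] * 8             # purely low-frequency (DC) perturbation
alternating = [1.0, -1.0] * 4    # purely highest-frequency perturbation
removed = high_pass_perturbation(constant, cutoff_bin=1)
kept = high_pass_perturbation(alternating, cutoff_bin=2)
```

In an adversarial training loop, a step like this would be applied to each candidate perturbation before it is added to the audio, so that only the selected band is attacked.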

IROS Conference 2025 Conference Paper

MADI: Malicious Agent Detection and Isolation in Mixed Autonomy Traffic Systems

  • Wei Hao
  • Huaping Liu
  • Wenjie Li
  • Chang Gou
  • Lijun Chen

Mixed autonomy traffic systems face significant security challenges when malicious agents disrupt coordination between autonomous and human-driven vehicles. We present Malicious Agent Detection and Isolation (MADI), a framework addressing two critical forms of disruptive behavior: path order violations at coordination points and strategic congestion generation. MADI integrates dual-mechanism detection with temporal consistency analysis to identify sophisticated malicious behaviors while filtering transient anomalies that could trigger false positives. Upon detection, our framework employs adaptive isolation strategies including enlarged safety boundaries and dynamic priority adjustment. Extensive experiments in simulated highway and urban environments demonstrate that MADI achieves up to 91% detection accuracy with only 4% false positives, significantly outperforming rule-based, anomaly-based, and single-criterion methods. The framework reduces travel time impacts by 25.5% and near-collision events by 76.5% in adversarial conditions, demonstrating its effectiveness for enhancing safety and efficiency in mixed autonomy traffic.
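
The temporal consistency idea mentioned above, flagging an agent only when anomalies persist rather than on a single transient event, can be sketched as a sliding-window vote. The class name, the `window` and `min_hits` parameters, and the thresholds are assumptions made for illustration; the abstract does not specify MADI's actual rule.

```python
from collections import deque

class TemporalConsistencyFilter:
    """Flag an agent only if it was anomalous in at least `min_hits`
    of its last `window` observations (both parameters are hypothetical)."""

    def __init__(self, window=5, min_hits=3):
        self.window = window
        self.min_hits = min_hits
        self.history = {}  # agent_id -> deque of recent boolean observations

    def update(self, agent_id, anomalous):
        """Record one observation and return True if the agent should be flagged."""
        hist = self.history.setdefault(agent_id, deque(maxlen=self.window))
        hist.append(bool(anomalous))
        return sum(hist) >= self.min_hits

f = TemporalConsistencyFilter(window=5, min_hits=3)
# A single transient anomaly is not enough to flag agent "a".
transient = [f.update("a", obs) for obs in (True, False, False)]
# Three consecutive anomalies flag agent "b" on the third observation.
persistent = [f.update("b", obs) for obs in (True, True, True)]
```

A larger `window` with a proportionally larger `min_hits` trades detection latency for fewer false positives.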

ICRA Conference 2024 Conference Paper

DL-PoseNet: A Differential Lightweight Network for Pose Regression over SE(3)

  • Wenjie Li 0002
  • Jia Liu 0008
  • Yanyan Wang 0001
  • Wei Hao
  • Dayong Ren
  • Lijun Chen 0006

Accurate pose estimation over SE(3) is fundamentally crucial for numerous perception tasks, including camera re-localization. While existing learning-based methods estimated from a series of RGB images have significantly improved the accuracy of pose, the majority of models still face one or two limitations. First, few representations on SE(3) are smooth and differential, making them difficult to apply in deep learning frameworks. Second, they often require high computational resources due to complex deep network designs. In this paper, we propose DL-PoseNet to address these issues. Specifically, we present a novel representation for SE(3) which follows the property of smoothness of the pose. We then design a lightweight neural network to regress the pose by developing a differential pose layer. Finally, we introduce a novel loss function and gradient descent method to better supervise the proposed lightweight pose network. Extensive experiments on the camera re-localization task on the Cambridge Landmarks and 7-Scenes datasets demonstrate the superior predictive accuracy and benefits of our method in comparison with the state-of-the-art.

ICML Conference 2024 Conference Paper

MGit: A Model Versioning and Management System

  • Wei Hao
  • Daniel Mendoza
  • Rafael Mendes
  • Deepak Narayanan
  • Amar Phanishayee
  • Asaf Cidon
  • Junfeng Yang

New ML models are often derived from existing ones (e.g., through fine-tuning, quantization or distillation), forming an ecosystem where models are related to each other and can share structure or even parameter values. Managing such a large and evolving ecosystem of model derivatives is challenging. For instance, the overhead of storing all such models is high, and models may inherit bugs from related models, complicating error attribution and debugging. In this paper, we propose a model versioning and management system called MGit that makes it easier to store, test, update, and collaborate on related models. MGit introduces a lineage graph that records the relationships between models, optimizations to efficiently store model parameters, and abstractions over this lineage graph that facilitate model testing, updating and collaboration. We find that MGit works well in practice: MGit is able to reduce model storage footprint by up to 7$\times$. Additionally, in a user study with 20 ML practitioners, users complete a model updating task 3$\times$ faster on average with MGit.
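
To make the lineage graph idea concrete, here is a minimal sketch of a structure that records which model was derived from which, and how, and that can list every model downstream of a given one (for example, to decide which derivatives to re-test after a bug is found in a parent). The class, its methods, and the model names are hypothetical and are not MGit's API.

```python
from collections import deque

class LineageGraph:
    """Toy lineage graph: nodes are model names, edges point from parent to child."""

    def __init__(self):
        self.children = {}    # model name -> list of directly derived models
        self.derivation = {}  # model name -> (parent name, how it was derived)

    def add_model(self, name, parent=None, how=None):
        self.children.setdefault(name, [])
        if parent is not None:
            self.children.setdefault(parent, []).append(name)
            self.derivation[name] = (parent, how)

    def descendants(self, name):
        """Return all models derived directly or indirectly from `name`, in BFS order."""
        found, queue = [], deque([name])
        while queue:
            for child in self.children.get(queue.popleft(), []):
                found.append(child)
                queue.append(child)
        return found

g = LineageGraph()
g.add_model("base")
g.add_model("ft", parent="base", how="fine-tuning")
g.add_model("distilled", parent="base", how="distillation")
g.add_model("ft-int8", parent="ft", how="quantization")
affected = g.descendants("base")
```

Here `affected` lists the three derived models, which is the set a test of `base` would need to be propagated to.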

AAAI Conference 2024 Conference Paper

UCMCTrack: Multi-Object Tracking with Uniform Camera Motion Compensation

  • Kefu Yi
  • Kai Luo
  • Xiaolei Luo
  • Jiangui Huang
  • Hao Wu
  • Rongdong Hu
  • Wei Hao

Multi-object tracking (MOT) in video sequences remains a challenging task, especially in scenarios with significant camera movements. This is because targets can drift considerably on the image plane, leading to erroneous tracking outcomes. Addressing such challenges typically requires supplementary appearance cues or Camera Motion Compensation (CMC). While these strategies are effective, they also introduce a considerable computational burden, posing challenges for real-time MOT. In response to this, we introduce UCMCTrack, a novel motion model-based tracker robust to camera movements. Unlike conventional CMC that computes compensation parameters frame-by-frame, UCMCTrack consistently applies the same compensation parameters throughout a video sequence. It employs a Kalman filter on the ground plane and introduces the Mapped Mahalanobis Distance (MMD) as an alternative to the traditional Intersection over Union (IoU) distance measure. By leveraging projected probability distributions on the ground plane, our approach efficiently captures motion patterns and adeptly manages uncertainties introduced by homography projections. Remarkably, UCMCTrack, relying solely on motion cues, achieves state-of-the-art performance across a variety of challenging datasets, including MOT17, MOT20, DanceTrack and KITTI. More details and code are available at https://github.com/corfyi/UCMCTrack.
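
As an illustration of why a covariance-aware distance on the ground plane can be more informative than a purely geometric one, the sketch below computes a plain 2D Mahalanobis distance between a predicted track position and a detection. This is the standard Mahalanobis distance, not the paper's Mapped Mahalanobis Distance, whose exact definition is not given in the abstract; the function name and the example covariances are assumptions.

```python
import math

def mahalanobis_2d(track_xy, det_xy, cov):
    """Mahalanobis distance between two ground-plane points.

    `cov` is a 2x2 covariance of the residual, given as nested lists.
    """
    dx = det_xy[0] - track_xy[0]
    dy = det_xy[1] - track_xy[1]
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    # Inverse of a 2x2 matrix, written out explicitly.
    inv = ((d / det, -b / det), (-c / det, a / det))
    q = dx * (inv[0][0] * dx + inv[0][1] * dy) + dy * (inv[1][0] * dx + inv[1][1] * dy)
    return math.sqrt(q)

# With identity covariance this reduces to Euclidean distance.
euclid_like = mahalanobis_2d((0.0, 0.0), (3.0, 4.0), [[1.0, 0.0], [0.0, 1.0]])
# With more uncertainty along x, the same 2 m offset counts for less along x than along y.
along_x = mahalanobis_2d((0.0, 0.0), (2.0, 0.0), [[4.0, 0.0], [0.0, 1.0]])
along_y = mahalanobis_2d((0.0, 0.0), (0.0, 2.0), [[4.0, 0.0], [0.0, 1.0]])
```

A tracker would use such a distance as the association cost between predicted tracks and new detections.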

UAI Conference 2022 Conference Paper

PDQ-Net: Deep probabilistic dual quaternion network for absolute pose regression on SE(3)

  • Wenjie Li 0002
  • Wasif Naeem
  • Jia Liu 0008
  • Dequan Zheng
  • Wei Hao
  • Lijun Chen 0006

Accurate absolute pose regression is one of the key challenges in robotics and computer vision. Existing direct regression methods suffer from two limitations. First, some noisy scenarios such as poor illumination conditions are likely to result in the uncertainty of pose estimation. Second, the output n-dimensional feature vector in the Euclidean space $\mathbb{R}^n$ cannot be well mapped to $SE(3)$ manifold. In this work, we propose a deep dual quaternion network that performs the absolute pose regression on $SE(3)$. We first develop an antipodally symmetric probability distribution over the unit dual quaternion on $SE(3)$ to model uncertainties and then propose an intermediary differential representation space to replace the final output pose, which avoids the mapping problem from $\mathbb{R}^n$ to $SE(3)$. In addition, we introduce a backpropagation method that considers the continuousness and differentiability of the proposed intermediary space. Extensive experiments on the camera re-localization task on the Cambridge Landmarks and 7-Scenes datasets demonstrate that our method greatly improves the accuracy of the pose as well as the robustness in dealing with uncertainty and ambiguity, compared to the state-of-the-art.
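
The need for an antipodally symmetric distribution comes from a basic property of unit quaternions: q and -q encode the same rotation. The sketch below checks this for the rotational part only, using ordinary unit quaternions in (w, x, y, z) order; it does not implement dual quaternions or the paper's distribution, and the helper names are illustrative.

```python
import math

def qmul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return (
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    )

def rotate(q, v):
    """Rotate 3D vector `v` by unit quaternion `q` via q * (0, v) * conj(q)."""
    conj = (q[0], -q[1], -q[2], -q[3])
    _, x, y, z = qmul(qmul(q, (0.0, *v)), conj)
    return (x, y, z)

# 90-degree rotation about the z axis.
half = math.pi / 4
q = (math.cos(half), 0.0, 0.0, math.sin(half))
neg_q = tuple(-c for c in q)

rotated = rotate(q, (1.0, 0.0, 0.0))
rotated_neg = rotate(neg_q, (1.0, 0.0, 0.0))
```

Both calls map the x axis onto the y axis, so a probability density over unit quaternions that represents rotations should assign q and -q the same value.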

ICRA Conference 2022 Conference Paper

Pose Estimation based on a Dual Quaternion Feedback Particle Filter

  • Wenjie Li 0002
  • Wasif Naeem
  • Wenhao Ji
  • Jia Liu 0008
  • Wei Hao
  • Lijun Chen 0006

Fast and accurate pose estimation is essential for many robotic applications such as SLAM, manipulation, and 3D point registration. Existing solutions to this problem suffer from either high computation overhead due to the nonlinear features or accuracy loss due to linear approximation. In this paper, we propose a dual quaternion feedback particle filter (DQFPF) that can capture the nonlinear factors in the observation model and use the optimal control theory to estimate the pose. To avoid particle degeneracy caused by sequential importance sampling and resampling, we present a feedback particle update formula to speed up the optimization with fewer particles being sampled. Simulation results show that in known corresponding cases our approach can converge to the correct pose more efficiently than the state-of-the-art. A similar conclusion can also be drawn in real applications of unknown corresponding cases, i.e., point cloud stitching and visual odometry estimation.