Author name cluster

Ruizhe Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers

2 author rows

ICML Conference 2025 Conference Paper

Optimizing Large Language Model Training Using FP4 Quantization

Ruizhe Wang
Yeyun Gong
Xiao Liu 0029
Guoshuai Zhao
Ziyue Yang
Baining Guo
Zheng-Jun Zha
Peng Cheng 0005

The growing computational demands of training large language models (LLMs) necessitate more efficient methods. Quantized training presents a promising solution by enabling low-bit arithmetic operations to reduce these costs. While FP8 precision has demonstrated feasibility, leveraging FP4 remains a challenge due to significant quantization errors and limited representational capacity. This work introduces the first FP4 training framework for LLMs, addressing these challenges with two key innovations: a differentiable quantization estimator for precise weight updates and an outlier clamping and compensation strategy to prevent activation collapse. To ensure stability, the framework integrates a mixed-precision training scheme and vector-wise quantization. Experimental results demonstrate that our FP4 framework achieves accuracy comparable to BF16 and FP8, with minimal degradation, scaling effectively to 13B-parameter LLMs trained on up to 100B tokens. With the emergence of next-generation hardware supporting FP4, our framework sets a foundation for efficient ultra-low precision training.
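As a rough illustration of two ideas named in the abstract, vector-wise quantization and outlier clamping with compensation, the following is a minimal sketch and not the authors' implementation. It assumes the E2M1 FP4 value grid, absmax scaling per vector, and a fixed clamping threshold; the function names and the threshold are hypothetical, and the residual is kept in full precision purely for demonstration.

```python
# Positive magnitudes representable in the E2M1 FP4 format (assumed format).
FP4_E2M1 = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def quantize_fp4(vec):
    """Vector-wise absmax scaling, then round-to-nearest on the FP4 grid."""
    amax = max(abs(v) for v in vec)
    if amax == 0.0:
        return list(vec)
    scale = FP4_E2M1[-1] / amax  # map the largest magnitude onto 6.0
    out = []
    for v in vec:
        mag = min(FP4_E2M1, key=lambda g: abs(g - abs(v) * scale))
        out.append((mag if v >= 0 else -mag) / scale)  # dequantize
    return out


def clamp_with_residual(vec, threshold):
    """Clamp outliers to +/- threshold and return what was cut off."""
    clamped = [max(-threshold, min(threshold, v)) for v in vec]
    residual = [v - c for v, c in zip(vec, clamped)]
    return clamped, residual


def quantize_with_compensation(vec, threshold):
    """Quantize the clamped vector, then add the outlier residual back."""
    clamped, residual = clamp_with_residual(vec, threshold)
    quantized = quantize_fp4(clamped)
    return [q + r for q, r in zip(quantized, residual)]


def total_abs_error(reference, approx):
    return sum(abs(a - b) for a, b in zip(reference, approx))


# One large outlier forces a coarse scale for the whole vector.
activations = [0.1, -0.2, 0.3, 8.0]
plain = quantize_fp4(activations)
compensated = quantize_with_compensation(activations, threshold=0.5)
plain_error = total_abs_error(activations, plain)
compensated_error = total_abs_error(activations, compensated)
```

In this toy vector the single outlier flattens the small entries to zero under plain quantization, while clamping first leaves enough grid resolution for them; the residual then restores the outlier.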


JBHI Journal 2025 Journal Article

SkipDAEformer: A High-Precision Representation Learning Method for Removing Random Mixed Noise in MCG Signals

Ruizhe Wang
Zhanyi Liu
Jiaojiao Pang
Jie Sun
Min Xiang
Xiaolin Ning

Automated analytical techniques for magnetocardiography (MCG) are essential for diagnosing and predicting cardiovascular diseases. Clinically acquired MCG signals are often contaminated by various types of noise, which negatively impact subsequent signal analysis. However, traditional methods have limitations in denoising long-term MCG signals with complex spatial structures. We propose a high-precision, robust representation learning method based on skip connection multi-scale feature fusion (SkipDAEformer) for effectively removing random mixed noise in MCG signals. SkipDAEformer integrates attention fusion mechanisms into a basic denoising autoencoder to extract and fuse critical temporal and spatial information from each feature map, thus enhancing the model’s ability to capture long-range dependencies and spatial features in MCG signals. Meanwhile, we further supplement and refine the semantic information for the feature maps through a global feature fusion method. By fusing multi-scale features from different skip connections, SkipDAEformer can learn more comprehensive representations of MCG signals, enabling the effective separation of clean signals from noise. Experimental results demonstrate that SkipDAEformer outperforms existing methods in denoising performance, channel consistency, feature consistency, and generalization ability and can be extended to a self-supervised learning framework. In actual noise reduction and diagnostic classification tasks, SkipDAEformer shows superior clinical acceptability and diagnostic value, potentially advancing MCG data analysis.


AAAI Conference 2021 Conference Paper

Sequential Attacks on Kalman Filter-based Forward Collision Warning Systems

Yuzhe Ma
Jon A Sharp
Ruizhe Wang
Earlence Fernandes
Xiaojin Zhu

Kalman Filter (KF) is widely used in various domains to perform sequential learning or variable estimation. In the context of autonomous vehicles, KF constitutes the core component of many Advanced Driver Assistance Systems (ADAS), such as Forward Collision Warning (FCW). It tracks the states (distance, velocity, etc.) of relevant traffic objects based on sensor measurements. The tracking output of KF is often fed into downstream logic to produce alerts, which will then be used by human drivers to make driving decisions in near-collision scenarios. In this paper, we study adversarial attacks on KF as part of the more complex machine-human hybrid system of Forward Collision Warning. Our attack goal is to negatively affect human braking decisions by causing KF to output incorrect state estimations that lead to false or delayed alerts. We accomplish this by sequentially manipulating measurements fed into the KF, and propose a novel Model Predictive Control (MPC) approach to compute the optimal manipulation. Via experiments conducted in a simulated driving environment, we show that the attacker is able to successfully change FCW alert signals through planned manipulation over measurements prior to the desired target time. These results demonstrate that our attack can stealthily mislead a distracted human driver and cause vehicle collisions.
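To make the measurement-manipulation idea concrete, here is a minimal sketch of a scalar Kalman filter whose input measurements are shifted by an attacker. It is not the paper's method: the abstract describes an MPC-computed manipulation, whereas this example swaps in a fixed offset, and the random-walk state model, the noise parameters, and the 5 m offset are all assumptions chosen for illustration.

```python
def kalman_track(measurements, q=0.01, r=1.0, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a random-walk state model (assumed model)."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q              # predict: variance grows by process noise q
        k = p / (p + r)        # Kalman gain for measurement noise r
        x = x + k * (z - x)    # update the state with the measurement residual
        p = (1.0 - k) * p      # update the variance
        estimates.append(x)
    return estimates


true_distance = 20.0                      # metres to the lead vehicle
clean = [true_distance] * 30              # noise-free measurements
attacked = [z + 5.0 for z in clean]       # hypothetical fixed +5 m offset

clean_est = kalman_track(clean, x0=true_distance)
attacked_est = kalman_track(attacked, x0=true_distance)
```

Because each update moves the estimate only part of the way toward the latest measurement, the manipulated stream drags the tracked distance upward gradually rather than in one jump, which is why a manipulation has to be planned over a window of measurements before the target time.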
