Harnessing the Computation Redundancy in ViTs to Boost Adversarial Transferability

Jiani Liu; Zhiyuan Wang; Zeliang Zhang; Chao Huang; Susan Liang; Yunlong Tang; Chenliang Xu


NeurIPS 2025


Conference Paper · Main Conference Track · Artificial Intelligence · Machine Learning


Abstract

Vision Transformers (ViTs) have demonstrated impressive performance across a range of applications, including many safety-critical tasks. Many previous studies have observed that adversarial examples crafted on ViTs exhibit higher transferability than those crafted on CNNs, indicating that ViTs contain structural characteristics favorable for transferable attacks. In this work, we take a further step and investigate the computational redundancy that arises from the unique characteristics of ViTs, along with its impact on adversarial transferability. Specifically, we identify two forms of redundancy, data-level and model-level, that can be harnessed to amplify attack effectiveness. Building on this insight, we design a suite of techniques: attention sparsity manipulation, attention head permutation, clean token regularization, ghost MoE diversification, and learning to robustify before the attack. We also propose a dynamic online learning strategy that fully leverages these operations to enhance adversarial transferability. Extensive experiments on the ImageNet-1k dataset validate the effectiveness of our approach, showing that our methods significantly outperform existing baselines in both transferability and generality across diverse model architectures, including different ViT variants and mainstream Vision Large Language Models (VLLMs).
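
The abstract names its operations without defining them. As an illustration only, the sketch below shows one plausible reading of two of them, attention sparsity manipulation and attention head permutation, on toy data. The function names (`sparsify_attention`, `permute_heads`), the `drop_prob` parameter, and the interpretation itself (random dropping of attention weights, random reordering of heads) are assumptions made here, not the paper's implementation.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparsify_attention(weights, drop_prob, rng):
    """Randomly zero attention weights, then renormalize the row.

    Assumption: "attention sparsity manipulation" is read here as random
    dropping of attention entries; the abstract does not specify this.
    """
    kept = [w if rng.random() >= drop_prob else 0.0 for w in weights]
    total = sum(kept)
    if total == 0.0:  # every entry was dropped: fall back to the original row
        return list(weights)
    return [w / total for w in kept]


def permute_heads(head_outputs, rng):
    """Return the per-head outputs in a random order.

    Assumption: "attention head permutation" is read here as shuffling
    the heads before their outputs are concatenated.
    """
    order = list(range(len(head_outputs)))
    rng.shuffle(order)
    return [head_outputs[i] for i in order]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# One attention row over four tokens.
row = softmax([2.0, 1.0, 0.5, -1.0])
sparse_row = sparsify_attention(row, drop_prob=0.5, rng=rng)

# Toy outputs of three attention heads.
heads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
shuffled = permute_heads(heads, rng)
```

Under this reading, each attack iteration would see a slightly different forward pass of the same surrogate model, which is how redundancy could be turned into diversity; whether the paper applies the operations this way is not stated in the abstract.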
