Arrow Research

Author name cluster

Yihan Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

21 papers
2 author rows

Possible papers (21)

EAAI Journal 2026 Journal Article

A Progressive Multilevel Mixing-based knowledge distillation framework for enhancing three-dimensional object detection on compressed point clouds

  • Shuo Zhu
  • Yongfang Wang
  • Wei Chen
  • Yihan Wang

Light Detection and Ranging (LiDAR)-based three-dimensional (3D) object detection has achieved notable progress and is widely applied in areas including autonomous driving and robotics. However, point clouds often undergo compression and transmission distortion, which poses a significant challenge to existing point cloud-based object detection models. We find that object detection performance declines sharply in high-compression point cloud scenarios due to the scarcity of compressed point clouds. To tackle this problem, we first propose a Progressive Multilevel Mixing (PMM) method to enable the model to learn continually from a stream of compressed point cloud datasets, in which the compressed point clouds are mixed with the source point clouds at both the scene and object levels. We then propose a knowledge distillation (KD) framework for point cloud-based object detection that selectively learns at key locations based on the characteristics of 3D tasks, effectively allowing the student model to absorb crucial information from the corresponding teacher model. Specifically, we establish a Compressed Point Cloud Dataset for 3D Object Detection (CPC-3DOD) with a rich and diverse range of point cloud scenes and five different compression ratios. As far as we know, this is the first study to contribute a large-scale database specifically designed for compressed point cloud detection. Extensive empirical evaluations have substantiated the effectiveness of our method, demonstrating marked performance improvements on the CPC-3DOD dataset. We make the CPC-3DOD dataset publicly available at https://github.com/and-star/CPC-3DOD.

  • A new compressed point cloud dataset is created for 3D object detection research.
  • The novel mixing algorithm generates mixed point clouds to adapt models to compression.
  • The distillation framework transfers rich knowledge from the teacher to the student model.
  • Experimental results demonstrate the superior performance of the proposed method.
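
The abstract does not spell out the mixing operations; below is a minimal sketch of what scene- and object-level mixing of a compressed point cloud with its uncompressed source might look like. All function names and the exact mixing rules are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def mix_scene(source_pts, compressed_pts, alpha=0.5, seed=0):
    # Scene-level mixing (hypothetical): splice a random fraction of the
    # uncompressed source scene's points into the compressed scene.
    rng = np.random.default_rng(seed)
    n = int(alpha * len(source_pts))
    idx = rng.choice(len(source_pts), size=n, replace=False)
    return np.concatenate([compressed_pts, source_pts[idx]], axis=0)

def mix_objects(compressed_pts, source_objects, boxes):
    # Object-level mixing (hypothetical): inside each annotated box, replace
    # the compressed points with the corresponding uncompressed object points.
    remaining = compressed_pts
    pasted = []
    for obj_pts, (lo, hi) in zip(source_objects, boxes):  # lo/hi: box corners, shape (3,)
        inside = np.all((remaining >= lo) & (remaining <= hi), axis=1)
        remaining = remaining[~inside]
        pasted.append(obj_pts)
    return np.concatenate([remaining, *pasted], axis=0)
```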

AAAI Conference 2026 Conference Paper

ResMAS: Resilience Optimization in LLM-based Multi-agent Systems

  • Zhilun Zhou
  • Zihan Liu
  • Jiahe Liu
  • Qingyu Shao
  • Yihan Wang
  • Kun Shao
  • Depeng Jin
  • Fengli Xu

Large Language Model-based Multi-Agent Systems (LLM-based MAS), where multiple LLM agents collaborate to solve complex tasks, have shown impressive performance in many areas. However, MAS are typically distributed across different devices or environments, making them vulnerable to perturbations such as agent failures. While existing works have studied adversarial attacks and corresponding defense strategies, they mainly focus on reactively detecting and mitigating attacks after they occur rather than proactively designing inherently resilient systems. In this work, we study the resilience of LLM-based MAS under perturbations and find that both the communication topology and prompt design significantly influence system resilience. Motivated by these findings, we propose ResMAS, a two-stage framework for enhancing MAS resilience. First, we train a reward model to predict the MAS's resilience, based on which we train a topology generator to automatically design resilient topologies for specific tasks through reinforcement learning. Second, we introduce a topology-aware prompt optimization method that refines each agent's prompt based on its connections and interactions with other agents. Extensive experiments across a range of tasks show that our approach substantially improves MAS resilience under various constraints. Moreover, our framework demonstrates strong generalization ability to new tasks and models, highlighting its potential for building resilient MASs.

NeurIPS Conference 2025 Conference Paper

BridgePure: Limited Protection Leakage Can Break Black-Box Data Protection

  • Yihan Wang
  • Yiwei Lu
  • Xiao-Shan Gao
  • Gautam Kamath
  • Yaoliang Yu

Availability attacks, or unlearnable examples, are defensive techniques that allow data owners to modify their datasets in ways that prevent unauthorized machine learning models from learning effectively while maintaining the data's intended functionality. This has led to the release of popular black-box tools (e.g., APIs) that let users upload personal data and receive protected counterparts. In this work, we show that such black-box protections can be substantially compromised if a small set of unprotected in-distribution data is available. Specifically, we propose a novel threat model of protection leakage, where an adversary can (1) easily acquire (unprotected, protected) pairs by querying the black-box protections with a small unprotected dataset; and (2) train a diffusion bridge model to build a mapping between unprotected and protected data. This mapping, termed BridgePure, can effectively remove the protection from any previously unseen data within the same distribution. BridgePure demonstrates superior purification performance on classification and style mimicry tasks, exposing critical vulnerabilities in black-box data protection. We suggest that practitioners implement multi-level countermeasures to mitigate such risks.
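
A minimal sketch of the two-step threat model described above, with `protect_api` standing in for the black-box protection tool and `bridge_model` for the purification network. For simplicity this placeholder fits a plain regression from protected to unprotected images; the paper trains a diffusion bridge, which this sketch does not implement.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

def collect_pairs(protect_api, unprotected: torch.Tensor) -> TensorDataset:
    # Step (1): query the black-box protection with a small unprotected
    # in-distribution set to obtain (protected, unprotected) pairs.
    protected = torch.stack([protect_api(x) for x in unprotected])
    return TensorDataset(protected, unprotected)

def train_purifier(bridge_model, pairs: TensorDataset, epochs=10, lr=1e-4):
    # Step (2), simplified: learn a mapping from protected images back to
    # their unprotected counterparts; applying it to unseen protected data
    # then strips the protection.
    opt = torch.optim.Adam(bridge_model.parameters(), lr=lr)
    for _ in range(epochs):
        for protected, clean in DataLoader(pairs, batch_size=32, shuffle=True):
            loss = torch.nn.functional.mse_loss(bridge_model(protected), clean)
            opt.zero_grad(); loss.backward(); opt.step()
    return bridge_model
```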

TMLR Journal 2025 Journal Article

MUC: Machine Unlearning for Contrastive Learning with Black-box Evaluation

  • Yihan Wang
  • Yiwei Lu
  • Guojun Zhang
  • Franziska Boenisch
  • Adam Dziedzic
  • Yaoliang Yu
  • Xiao-Shan Gao

Machine unlearning offers effective solutions for revoking the influence of specific training data on pre-trained model parameters. While existing approaches address unlearning for classification and generative models, they overlook an important category of machine learning models: contrastive learning (CL) methods. This paper addresses this gap by introducing the Machine Unlearning for Contrastive Learning (MUC) framework and adapting existing methods. We identify limitations in current approaches, noting that several methods perform inadequately as unlearners and that existing evaluation tools insufficiently validate unlearning effects in contrastive learning. To address these issues, we propose Alignment Calibration (AC), a novel method that explicitly considers contrastive learning properties and optimizes towards new auditing metrics for easy verification of unlearning. Through empirical comparisons with baseline methods on SimCLR, MoCo, and CLIP, we demonstrate that AC: (1) achieves state-of-the-art performance, approximating exact unlearning (retraining); (2) enables data owners to clearly visualize unlearning effects through black-box evaluation. The code is available at https://github.com/EhanW/Alignment-Calibration.
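
The abstract's auditing metrics are not specified; one standard contrastive quantity that fits the black-box setting is alignment between two augmented views, shown below as an illustrative metric in the spirit of the abstract, not the paper's exact definition.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def alignment(encoder, view1, view2):
    # Mean cosine similarity between embeddings of two augmented views of the
    # same images. Data the encoder has fit well tends to have high alignment,
    # so a drop after unlearning is visible with only black-box embedding access.
    z1 = F.normalize(encoder(view1), dim=-1)
    z2 = F.normalize(encoder(view2), dim=-1)
    return (z1 * z2).sum(dim=-1).mean().item()
```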

YNIMG Journal 2025 Journal Article

Neuroplastic differentiation in motor cortex subregions induced by basketball training: A multimodal diffusion MRI investigation

  • Wenshuang Tang
  • Yihan Wang
  • Yapeng Qi
  • Wenxuan Fang
  • Xinwei Li
  • Bowen Liu
  • Jilan Ning
  • Jiaxin Du

The primary motor cortex (M1) contains two functionally distinct subregions: effector subregions, responsible for fine motor control, and inter-effector subregions, involved in mind-body coordination and movement planning. However, the impact of long-term exercise training on subregion-specific microstructural plasticity in M1 remains unclear. In this study, thirty-four elite basketball athletes and thirty-five age- and gender-matched non-athletes were included in the analysis. All participants underwent T1-weighted imaging and diffusion MRI scanning. Probabilistic fiber tracking was employed to delineate distinct subregions within the M1. Diffusion MRI techniques, including diffusion tensor imaging, diffusion kurtosis imaging, and neurite orientation dispersion and density imaging, were employed to assess microstructural differences. The athletes' cognitive-motor integration performance was assessed with the swimmy paradigm. Our results indicate that, compared to non-athletes, basketball athletes exhibited significantly decreased mean diffusivity (MD), axial diffusivity (AD), radial diffusivity (RD), orientation dispersion index (ODI), and free water fraction (FWF), and significantly increased axial kurtosis (AK) and neurite density index (NDI) in the inter-effector subregions of M1. No significant differences were observed in the effector-specific subregions. Correlation analyses revealed that the difference in reaction times was negatively correlated with MD/AD/RD and FWF, and positively correlated with NDI in the inter-effector subregions. These findings suggest that basketball training induces region-specific microstructural changes in M1, primarily in the inter-effector subregions, which are closely linked to cognitive-motor integration performance. The neuroplastic mechanisms induced by basketball training, as revealed in elite athletes, provide a rationale for exploring sport-based neuromodulatory interventions to optimize cognitive-motor rehabilitation.

NeurIPS Conference 2025 Conference Paper

On the Loss of Context Awareness in General Instruction Fine-tuning

  • Yihan Wang
  • Andrew Bai
  • Nanyun Peng
  • Cho-Jui Hsieh

Pre-trained Large Language Models (LLMs) require post-training methods such as supervised fine-tuning (SFT) on instruction-response pairs to enable instruction following. However, this process can cause forgetting of capabilities learned during pre-training. In this paper, we investigate the loss of context awareness after SFT, where context awareness is defined as the ability to extract and understand information from user-provided context and respond accordingly. Surprisingly, we discovered that the loss of context awareness occurs in instruction fine-tuned LLMs when the chat template is applied to input prompts. We identify that the performance decline is associated with a bias toward different roles learned during conversational instruction fine-tuning. The bias can be traced to training samples where the assistant response minimally relies on the user-provided instruction. Based on these observations, we propose a metric to identify context-dependent examples from general instruction fine-tuning datasets. We then apply conditional instruction fine-tuning with a context-dependency indicator, enabling the model to preserve context awareness after SFT. Experiments on four context-dependent downstream tasks and three pre-trained LLMs of different sizes show that our method effectively mitigates the loss of context awareness without compromising general instruction-following capabilities.
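
The paper's metric is not given in the abstract; one plausible proxy (a hypothetical stand-in, not the paper's formula, assuming an HF-style causal LM and tokenizer) is the gain in response log-likelihood when the context is present:

```python
import torch

@torch.no_grad()
def context_dependency_score(model, tokenizer, context, instruction, response):
    # Hypothetical metric: how much more likely the response becomes when the
    # user-provided context is prepended to the instruction.
    def avg_logprob(prompt):
        ids = tokenizer(prompt + response, return_tensors="pt").input_ids
        prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
        logits = model(ids).logits[:, :-1]
        targets = ids[:, 1:]
        lp = torch.log_softmax(logits, dim=-1).gather(-1, targets.unsqueeze(-1)).squeeze(-1)
        return lp[:, prompt_len - 1:].mean().item()  # log-prob of response tokens only
    return avg_logprob(context + "\n" + instruction) - avg_logprob(instruction)
```

A high score would mark an example as context-dependent, which could then serve as the conditioning indicator during fine-tuning.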

ICLR Conference 2025 Conference Paper

Provable Robust Overfitting Mitigation in Wasserstein Distributionally Robust Optimization

  • Shuang Liu
  • Yihan Wang
  • Yifan Zhu
  • Yibo Miao
  • Xiao-Shan Gao

Wasserstein distributionally robust optimization (WDRO) optimizes against worst-case distributional shifts within a specified uncertainty set, leading to enhanced generalization on unseen adversarial examples compared to standard adversarial training, which focuses on pointwise adversarial perturbations. However, WDRO still suffers fundamentally from the robust overfitting problem, as it does not consider statistical error. We address this gap by proposing a novel robust optimization framework, called Statistically Robust WDRO, under a new uncertainty set that captures adversarial noise via Wasserstein distance and statistical error via Kullback-Leibler divergence. We establish a robust generalization bound for the new optimization framework, implying that out-of-distribution adversarial performance is at least as good as the statistically robust training loss with high probability. Furthermore, we derive conditions under which Stackelberg and Nash equilibria exist between the learner and the adversary, giving an optimal robust model in a certain sense. Finally, through extensive experiments, we demonstrate that our method significantly mitigates robust overfitting and enhances robustness within the framework of WDRO.
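
For reference, standard WDRO minimizes the worst-case loss over a Wasserstein ball around the empirical distribution; the abstract's combined uncertainty set plausibly layers a KL ball (statistical error) on top of the Wasserstein ball (adversarial noise), along the lines of the second line below. The exact formulation is in the paper.

```latex
% Standard WDRO: worst case over a Wasserstein ball around the empirical distribution
\min_{\theta} \; \sup_{Q:\, W(Q,\, \widehat{P}_n) \le \rho} \; \mathbb{E}_{z \sim Q}\big[\ell(\theta; z)\big]

% A plausible form of the combined uncertainty set (an assumption; see the paper)
\min_{\theta} \; \sup_{\substack{Q,\,Q':\; W(Q',\, \widehat{P}_n) \le \rho_1 \\ D_{\mathrm{KL}}(Q \,\|\, Q') \le \rho_2}} \; \mathbb{E}_{z \sim Q}\big[\ell(\theta; z)\big]
```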

NeurIPS Conference 2025 Conference Paper

QiMeng-NeuComBack: Self-Evolving Translation from IR to Assembly Code

  • Hainan Fang
  • Yuanbo Wen
  • Jun Bi
  • Yihan Wang
  • Tonghui He
  • Yanlin Tang
  • Di Huang
  • Jiaming Guo

Compilers, while essential, are notoriously complex systems that demand prohibitively expensive human expertise to develop and maintain. The recent advancements in Large Language Models (LLMs) offer a compelling new paradigm: Neural Compilation, which could potentially simplify compiler development for new architectures and facilitate the discovery of innovative optimization techniques. However, several critical obstacles impede its practical adoption. Firstly, a significant lack of dedicated benchmarks and robust evaluation methodologies hinders objective assessment and tracking of progress in the field. Secondly, systematically enhancing the reliability and performance of LLM-generated assembly remains a critical challenge. Addressing these challenges, this paper introduces NeuComBack, a novel benchmark dataset specifically designed for IR-to-assembly compilation. Leveraging this dataset, we first define a foundational Neural Compilation workflow and conduct a comprehensive evaluation of the capabilities of recent frontier LLMs on Neural Compilation, establishing new performance baselines. We further propose a self-evolving prompt optimization method that enables LLMs to iteratively evolve their internal prompt strategies by extracting insights from prior self-debugging traces, thereby enhancing their neural compilation capabilities. Experiments demonstrate that our method significantly improves both the functional correctness and the performance of LLM-generated assembly code. Compared to baseline prompts, the functional correctness rates improved from 44% to 64% on x86 64 and from 36% to 58% on aarch64, respectively. More significantly, among the 16 correctly generated x86 64 programs using our method, 14 (87. 5%) surpassed clang-O3 performance. These consistent improvements across diverse architectures (x86_64 and aarch64) and program distributions (NeuComBack L1 and L2) validate our method's superiority over conventional approaches and its potential for broader adoption in low-level neural compilation.
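
A minimal sketch of the self-evolving loop the abstract describes; `llm.generate`, `test_suite.run`, and the `{ir}`/`{insights}` prompt slots are assumed interfaces, and the control flow is an illustration rather than the paper's exact method.

```python
def self_evolving_compile(llm, ir_program, test_suite, prompt, max_rounds=5):
    # Iteratively translate IR to assembly, and on failure distill a reusable
    # lesson from the debug trace back into the prompt for the next attempt.
    insights = []
    for _ in range(max_rounds):
        asm = llm.generate(prompt.format(ir=ir_program, insights="\n".join(insights)))
        ok, trace = test_suite.run(asm)  # assemble, link, and run the tests
        if ok:
            return asm
        insights.append(llm.generate(f"Summarize the root cause as a rule:\n{trace}"))
    return None
```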

NeurIPS Conference 2024 Conference Paper

Efficient Availability Attacks against Supervised and Contrastive Learning Simultaneously

  • Yihan Wang
  • Yifan Zhu
  • Xiao-Shan Gao

Availability attacks provide a tool to prevent the unauthorized use of private data and commercial datasets by generating imperceptible noise and crafting unlearnable examples before release. Ideally, the obtained unlearnability can prevent algorithms from training usable models. When supervised learning (SL) algorithms fail, a malicious data collector may resort to contrastive learning (CL) algorithms to bypass the protection. Through evaluation, we have found that most existing methods are unable to achieve both supervised and contrastive unlearnability, which poses risks to data protection by availability attacks. Unlike recent methods based on contrastive learning, we employ contrastive-like data augmentations in supervised learning frameworks to obtain attacks effective for both SL and CL. Our proposed AUE and AAP attacks achieve state-of-the-art worst-case unlearnability across SL and CL algorithms with lower computational cost, showing promise for real-world applications. The code is available at https://github.com/EhanW/AUE-AAP.
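
The core idea, as stated, is to optimize error-minimizing noise inside a supervised loop while applying contrastive-style augmentations. A rough sketch of the noise-update step under that reading (the augmentation recipe, step sizes, and loop structure are assumptions):

```python
import torch
import torchvision.transforms as T

# Contrastive-style augmentations applied inside a supervised poisoning loop.
aug = T.Compose([
    T.RandomResizedCrop(32, scale=(0.2, 1.0)),
    T.RandomHorizontalFlip(),
    T.ColorJitter(0.4, 0.4, 0.4, 0.1),
    T.RandomGrayscale(p=0.2),
])

def update_noise(model, x, y, delta, eps=8/255, alpha=1/255, steps=5):
    # Error-minimizing noise under strong augmentations: the perturbation is
    # optimized so that augmented poisoned images become trivially classifiable,
    # leaving nothing useful for either SL or CL training.
    for _ in range(steps):
        delta = delta.detach().requires_grad_(True)
        loss = torch.nn.functional.cross_entropy(model(aug(x + delta)), y)
        loss.backward()
        delta = (delta - alpha * delta.grad.sign()).clamp(-eps, eps)  # descend the loss
    return delta.detach()
```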

AAAI Conference 2024 Conference Paper

Game-Theoretic Unlearnable Example Generator

  • Shuang Liu
  • Yihan Wang
  • Xiao-Shan Gao

Unlearnable example attacks are data poisoning attacks aiming to degrade the clean test accuracy of deep learning by adding imperceptible perturbations to the training samples, which can be formulated as a bi-level optimization problem. However, directly solving this optimization problem is intractable for deep neural networks. In this paper, we investigate unlearnable example attacks from a game-theoretic perspective, by formulating the attack as a nonzero-sum Stackelberg game. First, the existence of game equilibria is proved under both the normal setting and the adversarial training setting. It is shown that the game equilibrium gives the most powerful poison attack, in that the victim has the lowest test accuracy among all networks within the same hypothesis space when certain loss functions are used. Second, we propose a novel attack method, called the Game Unlearnable Example (GUE), which has three main ingredients. (1) The poisons are obtained by directly solving the equilibrium of the Stackelberg game with a first-order algorithm. (2) We employ an autoencoder-like generative network model as the poison attacker. (3) A novel payoff function is introduced to evaluate the performance of the poison. Comprehensive experiments demonstrate that GUE can effectively poison the model in various scenarios. Furthermore, GUE still works when using a relatively small percentage of the training data to train the generator, and the poison generator generalizes well to unseen data. Our implementation code can be found at https://github.com/hong-xian/gue.

ICLR Conference 2024 Conference Paper

Two-stage LLM Fine-tuning with Less Specialization and More Generalization

  • Yihan Wang
  • Si Si
  • Daliang Li
  • Michal Lukasik
  • Felix X. Yu
  • Cho-Jui Hsieh
  • Inderjit S. Dhillon
  • Sanjiv Kumar

Pretrained large language models (LLMs) are general-purpose problem solvers applicable to a diverse set of tasks with prompts. They can be further improved towards a specific task by fine-tuning on a specialized dataset. However, fine-tuning usually makes the model narrowly specialized on this dataset with reduced general in-context learning performance, which is undesirable whenever the fine-tuned model needs to handle additional tasks where no fine-tuning data is available. In this work, we first demonstrate that fine-tuning on a single task indeed decreases LLMs' general in-context learning performance. We discover one important cause of such forgetting, format specialization, where the model overfits to the format of the fine-tuned task. We further show that format specialization happens at the very beginning of fine-tuning. To solve this problem, we propose Prompt Tuning with MOdel Tuning (ProMoT), a simple yet effective two-stage fine-tuning framework that reduces format specialization and improves generalization. ProMoT offloads task-specific format learning into additional and removable parameters by first doing prompt tuning and then fine-tuning the model itself with this soft prompt attached. With experiments on several fine-tuning tasks and 8 in-context evaluation tasks, we show that ProMoT achieves performance on fine-tuned tasks comparable to standard fine-tuning, but with much less loss of in-context learning performance across a broad range of out-of-domain evaluation tasks. More importantly, ProMoT can even enhance generalization on in-context learning tasks that are semantically related to the fine-tuned task, e.g., ProMoT on En-Fr translation significantly improves performance on other language pairs, and ProMoT on NLI improves performance on summarization. Experiments also show that ProMoT can improve the generalization performance of multi-task training.
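
A minimal sketch of the two-stage recipe, assuming an HF-style model that accepts `inputs_embeds`; the `SoftPrompt` module and the training loop are illustrative simplifications, not the paper's implementation.

```python
import torch

class SoftPrompt(torch.nn.Module):
    # Trainable embeddings prepended to the input sequence (minimal version).
    def __init__(self, n_tokens, d_model):
        super().__init__()
        self.emb = torch.nn.Parameter(torch.randn(n_tokens, d_model) * 0.01)

    def forward(self, input_embs):  # input_embs: (batch, seq, d_model)
        prefix = self.emb.unsqueeze(0).expand(input_embs.shape[0], -1, -1)
        return torch.cat([prefix, input_embs], dim=1)

def promot_stages(model, soft_prompt, loss_fn, batches, lr=1e-4):
    # Stage 1: prompt tuning -- freeze the model, train only the soft prompt.
    # Stage 2: freeze the learned prompt, fine-tune the model with it attached,
    # so task-specific format learning stays in the removable prompt parameters.
    for train_prompt, train_model in [(True, False), (False, True)]:
        for p in model.parameters():
            p.requires_grad_(train_model)
        soft_prompt.emb.requires_grad_(train_prompt)
        params = [soft_prompt.emb] if train_prompt else model.parameters()
        opt = torch.optim.Adam(params, lr=lr)
        for embs, labels in batches:
            loss = loss_fn(model(inputs_embeds=soft_prompt(embs)), labels)
            opt.zero_grad(); loss.backward(); opt.step()
```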

ICML Conference 2023 Conference Paper

Adversarial Parameter Attack on Deep Neural Networks

  • Lijia Yu
  • Yihan Wang
  • Xiao-Shan Gao

The parameter perturbation attack is a safety threat to deep learning, in which small parameter perturbations are made such that the attacked network gives wrong labels, or labels desired by the adversary, on specified inputs. However, such attacks can be detected by the user, because the accuracy of the attacked network drops and the network cannot work normally. To make the attack more stealthy, in this paper the adversarial parameter attack is proposed, in which small perturbations to the parameters of the network are made such that the accuracy of the attacked network does not decrease much, but its robustness against adversarial example attacks becomes much lower. As a consequence, the attacked network performs normally on standard samples, but is much more vulnerable to adversarial attacks. The existence of nearly perfect adversarial parameters under the $L_\infty$ norm and $L_0$ norm is proved under reasonable conditions. Algorithms are given that produce high-quality adversarial parameters for commonly used networks trained with various robust training methods, in that the robustness of the attacked networks decreases significantly when they are evaluated using various adversarial attack methods.
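
A sketch of the stated objective as a single first-order step (hypothetical; not the paper's algorithms): lower the loss on clean samples while raising it on adversarial examples, keeping the weights inside an $L_\infty$ ball around the original values.

```python
import torch

def adversarial_parameter_step(model, ref_params, clean_batch, adv_batch,
                               eps=0.01, alpha=1e-3):
    # clean_batch/adv_batch: (inputs, labels); adv_batch holds adversarial
    # examples generated elsewhere. ref_params are the original weights.
    x_c, y_c = clean_batch
    x_a, y_a = adv_batch
    loss = torch.nn.functional.cross_entropy(model(x_c), y_c) \
         - torch.nn.functional.cross_entropy(model(x_a), y_a)
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for p, p0 in zip(model.parameters(), ref_params):
            # Signed descent on the combined loss, projected back into the ball.
            step = p - alpha * p.grad.sign()
            p.copy_(torch.maximum(torch.minimum(step, p0 + eps), p0 - eps))
```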

NeurIPS Conference 2023 Conference Paper

Universality and Limitations of Prompt Tuning

  • Yihan Wang
  • Jatin Chauhan
  • Wei Wang
  • Cho-Jui Hsieh

Despite the demonstrated empirical efficacy of prompt tuning for adapting a pretrained language model to a new task, the theoretical underpinnings of the difference between "tuning parameters before the input" and "tuning the model weights" remain limited. We thus take one of the first steps toward understanding the role of soft-prompt tuning for transformer-based architectures. Considering a general-purpose architecture, we analyze prompt tuning through the lens of both universal approximation and its limitations with finite-depth, fixed-weight pretrained transformers for continuous-valued functions. Our universality result guarantees the existence of a strong transformer with a prompt to approximate any sequence-to-sequence function in the set of Lipschitz functions. The limitations of prompt tuning for limited-depth transformers are first proved by constructing a set of datasets that cannot be memorized by a prompt of any length for a given single encoder layer. We also provide a lower bound on the required number of tunable prompt parameters and compare the result with the number of parameters required for a low-rank update (based on LoRA) in a single-layer setting. We finally extend our analysis to multi-layer settings by providing sufficient conditions under which the transformer can at best learn datasets from invertible functions only. Our theoretical claims are also corroborated by empirical results.
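
A schematic reading of the universality claim (notation assumed, not the paper's exact statement): there is a single fixed-weight transformer $g$ such that every target in the Lipschitz class is reachable by choosing a prompt alone, where $[P_f; X]$ denotes the prompt prepended to the input sequence.

```latex
\exists\, g \;\; \forall f \in \mathcal{F}_{\mathrm{Lip}} \;\; \exists\, P_f :
\quad \sup_{X} \big\lVert g([P_f; X]) - f(X) \big\rVert \;\le\; \epsilon
```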

ICML Conference 2022 Conference Paper

A Branch and Bound Framework for Stronger Adversarial Attacks of ReLU Networks

  • Huan Zhang 0001
  • Shiqi Wang 0002
  • Kaidi Xu
  • Yihan Wang
  • Suman Jana
  • Cho-Jui Hsieh
  • J. Zico Kolter

Strong adversarial attacks are important for evaluating the true robustness of deep neural networks. Most existing attacks search in the input space, e.g., using gradient descent, and may miss adversarial examples due to non-convexity. In this work, we systematically search for adversarial examples in the activation space of ReLU networks to tackle hard instances where none of the existing adversarial attacks succeed. Unfortunately, searching the activation space typically relies on generic mixed integer programming (MIP) solvers and is limited to small networks and easy problem instances. To improve scalability and practicability, we use branch and bound (BaB) with specialized GPU-based bound propagation methods, and propose a top-down beam-search approach to quickly identify the subspace that may contain adversarial examples. Moreover, we build a pool of adversarial candidates using cheap attacks to further assist the search in activation space via diving techniques and a bottom-up large neighborhood search. Our adversarial attack framework, BaB-Attack, opens up a new opportunity for designing novel adversarial attacks not limited to searching the input space, and enables us to borrow techniques from integer programming theory and neural network verification. In experiments, we can successfully generate adversarial examples when existing attacks on the input space fail. Compared to off-the-shelf MIP-solver-based attacks that require significant computation, our framework outperforms them in both success rate and efficiency.

NeurIPS Conference 2022 Conference Paper

Efficiently Computing Local Lipschitz Constants of Neural Networks via Bound Propagation

  • Zhouxing Shi
  • Yihan Wang
  • Huan Zhang
  • J. Zico Kolter
  • Cho-Jui Hsieh

Lipschitz constants are connected to many properties of neural networks, such as robustness, fairness, and generalization. Existing methods for computing Lipschitz constants either produce relatively loose upper bounds or are limited to small networks. In this paper, we develop an efficient framework for computing the $\ell_\infty$ local Lipschitz constant of a neural network by tightly upper bounding the norm of the Clarke Jacobian via linear bound propagation. We formulate the computation of local Lipschitz constants as a linear bound propagation process on a high-order backward graph induced by the chain rule of the Clarke Jacobian. To enable linear bound propagation, we derive tight linear relaxations for specific nonlinearities in the Clarke Jacobian. This formulation unifies existing ad-hoc approaches such as RecurJac, which can be seen as a special case of ours with weaker relaxations. The bound propagation framework also allows us to easily borrow the popular Branch-and-Bound (BaB) approach from neural network verification to further tighten Lipschitz constants. Experiments show that on tiny models, our method produces bounds comparable to exact methods that cannot scale to slightly larger models; on larger models, our method efficiently produces tighter results than existing relaxed or naive methods, and it scales to much larger practical models that previous works could not handle. We also demonstrate an application to provable monotonicity analysis. Code is available at https://github.com/shizhouxing/Local-Lipschitz-Constants.
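
For reference, the quantity being bounded is the local Lipschitz constant over an $\ell_\infty$ ball, which is upper bounded by the worst-case induced norm of the Clarke Jacobian $J(x)$ over that ball:

```latex
\mathrm{Lip}_{\infty}(f; x_0, \epsilon)
  = \sup_{\substack{x, x' \in B_{\infty}(x_0, \epsilon) \\ x \neq x'}}
    \frac{\lVert f(x) - f(x') \rVert_{\infty}}{\lVert x - x' \rVert_{\infty}}
  \;\le\; \sup_{x \in B_{\infty}(x_0, \epsilon)} \lVert J(x) \rVert_{\infty}
```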

ICLR Conference 2022 Conference Paper

On the Convergence of Certified Robust Training with Interval Bound Propagation

  • Yihan Wang
  • Zhouxing Shi
  • Quanquan Gu
  • Cho-Jui Hsieh

Interval Bound Propagation (IBP) underlies state-of-the-art methods for training neural networks with certifiable robustness guarantees against potential adversarial perturbations, yet the convergence of IBP training remains unexplored in the existing literature. In this paper, we present a theoretical analysis of the convergence of IBP training. Under an overparameterization assumption, we analyze the convergence of IBP robust training. We show that when using IBP training to train a randomly initialized two-layer ReLU neural network with logistic loss, gradient descent can linearly converge to zero robust training error with high probability if the perturbation radius is sufficiently small and the network width is sufficiently large.
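
For context, IBP propagates elementwise lower/upper bounds through the network instead of single activations. A minimal sketch for linear and ReLU layers, using the standard interval arithmetic (this is background, not the paper's analysis):

```python
import torch

def ibp_forward(layers, x_lb, x_ub):
    # Propagate interval bounds [lb, ub] through a list of Linear/ReLU layers.
    lb, ub = x_lb, x_ub
    for layer in layers:
        if isinstance(layer, torch.nn.Linear):
            W, b = layer.weight, layer.bias
            center, radius = (lb + ub) / 2, (ub - lb) / 2
            new_center = center @ W.T + b
            new_radius = radius @ W.abs().T   # |W| maps the interval radius
            lb, ub = new_center - new_radius, new_center + new_radius
        elif isinstance(layer, torch.nn.ReLU):
            lb, ub = lb.clamp(min=0), ub.clamp(min=0)  # ReLU is monotone
    return lb, ub
```

Certified training then minimizes a loss on the worst-case logits implied by these bounds rather than on the clean logits.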

ICLR Conference 2021 Conference Paper

DOP: Off-Policy Multi-Agent Decomposed Policy Gradients

  • Yihan Wang
  • Beining Han
  • Tonghan Wang 0001
  • Heng Dong 0001
  • Chongjie Zhang

Multi-agent policy gradient (MAPG) methods have recently witnessed vigorous progress. However, there is a significant performance discrepancy between MAPG methods and state-of-the-art multi-agent value-based approaches. In this paper, we investigate the causes that hinder the performance of MAPG algorithms and present a multi-agent decomposed policy gradient method (DOP). This method introduces the idea of value function decomposition into the multi-agent actor-critic framework. Based on this idea, DOP supports efficient off-policy learning and addresses the issues of centralized-decentralized mismatch and credit assignment in both discrete and continuous action spaces. We formally show that DOP critics have sufficient representational capability to guarantee convergence. In addition, empirical evaluations on the StarCraft II micromanagement benchmark and multi-agent particle environments demonstrate that DOP outperforms both state-of-the-art value-based and policy-based multi-agent reinforcement learning algorithms. Demonstrative videos are available at https://sites.google.com/view/dop-mapg/.
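
The key ingredient is a centralized critic decomposed into per-agent critics. A minimal sketch of that decomposition (a simplification, not the full DOP algorithm; the `abs` device for non-negative mixing weights is an assumption):

```python
import torch
import torch.nn as nn

class DecomposedCritic(nn.Module):
    # Q_tot(s, a) = sum_i k_i(s) * Q_i(s, a_i) + b(s), with k_i >= 0, so each
    # agent's local critic contributes monotonically to the joint value.
    def __init__(self, state_dim, n_agents, n_actions, hidden=64):
        super().__init__()
        self.local_qs = nn.ModuleList(
            nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(),
                          nn.Linear(hidden, n_actions))
            for _ in range(n_agents))
        self.mixer = nn.Linear(state_dim, n_agents + 1)  # k_1..k_n and bias b

    def forward(self, state, actions):  # actions: (batch, n_agents) long
        qs = torch.stack([q(state).gather(1, actions[:, i:i+1]).squeeze(1)
                          for i, q in enumerate(self.local_qs)], dim=1)
        coef = self.mixer(state)
        k, b = coef[:, :-1].abs(), coef[:, -1]  # non-negative mixing weights
        return (k * qs).sum(dim=1) + b
```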

ICLR Conference 2021 Conference Paper

Fast and Complete: Enabling Complete Neural Network Verification with Rapid and Massively Parallel Incomplete Verifiers

  • Kaidi Xu
  • Huan Zhang 0001
  • Shiqi Wang 0002
  • Yihan Wang
  • Suman Jana
  • Xue Lin 0001
  • Cho-Jui Hsieh

Formal verification of neural networks (NNs) is a challenging and important problem. Existing efficient complete solvers typically require the branch-and-bound (BaB) process, which splits the problem domain into sub-domains and solves each sub-domain using faster but weaker incomplete verifiers, such as Linear Programming (LP) on linearly relaxed sub-domains. In this paper, we propose to use backward-mode linear relaxation based perturbation analysis (LiRPA) to replace LP during the BaB process, which can be efficiently implemented on typical machine learning accelerators such as GPUs and TPUs. However, unlike LP, naively applied LiRPA can produce much weaker bounds and cannot even check certain conflicts between sub-domains during splitting, making the entire procedure incomplete after BaB. To address these challenges, we apply a fast gradient-based bound-tightening procedure combined with batch splits and a design that minimizes usage of the LP bounding procedure, enabling us to effectively use LiRPA on accelerator hardware for the challenging complete NN verification problem and significantly outperform LP-based approaches. On a single GPU, we demonstrate an order of magnitude speedup compared to existing LP-based approaches.
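
For context, the generic BaB loop the abstract builds on looks like the following sketch, where `bound_fn` is the fast incomplete verifier (LiRPA here, LP in prior work) returning a lower bound on the output margin, and `split_fn` partitions a sub-domain; this is the standard scheme, not the paper's specific batching and tightening machinery.

```python
import heapq, itertools

def bab_verify(bound_fn, split_fn, root, max_iters=1000):
    # Keep sub-domains whose lower bound is still below 0 (undecided),
    # split the worst one, and re-bound the children.
    counter = itertools.count()  # tiebreaker so the heap never compares domains
    heap = [(bound_fn(root), next(counter), root)]
    for _ in range(max_iters):
        if not heap:
            return "verified"      # every sub-domain proved safe
        lb, _, dom = heapq.heappop(heap)
        if lb >= 0:
            return "verified"      # heap is ordered: all remaining bounds >= lb
        for child in split_fn(dom):
            clb = bound_fn(child)
            if clb < 0:
                heapq.heappush(heap, (clb, next(counter), child))
    return "unknown"
```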

NeurIPS Conference 2021 Conference Paper

Fast Certified Robust Training with Short Warmup

  • Zhouxing Shi
  • Yihan Wang
  • Huan Zhang
  • Jinfeng Yi
  • Cho-Jui Hsieh

Recently, bound propagation based certified robust training methods have been proposed for training neural networks with certifiable robustness guarantees. Although state-of-the-art (SOTA) methods, including interval bound propagation (IBP) and CROWN-IBP, have per-batch training complexity similar to standard neural network training, they usually use a long warmup schedule of hundreds or thousands of epochs to reach SOTA performance and are thus still costly. In this paper, we identify two important issues in existing methods, namely exploded bounds at initialization and imbalanced ReLU activation states, and improve IBP training. These two issues make certified training difficult and unstable, which is why long warmup schedules were needed in prior works. To mitigate these issues and conduct faster certified training with a shorter warmup, we propose three improvements to IBP training: 1) we derive a new weight initialization method for IBP training; 2) we propose to fully add Batch Normalization (BN) to each layer in the model, since we find BN can reduce the imbalance in ReLU activation states; 3) we also design regularization to explicitly tighten certified bounds and balance ReLU activation states during warmup. We are able to obtain 65.03% verified error on CIFAR-10 ($\epsilon=\frac{8}{255}$) and 82.36% verified error on TinyImageNet ($\epsilon=\frac{1}{255}$) using very short training schedules (160 and 80 total epochs, respectively), outperforming the literature SOTA trained with hundreds or thousands of epochs under the same network architecture. The code is available at https://github.com/shizhouxing/Fast-Certified-Robust-Training.

NeurIPS Conference 2020 Conference Paper

Automatic Perturbation Analysis for Scalable Certified Robustness and Beyond

  • Kaidi Xu
  • Zhouxing Shi
  • Huan Zhang
  • Yihan Wang
  • Kai-Wei Chang
  • Minlie Huang
  • Bhavya Kailkhura
  • Xue Lin

Linear relaxation based perturbation analysis (LiRPA) for neural networks, which computes provable linear bounds of output neurons given a certain amount of input perturbation, has become a core component in robustness verification and certified defense. The majority of LiRPA-based methods focus on simple feed-forward networks and need particular manual derivations and implementations when extended to other architectures. In this paper, we develop an automatic framework to enable perturbation analysis on any neural network structure, by generalizing existing LiRPA algorithms such as CROWN to operate on general computational graphs. The flexibility, differentiability and ease of use of our framework allow us to obtain state-of-the-art results on LiRPA-based certified defense for fairly complicated networks, like DenseNet, ResNeXt and Transformer, that are not supported by prior works. Our framework also enables loss fusion, a technique that significantly reduces the computational complexity of LiRPA for certified defense. For the first time, we demonstrate LiRPA-based certified defense on Tiny ImageNet and Downscaled ImageNet, to which previous approaches cannot scale due to the relatively large number of classes. Our work also yields an open-source library for the community to apply LiRPA to areas beyond certified defense without much LiRPA expertise, e.g., we create a neural network with a provably flat optimization landscape by applying LiRPA to network parameters. Our open-source library is available at https://github.com/KaidiXu/auto_LiRPA.
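
The library's basic usage, abbreviated from the repository's documented example (check the repo for current signatures): wrap any `torch.nn.Module`, declare an $\ell_\infty$ perturbation on the input, and request certified output bounds.

```python
import torch
import torch.nn as nn
from auto_LiRPA import BoundedModule, BoundedTensor
from auto_LiRPA.perturbations import PerturbationLpNorm

# A toy classifier standing in for any supported network.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64),
                      nn.ReLU(), nn.Linear(64, 10))
x = torch.randn(1, 3, 32, 32)

bounded_model = BoundedModule(model, torch.empty_like(x))
ptb = PerturbationLpNorm(norm=float("inf"), eps=8 / 255)
bounded_x = BoundedTensor(x, ptb)
lb, ub = bounded_model.compute_bounds(x=(bounded_x,), method="backward")
# lb/ub are provable elementwise bounds on the 10 output logits.
```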

ICML Conference 2020 Conference Paper

On Lp-norm Robustness of Ensemble Decision Stumps and Trees

  • Yihan Wang
  • Huan Zhang 0001
  • Hongge Chen
  • Duane S. Boning
  • Cho-Jui Hsieh

Recent papers have demonstrated that ensemble stumps and trees can be vulnerable to small input perturbations, so robustness verification and defense for those models have become an important research problem. However, due to the structure of decision trees, where each node makes a decision based purely on one feature value, all previous works only consider the $\ell_\infty$ norm perturbation. To study robustness with respect to a general $\ell_p$ norm perturbation, one has to consider the correlation between perturbations on different features, which has not been handled by previous algorithms. In this paper, we study the problem of robustness verification and certified defense with respect to general $\ell_p$ norm perturbations for ensemble decision stumps and trees. For robustness verification of ensemble stumps, we prove that complete verification is NP-complete for $p\in(0, \infty)$, while polynomial time algorithms exist for $p=0$ or $\infty$. For $p\in(0, \infty)$ we develop an efficient dynamic programming based algorithm for sound verification of ensemble stumps. For ensemble trees, we generalize the previous multi-level robustness verification algorithm to the $\ell_p$ norm. We demonstrate the first certified defense method for training ensemble stumps and trees with respect to $\ell_p$ norm perturbations, and verify its effectiveness empirically on real datasets.
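
The DP structure the abstract exploits is that stumps read features independently, so the attacker only has to allocate the total $\ell_p$ budget ($\sum_i |\delta_i|^p \le \epsilon^p$) across features. A sketch of such a budget-allocation DP over a discretized budget; the function names and discretization are illustrative, not the paper's exact algorithm.

```python
import numpy as np

def verify_stumps_lp(per_feature_cost_to_margin, budget_p, grid=100):
    # per_feature_cost_to_margin[i](c): worst-case (most negative) margin
    # change achievable by spending cost c (in |delta_i|^p units) on feature i.
    # best[b] = worst-case total margin change using a budget of b * step.
    step = budget_p / grid
    best = np.zeros(grid + 1)
    for margin_fn in per_feature_cost_to_margin:
        new_best = np.full(grid + 1, np.inf)
        for b in range(grid + 1):
            for spend in range(b + 1):
                new_best[b] = min(new_best[b],
                                  best[b - spend] + margin_fn(spend * step))
        best = new_best
    return best[grid]  # margin + best[grid] <= 0 means the prediction may flip
```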