TMLR 2024
Standard-Deviation-Inspired Regularization for Improving Adversarial Robustness
Abstract
Adversarial Training (AT) has been demonstrated to improve the robustness of deep neural networks (DNNs) to adversarial attacks. AT is a min-max optimization procedure wherein adversarial examples are generated to train a robust DNN. The inner maximization step of AT maximizes the losses of inputs w.r.t. their actual classes. The outer minimization involves minimizing the losses on the adversarial examples obtained from the inner maximization. This work proposes a standard-deviation-inspired (SDI) regularization term for improving adversarial robustness and generalization. We argue that the inner maximization is akin to minimizing a modified standard deviation of a model's output probabilities. Moreover, we argue that maximizing the modified standard deviation measure may complement the outer minimization of the AT framework. To corroborate our argument, we experimentally show that the SDI measure may be utilized to craft adversarial examples. Furthermore, we show that combining the proposed SDI regularization term with existing AT variants improves the robustness of DNNs to stronger attacks (e.g., CW and AutoAttack) and improves robust generalization.
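The abstract does not give the exact form of the "modified standard deviation" measure, so the following is only a minimal sketch of the idea: the plain standard deviation of the softmax output probabilities is used as a stand-in for the SDI term, and it is subtracted from the cross-entropy objective so that minimizing the total loss encourages a sharper (higher-deviation) output distribution. The function names and the weight `lam` are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def sdi_regularized_loss(logits, labels, lam=1.0):
    """Cross-entropy minus an SDI-style term (illustrative sketch).

    The paper's modified standard deviation is not specified in the
    abstract; the plain std of the output probabilities across classes
    is used here as an assumed stand-in.
    """
    probs = softmax(logits)
    ce = -np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean()
    # Std across classes: 0 for a uniform output, large for a peaked one.
    sdi = probs.std(axis=-1).mean()
    # Subtracting sdi means minimizing the loss *maximizes* the deviation,
    # complementing the outer minimization as the abstract argues.
    return ce - lam * sdi
```

With this toy definition, a confident correct prediction yields a lower regularized loss than a uniform prediction, since it both lowers cross-entropy and raises the deviation term.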
Context
- Venue: Transactions on Machine Learning Research
- Archive span: 2022-2026
- Indexed papers: 3849
- Paper id: 900539014759675564