On the Impact of Hard Adversarial Instances on Overfitting in
Adversarial Training
- URL: http://arxiv.org/abs/2112.07324v1
- Date: Tue, 14 Dec 2021 12:19:24 GMT
- Title: On the Impact of Hard Adversarial Instances on Overfitting in
Adversarial Training
- Authors: Chen Liu, Zhichao Huang, Mathieu Salzmann, Tong Zhang, Sabine Süsstrunk
- Abstract summary: Adversarial training is a popular method to robustify models against adversarial attacks.
We investigate this phenomenon from the perspective of training instances.
We show that the decay in generalization performance of adversarial training is a result of the model's attempt to fit hard adversarial instances.
- Score: 72.95029777394186
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training is a popular method to robustify models against
adversarial attacks. However, it exhibits much more severe overfitting than
training on clean inputs. In this work, we investigate this phenomenon from the
perspective of training instances, i.e., training input-target pairs. Based on
a quantitative metric measuring instances' difficulty, we analyze the model's
behavior on training instances of different difficulty levels. This lets us
show that the decay in generalization performance of adversarial training is a
result of the model's attempt to fit hard adversarial instances. We
theoretically verify our observations for both linear and general nonlinear
models, proving that models trained on hard instances have worse generalization
performance than ones trained on easy instances. Furthermore, we prove that the
difference in the generalization gap between models trained by instances of
different difficulty levels increases with the size of the adversarial budget.
Finally, we conduct case studies on methods mitigating adversarial overfitting
in several scenarios. Our analysis shows that methods successfully mitigating
adversarial overfitting all avoid fitting hard adversarial instances, while
ones fitting hard adversarial instances do not achieve true robustness.
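The paper's difficulty metric is defined in the full text; as a minimal sketch of one natural instantiation, each training pair can be scored by its loss under a PGD attack, so that high-loss pairs count as hard adversarial instances (the `epsilon`, `alpha`, and `steps` defaults below are illustrative assumptions, not the paper's configuration):

```python
import torch
import torch.nn.functional as F

def pgd_adversarial_loss(model, x, y, epsilon=8/255, alpha=2/255, steps=10):
    """Score each instance by its cross-entropy loss after a PGD attack.

    Higher scores indicate harder adversarial instances.
    """
    delta = torch.zeros_like(x).uniform_(-epsilon, epsilon).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-epsilon, epsilon)
        # Project back so the perturbed input stays in the valid pixel range.
        delta = ((x + delta).clamp(0, 1) - x).detach().requires_grad_(True)
    with torch.no_grad():
        return F.cross_entropy(model(x + delta), y, reduction="none")
```

Sorting the training set by this score yields the kind of easy-to-hard split that the analysis above reasons about.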
Related papers
- Vulnerability-Aware Instance Reweighting For Adversarial Training [4.874780144224057]
Adversarial Training (AT) has been found to substantially improve the robustness of deep learning classifiers against adversarial attacks.
AT exerts an uneven influence on different classes in a training set and unfairly hurts examples corresponding to classes that are inherently harder to classify.
Various reweighting schemes have been proposed that assign unequal weights to robust losses of individual examples in a training set.
In this work, we propose a novel instance-wise reweighting scheme. It considers the vulnerability of each natural example and the resulting information loss on its adversarial counterpart occasioned by adversarial attacks.
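The authors' exact weighting function is given in their paper; the sketch below only illustrates the general shape of such a scheme, down-weighting each example's robust loss by a vulnerability estimate, here the KL divergence between natural and adversarial predictions (the exponential weighting and the normalization are assumptions):

```python
import torch
import torch.nn.functional as F

def reweighted_robust_loss(model, x, x_adv, y):
    """Instance-wise reweighted adversarial loss (illustrative form).

    Vulnerability is estimated as the KL divergence between the natural
    and adversarial predictive distributions; highly vulnerable examples
    are down-weighted.
    """
    logits_adv = model(x_adv)
    with torch.no_grad():
        logits_nat = model(x)
        kl = F.kl_div(F.log_softmax(logits_adv, dim=1),
                      F.softmax(logits_nat, dim=1),
                      reduction="none").sum(dim=1)
        weights = torch.exp(-kl)
        weights /= weights.mean()  # keep the overall loss scale unchanged
    per_example = F.cross_entropy(logits_adv, y, reduction="none")
    return (weights * per_example).mean()
```

Down-weighting the most vulnerable examples is consistent with the main paper's finding that fitting hard adversarial instances drives overfitting.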
arXiv Detail & Related papers (2023-07-14T05:31:32Z)
- A3T: Accuracy Aware Adversarial Training [22.42867682734154]
We identify one cause of overfitting related to current practices of generating adversarial samples from misclassified samples.
We show that our approach achieves better generalization while having comparable robustness to state-of-the-art adversarial training methods.
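The summary points at how adversarial samples are generated from misclassified samples; one plausible reading, sketched below under that assumption (A3T's actual treatment of misclassified samples may differ), is to perturb only the samples the model currently classifies correctly:

```python
import torch

def accuracy_aware_batch(model, x, y, attack):
    """Build a batch that is adversarial only where the model is already
    correct; misclassified samples are kept as natural inputs.

    `attack` is any callable mapping (model, x, y) to adversarial examples.
    """
    with torch.no_grad():
        correct = model(x).argmax(dim=1) == y
    x_train = x.clone()
    if correct.any():
        x_train[correct] = attack(model, x[correct], y[correct])
    return x_train
```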
arXiv Detail & Related papers (2022-11-29T15:56:43Z)
- The Enemy of My Enemy is My Friend: Exploring Inverse Adversaries for Improving Adversarial Training [72.39526433794707]
Adversarial training and its variants have been shown to be the most effective approaches to defend against adversarial examples.
We propose a novel adversarial training scheme that encourages the model to produce similar outputs for an adversarial example and its "inverse adversarial" counterpart.
Our training method achieves state-of-the-art robustness as well as natural accuracy.
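An inverse adversarial example can be read as the mirror image of an adversarial one: a perturbation that decreases rather than increases the loss. Below is a minimal single-step sketch of that construction together with a consistency term between the two counterparts (the one-step attack and the KL consistency loss are illustrative choices, not necessarily the paper's):

```python
import torch
import torch.nn.functional as F

def inverse_adversarial_example(model, x, y, epsilon=8/255):
    """One-step inverse adversary: step against the loss gradient (mirror of FGSM)."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    return (x - epsilon * grad.sign()).clamp(0, 1).detach()

def consistency_loss(model, x_adv, x_inv):
    """Encourage similar outputs on an adversarial example and its
    inverse adversarial counterpart."""
    target = F.softmax(model(x_inv), dim=1).detach()
    return F.kl_div(F.log_softmax(model(x_adv), dim=1), target,
                    reduction="batchmean")
```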
arXiv Detail & Related papers (2022-11-01T15:24:26Z)
- Latent Boundary-guided Adversarial Training [61.43040235982727]
Adversarial training has proven to be the most effective strategy that injects adversarial examples into model training.
We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining (LADDER).
arXiv Detail & Related papers (2022-06-08T07:40:55Z)
- Enhancing Adversarial Robustness for Deep Metric Learning [77.75152218980605]
The adversarial robustness of deep metric learning models needs to be improved.
To avoid model collapse due to excessively hard examples, existing defenses forgo min-max adversarial training.
We propose Hardness Manipulation to efficiently perturb the training triplet until it reaches a specified level of hardness for adversarial training.
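As a sketch of the idea, taking triplet hardness as d(anchor, positive) - d(anchor, negative), the triplet can be perturbed by signed-gradient steps within an L-infinity budget until the hardness reaches a target level (the hardness definition, the one-directional updates, and the budget below are assumptions, not the paper's exact formulation):

```python
import torch
import torch.nn.functional as F

def manipulate_hardness(model, anchor, pos, neg, target_h,
                        epsilon=8/255, alpha=2/255, max_steps=10):
    """Perturb a triplet until its hardness d(a,p) - d(a,n) reaches target_h."""
    x0 = torch.cat([anchor, pos, neg])
    x = x0.clone()
    n = anchor.size(0)
    for _ in range(max_steps):
        x.requires_grad_(True)
        emb = model(x)
        a, p, neg_emb = emb[:n], emb[n:2 * n], emb[2 * n:]
        hardness = F.pairwise_distance(a, p) - F.pairwise_distance(a, neg_emb)
        if (hardness >= target_h).all():
            break
        grad, = torch.autograd.grad(hardness.sum(), x)
        with torch.no_grad():
            x = x + alpha * grad.sign()                 # raise the hardness
            x = x0 + (x - x0).clamp(-epsilon, epsilon)  # stay within the budget
            x = x.clamp(0, 1)
    x = x.detach()
    return x[:n], x[n:2 * n], x[2 * n:]
```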
arXiv Detail & Related papers (2022-03-02T22:27:44Z)
- Calibrated Adversarial Training [8.608288231153304]
We present Calibrated Adversarial Training, a method that reduces the adverse effects of semantic perturbations in adversarial training.
The method produces pixel-level adaptations to the perturbations based on a novel calibrated robust error.
arXiv Detail & Related papers (2021-10-01T19:17:28Z)
- Multi-stage Optimization based Adversarial Training [16.295921205749934]
We propose a Multi-stage Optimization based Adversarial Training (MOAT) method that periodically trains the model on mixed benign, single-step adversarial, and multi-step adversarial examples.
Under a similar amount of training overhead, the proposed MOAT exhibits better robustness than either single-step or multi-step adversarial training methods.
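A minimal sketch of such staged training, assuming the stages cycle between benign, single-step, and multi-step examples with a fixed period (the period and the ordering are illustrative, not the paper's exact schedule):

```python
def moat_epoch(model, loader, optimizer, criterion, stage, fgsm, pgd):
    """One epoch of multi-stage adversarial training.

    stage 0: benign inputs; stage 1: single-step (FGSM) examples;
    stage 2: multi-step (PGD) examples. `fgsm` and `pgd` are callables
    mapping (model, x, y) to adversarial examples.
    """
    for x, y in loader:
        if stage == 1:
            x = fgsm(model, x, y)
        elif stage == 2:
            x = pgd(model, x, y)
        optimizer.zero_grad()
        criterion(model(x), y).backward()
        optimizer.step()

# Usage sketch: cycle the stages periodically across epochs, e.g.
#   for epoch in range(num_epochs):
#       moat_epoch(model, loader, opt, loss_fn, stage=epoch % 3, fgsm=fgsm, pgd=pgd)
```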
arXiv Detail & Related papers (2021-06-26T07:59:52Z)
- Single-step Adversarial training with Dropout Scheduling [59.50324605982158]
We show that models trained using the single-step adversarial training method learn to prevent the generation of single-step adversaries.
Models trained using the proposed single-step adversarial training method are robust against both single-step and multi-step adversarial attacks.
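A sketch of the mechanism under one assumed schedule: dropout is kept high early in training and then decayed, which makes it harder for the model to latch onto degenerate single-step adversaries (the linear decay below is an assumption, not the paper's schedule):

```python
import torch.nn as nn

def set_dropout_probability(model, p):
    """Set every Dropout layer in the model to probability p."""
    for module in model.modules():
        if isinstance(module, nn.Dropout):
            module.p = p

def dropout_schedule(epoch, total_epochs, p_max=0.5):
    """Linearly decay dropout from p_max to 0 by mid-training (assumed schedule)."""
    return p_max * max(0.0, 1.0 - epoch / (0.5 * total_epochs))
```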
arXiv Detail & Related papers (2020-04-18T14:14:00Z)
- Regularizers for Single-step Adversarial Training [49.65499307547198]
We propose three types of regularizers that help to learn robust models using single-step adversarial training methods.
The regularizers mitigate the effect of gradient masking by harnessing properties that differentiate a robust model from a pseudo-robust one.
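As one concrete instance of such a regularizer (a GradAlign-style penalty, named here as an illustration rather than one of the paper's three), alignment between input gradients at nearby points can be rewarded; a gradient-masked pseudo-robust model has erratic local gradients, so the penalty targets exactly the property that separates it from a truly robust model:

```python
import torch
import torch.nn.functional as F

def gradient_alignment_penalty(model, x, y, epsilon=8/255):
    """Penalize misalignment between input gradients at x and at a random
    point in the epsilon-ball around x (GradAlign-style, illustrative)."""
    x1 = x.clone().requires_grad_(True)
    g1, = torch.autograd.grad(F.cross_entropy(model(x1), y), x1,
                              create_graph=True)
    x2 = (x + torch.empty_like(x).uniform_(-epsilon, epsilon)).requires_grad_(True)
    g2, = torch.autograd.grad(F.cross_entropy(model(x2), y), x2,
                              create_graph=True)
    cos = F.cosine_similarity(g1.flatten(1), g2.flatten(1), dim=1)
    return (1.0 - cos).mean()
```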
arXiv Detail & Related papers (2020-02-03T09:21:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.