On the Impact of Hard Adversarial Instances on Overfitting in
Adversarial Training
- URL: http://arxiv.org/abs/2112.07324v1
- Date: Tue, 14 Dec 2021 12:19:24 GMT
- Title: On the Impact of Hard Adversarial Instances on Overfitting in
Adversarial Training
- Authors: Chen Liu, Zhichao Huang, Mathieu Salzmann, Tong Zhang, Sabine Süsstrunk
- Abstract summary: Adversarial training is a popular method to robustify models against adversarial attacks.
We investigate this phenomenon from the perspective of training instances.
We show that the decay in generalization performance of adversarial training is a result of the model's attempt to fit hard adversarial instances.
- Score: 72.95029777394186
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training is a popular method to robustify models against
adversarial attacks. However, it exhibits much more severe overfitting than
training on clean inputs. In this work, we investigate this phenomenon from the
perspective of training instances, i.e., training input-target pairs. Based on
a quantitative metric measuring instances' difficulty, we analyze the model's
behavior on training instances of different difficulty levels. This lets us
show that the decay in generalization performance of adversarial training is a
result of the model's attempt to fit hard adversarial instances. We
theoretically verify our observations for both linear and general nonlinear
models, proving that models trained on hard instances have worse generalization
performance than ones trained on easy instances. Furthermore, we prove that the
difference in the generalization gap between models trained by instances of
different difficulty levels increases with the size of the adversarial budget.
Finally, we conduct case studies on methods mitigating adversarial overfitting
in several scenarios. Our analysis shows that methods successfully mitigating
adversarial overfitting all avoid fitting hard adversarial instances, while
ones fitting hard adversarial instances do not achieve true robustness.
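The paper's difficulty metric is defined in the full text; as a minimal sketch of one natural instantiation, each training pair can be scored by its loss under a PGD attack, so that high-loss pairs count as hard adversarial instances (the `epsilon`, `alpha`, and `steps` defaults below are illustrative assumptions, not the paper's configuration):

```python
import torch
import torch.nn.functional as F

def pgd_adversarial_loss(model, x, y, epsilon=8/255, alpha=2/255, steps=10):
    """Score each instance by its cross-entropy loss after a PGD attack.

    Higher scores indicate harder adversarial instances.
    """
    delta = torch.zeros_like(x).uniform_(-epsilon, epsilon).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-epsilon, epsilon)
        # Project back so the perturbed input stays in the valid pixel range.
        delta = ((x + delta).clamp(0, 1) - x).detach().requires_grad_(True)
    with torch.no_grad():
        return F.cross_entropy(model(x + delta), y, reduction="none")
```

Sorting the training set by this score yields the kind of easy-to-hard split that the analysis above reasons about.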
Related papers
- Vulnerability-Aware Instance Reweighting For Adversarial Training [4.874780144224057]
Adversarial Training (AT) has been found to substantially improve the robustness of deep learning classifiers against adversarial attacks.
AT exerts an uneven influence on different classes in a training set and unfairly hurts examples corresponding to classes that are inherently harder to classify.
Various reweighting schemes have been proposed that assign unequal weights to robust losses of individual examples in a training set.
In this work, we propose a novel instance-wise reweighting scheme. It considers the vulnerability of each natural example and the resulting information loss on its adversarial counterpart occasioned by adversarial attacks.
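The authors' exact weighting function is given in their paper; the sketch below only illustrates the general shape of such a scheme, down-weighting each example's robust loss by a vulnerability estimate, here the KL divergence between natural and adversarial predictions (the exponential weighting and the normalization are assumptions):

```python
import torch
import torch.nn.functional as F

def reweighted_robust_loss(model, x, x_adv, y):
    """Instance-wise reweighted adversarial loss (illustrative form).

    Vulnerability is estimated as the KL divergence between the natural
    and adversarial predictive distributions; highly vulnerable examples
    are down-weighted.
    """
    logits_adv = model(x_adv)
    with torch.no_grad():
        logits_nat = model(x)
        kl = F.kl_div(F.log_softmax(logits_adv, dim=1),
                      F.softmax(logits_nat, dim=1),
                      reduction="none").sum(dim=1)
        weights = torch.exp(-kl)
        weights /= weights.mean()  # keep the overall loss scale unchanged
    per_example = F.cross_entropy(logits_adv, y, reduction="none")
    return (weights * per_example).mean()
```

Down-weighting the most vulnerable examples is consistent with the main paper's finding that fitting hard adversarial instances drives overfitting.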
arXiv Detail & Related papers (2023-07-14T05:31:32Z)
- A3T: Accuracy Aware Adversarial Training [22.42867682734154]
We identify one cause of overfitting related to current practices of generating adversarial samples from misclassified samples.
We show that our approach achieves better generalization while having comparable robustness to state-of-the-art adversarial training methods.
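The summary points at how adversarial samples are generated from misclassified samples; one plausible reading, sketched below under that assumption (A3T's actual treatment of misclassified samples may differ), is to perturb only the samples the model currently classifies correctly:

```python
import torch

def accuracy_aware_batch(model, x, y, attack):
    """Build a batch that is adversarial only where the model is already
    correct; misclassified samples are kept as natural inputs.

    `attack` is any callable mapping (model, x, y) to adversarial examples.
    """
    with torch.no_grad():
        correct = model(x).argmax(dim=1) == y
    x_train = x.clone()
    if correct.any():
        x_train[correct] = attack(model, x[correct], y[correct])
    return x_train
```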
arXiv Detail & Related papers (2022-11-29T15:56:43Z)
- The Enemy of My Enemy is My Friend: Exploring Inverse Adversaries for Improving Adversarial Training [72.39526433794707]
Adversarial training and its variants have been shown to be the most effective approaches to defend against adversarial examples.
We propose a novel adversarial training scheme that encourages the model to produce similar outputs for an adversarial example and its "inverse adversarial" counterpart.
Our training method achieves state-of-the-art robustness as well as natural accuracy.
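An inverse adversarial example can be read as the mirror image of an adversarial one: a perturbation that decreases rather than increases the loss. Below is a minimal single-step sketch of that construction together with a consistency term between the two counterparts (the one-step attack and the KL consistency loss are illustrative choices, not necessarily the paper's):

```python
import torch
import torch.nn.functional as F

def inverse_adversarial_example(model, x, y, epsilon=8/255):
    """One-step inverse adversary: step against the loss gradient (mirror of FGSM)."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    return (x - epsilon * grad.sign()).clamp(0, 1).detach()

def consistency_loss(model, x_adv, x_inv):
    """Encourage similar outputs on an adversarial example and its
    inverse adversarial counterpart."""
    target = F.softmax(model(x_inv), dim=1).detach()
    return F.kl_div(F.log_softmax(model(x_adv), dim=1), target,
                    reduction="batchmean")
```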
arXiv Detail & Related papers (2022-11-01T15:24:26Z)
- Latent Boundary-guided Adversarial Training [61.43040235982727]
Adversarial training has proven to be the most effective strategy that injects adversarial examples into model training.
We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining (LADDER).
arXiv Detail & Related papers (2022-06-08T07:40:55Z)
- Enhancing Adversarial Robustness for Deep Metric Learning [77.75152218980605]
The adversarial robustness of deep metric learning models needs to be improved.
To avoid model collapse due to excessively hard examples, existing defenses forgo min-max adversarial training.
We propose Hardness Manipulation to efficiently perturb the training triplet until it reaches a specified level of hardness for adversarial training.
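As a sketch of the idea, taking triplet hardness as d(anchor, positive) - d(anchor, negative), the triplet can be perturbed by signed-gradient steps within an L-infinity budget until the hardness reaches a target level (the hardness definition, the one-directional updates, and the budget below are assumptions, not the paper's exact formulation):

```python
import torch
import torch.nn.functional as F

def manipulate_hardness(model, anchor, pos, neg, target_h,
                        epsilon=8/255, alpha=2/255, max_steps=10):
    """Perturb a triplet until its hardness d(a,p) - d(a,n) reaches target_h."""
    x0 = torch.cat([anchor, pos, neg])
    x = x0.clone()
    n = anchor.size(0)
    for _ in range(max_steps):
        x.requires_grad_(True)
        emb = model(x)
        a, p, neg_emb = emb[:n], emb[n:2 * n], emb[2 * n:]
        hardness = F.pairwise_distance(a, p) - F.pairwise_distance(a, neg_emb)
        if (hardness >= target_h).all():
            break
        grad, = torch.autograd.grad(hardness.sum(), x)
        with torch.no_grad():
            x = x + alpha * grad.sign()                 # raise the hardness
            x = x0 + (x - x0).clamp(-epsilon, epsilon)  # stay within the budget
            x = x.clamp(0, 1)
    x = x.detach()
    return x[:n], x[n:2 * n], x[2 * n:]
```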
arXiv Detail & Related papers (2022-03-02T22:27:44Z)
- Calibrated Adversarial Training [8.608288231153304]
We present Calibrated Adversarial Training, a method that reduces the adverse effects of semantic perturbations in adversarial training.
The method produces pixel-level adaptations to the perturbations based on a novel calibrated robust error.
arXiv Detail & Related papers (2021-10-01T19:17:28Z)
- Multi-stage Optimization based Adversarial Training [16.295921205749934]
We propose a Multi-stage Optimization based Adversarial Training (MOAT) method that periodically trains the model on mixed benign, single-step adversarial, and multi-step adversarial examples.
Under a similar amount of training overhead, the proposed MOAT exhibits better robustness than either single-step or multi-step adversarial training methods.
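A minimal sketch of such staged training, assuming the stages cycle between benign, single-step, and multi-step examples with a fixed period (the period and the ordering are illustrative, not the paper's exact schedule):

```python
def moat_epoch(model, loader, optimizer, criterion, stage, fgsm, pgd):
    """One epoch of multi-stage adversarial training.

    stage 0: benign inputs; stage 1: single-step (FGSM) examples;
    stage 2: multi-step (PGD) examples. `fgsm` and `pgd` are callables
    mapping (model, x, y) to adversarial examples.
    """
    for x, y in loader:
        if stage == 1:
            x = fgsm(model, x, y)
        elif stage == 2:
            x = pgd(model, x, y)
        optimizer.zero_grad()
        criterion(model(x), y).backward()
        optimizer.step()

# Usage sketch: cycle the stages periodically across epochs, e.g.
#   for epoch in range(num_epochs):
#       moat_epoch(model, loader, opt, loss_fn, stage=epoch % 3, fgsm=fgsm, pgd=pgd)
```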
arXiv Detail & Related papers (2021-06-26T07:59:52Z)
- Single-step Adversarial training with Dropout Scheduling [59.50324605982158]
We show that models trained using the single-step adversarial training method learn to prevent the generation of single-step adversaries.
Models trained using the proposed single-step adversarial training method are robust against both single-step and multi-step adversarial attacks.
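A sketch of the mechanism under one assumed schedule: dropout is kept high early in training and then decayed, which makes it harder for the model to latch onto degenerate single-step adversaries (the linear decay below is an assumption, not the paper's schedule):

```python
import torch.nn as nn

def set_dropout_probability(model, p):
    """Set every Dropout layer in the model to probability p."""
    for module in model.modules():
        if isinstance(module, nn.Dropout):
            module.p = p

def dropout_schedule(epoch, total_epochs, p_max=0.5):
    """Linearly decay dropout from p_max to 0 by mid-training (assumed schedule)."""
    return p_max * max(0.0, 1.0 - epoch / (0.5 * total_epochs))
```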
arXiv Detail & Related papers (2020-04-18T14:14:00Z)
- Regularizers for Single-step Adversarial Training [49.65499307547198]
We propose three types of regularizers that help to learn robust models using single-step adversarial training methods.
The regularizers mitigate the effect of gradient masking by harnessing properties that differentiate a robust model from a pseudo-robust one.
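As one concrete instance of such a regularizer (a GradAlign-style penalty, named here as an illustration rather than one of the paper's three), alignment between input gradients at nearby points can be rewarded; a gradient-masked pseudo-robust model has erratic local gradients, so the penalty targets exactly the property that separates it from a truly robust model:

```python
import torch
import torch.nn.functional as F

def gradient_alignment_penalty(model, x, y, epsilon=8/255):
    """Penalize misalignment between input gradients at x and at a random
    point in the epsilon-ball around x (GradAlign-style, illustrative)."""
    x1 = x.clone().requires_grad_(True)
    g1, = torch.autograd.grad(F.cross_entropy(model(x1), y), x1,
                              create_graph=True)
    x2 = (x + torch.empty_like(x).uniform_(-epsilon, epsilon)).requires_grad_(True)
    g2, = torch.autograd.grad(F.cross_entropy(model(x2), y), x2,
                              create_graph=True)
    cos = F.cosine_similarity(g1.flatten(1), g2.flatten(1), dim=1)
    return (1.0 - cos).mean()
```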
arXiv Detail & Related papers (2020-02-03T09:21:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.