Adversarial Weight Perturbation Helps Robust Generalization
- URL: http://arxiv.org/abs/2004.05884v2
- Date: Tue, 13 Oct 2020 13:46:09 GMT
- Title: Adversarial Weight Perturbation Helps Robust Generalization
- Authors: Dongxian Wu, Shu-tao Xia, Yisen Wang
- Abstract summary: Adversarial training is the most promising way to improve the robustness of deep neural networks against adversarial examples.
We investigate how the widely used weight loss landscape (loss change with respect to weight) behaves in adversarial training.
We propose a simple yet effective Adversarial Weight Perturbation (AWP) method to explicitly regularize the flatness of the weight loss landscape.
- Score: 65.68598525492666
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Research on improving the robustness of deep neural networks against
adversarial examples has grown rapidly in recent years. Among these efforts, adversarial
training is the most promising one, which flattens the input loss landscape
(loss change with respect to input) via training on adversarially perturbed
examples. However, how the widely used weight loss landscape (loss change with
respect to weight) behaves in adversarial training is rarely explored. In this
paper, we investigate the weight loss landscape from a new perspective, and
identify a clear correlation between the flatness of the weight loss landscape
and the robust generalization gap. Several well-recognized adversarial training
improvements, such as early stopping, designing new objective functions, or
leveraging unlabeled data, all implicitly flatten the weight loss landscape.
Based on these observations, we propose a simple yet effective Adversarial
Weight Perturbation (AWP) method to explicitly regularize the flatness of the weight loss
landscape, forming a double-perturbation mechanism in the adversarial training
framework that adversarially perturbs both inputs and weights. Extensive
experiments demonstrate that AWP indeed yields a flatter weight loss landscape
and can be easily incorporated into various existing adversarial training
methods to further boost their adversarial robustness.
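In code, the double-perturbation mechanism amounts to wrapping a standard PGD adversarial training step with an extra ascent step on the weights. Below is a minimal PyTorch sketch reconstructed from the abstract alone; the helper names (pgd_attack, awp_train_step), the single weight-ascent step, the layer-wise constraint ||v|| = gamma * ||w||, and the toy model are illustrative assumptions, not the authors' reference implementation.

```python
# Minimal sketch of AWP's double-perturbation training step (illustrative,
# not the authors' reference code; hyperparameters are assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F


def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Inner maximization over the input: standard L_inf PGD."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()


def awp_train_step(model, optimizer, x, y, gamma=5e-3):
    # 1) Adversarially perturb the inputs (flattens the input loss landscape).
    x_adv = pgd_attack(model, x, y)

    # 2) Adversarially perturb the weights: one ascent step per weight tensor,
    #    scaled layer-wise so that ||v|| = gamma * ||w||.
    model.zero_grad()
    F.cross_entropy(model(x_adv), y).backward()
    perturbation = {}
    with torch.no_grad():
        for name, p in model.named_parameters():
            if p.grad is not None and p.dim() > 1:  # perturb weights, skip biases
                v = gamma * p.norm() * p.grad / (p.grad.norm() + 1e-12)
                p.add_(v)
                perturbation[name] = v

    # 3) Outer minimization: descend the loss at the perturbed weights w + v.
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()

    # 4) Remove the weight perturbation so only the descent update remains.
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in perturbation:
                p.sub_(perturbation[name])
    return loss.item()


# Illustrative usage with a toy model and a dummy CIFAR-10-shaped batch.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
x, y = torch.rand(8, 3, 32, 32), torch.randint(0, 10, (8,))
print(awp_train_step(model, optimizer, x, y))
```

The design point worth noting is that the descent gradient is taken at the perturbed weights w + v and the perturbation is removed afterwards, so training effectively minimizes the worst-case loss over a small weight neighborhood, which is what flattens the weight loss landscape.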
Related papers
- Improving Fast Adversarial Training Paradigm: An Example Taxonomy Perspective [61.38753850236804]
Fast adversarial training (FAT) was proposed for efficient training and has become a hot research topic.
FAT suffers from catastrophic overfitting, which leads to a performance drop compared with multi-step adversarial training.
We present an example taxonomy for FAT, which identifies that catastrophic overfitting is caused by the imbalance between its inner and outer optimization.
arXiv Detail & Related papers (2024-07-22T03:56:27Z)
- Understanding and Combating Robust Overfitting via Input Loss Landscape Analysis and Regularization [5.1024659285813785]
Adversarial training is prone to overfitting, and the cause is far from clear.
We find that robust overfitting results from standard training, specifically the minimization of the clean loss.
We propose a new regularizer to smooth the loss landscape by penalizing the weighted logits variation along the adversarial direction.
arXiv Detail & Related papers (2022-12-09T16:55:30Z)
- Boundary Adversarial Examples Against Adversarial Overfitting [4.391102490444538]
Adversarial training approaches suffer from robust overfitting, where robust accuracy decreases when models are adversarially trained for too long.
Several approaches, including early stopping, temporal ensembling, and weight memorizations, have been proposed to mitigate the effect of robust overfitting.
In this paper, we investigate whether these mitigation approaches are complementary to each other in improving adversarial training performance.
arXiv Detail & Related papers (2022-11-25T13:16:53Z) - Robust Weight Perturbation for Adversarial Training [112.57295738113939]
Overfitting is widespread in the adversarially robust training of deep networks.
Adversarial weight perturbation helps reduce the robust generalization gap.
A criterion that regulates the weight perturbation is crucial for adversarial training.
arXiv Detail & Related papers (2022-05-30T03:07:14Z)
- Relating Adversarially Robust Generalization to Flat Minima [138.59125287276194]
Adversarial training (AT) has become the de-facto standard to obtain models robust against adversarial examples.
We study the relationship between robust generalization and flatness of the robust loss landscape in weight space.
arXiv Detail & Related papers (2021-04-09T15:55:01Z)
- Adversarial Training Makes Weight Loss Landscape Sharper in Logistic Regression [45.34758512755516]
Adversarial training is actively studied for learning models robust against adversarial examples.
A recent study finds that adversarially trained models suffer degraded performance on adversarial examples when their weight loss landscape is sharp.
We theoretically analyze this phenomenon in this paper.
arXiv Detail & Related papers (2021-02-05T01:31:01Z)
- Overfitting in adversarially robust deep learning [86.11788847990783]
We show that overfitting to the training set severely harms robust performance in adversarially robust training.
We also show that effects such as the double descent curve do still occur in adversarially trained models, yet fail to explain the observed overfitting.
arXiv Detail & Related papers (2020-02-26T15:40:50Z)