Robust Weight Perturbation for Adversarial Training
- URL: http://arxiv.org/abs/2205.14826v1
- Date: Mon, 30 May 2022 03:07:14 GMT
- Title: Robust Weight Perturbation for Adversarial Training
- Authors: Chaojian Yu, Bo Han, Mingming Gong, Li Shen, Shiming Ge, Bo Du,
Tongliang Liu
- Abstract summary: Overfitting widely exists in adversarial robust training of deep networks.
Adversarial weight perturbation helps reduce the robust generalization gap.
A criterion that regulates the weight perturbation is crucial for adversarial training.
- Score: 112.57295738113939
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Overfitting widely exists in adversarial robust training of deep networks. An
effective remedy is adversarial weight perturbation, which injects the
worst-case weight perturbation during network training by maximizing the
classification loss on adversarial examples. Adversarial weight perturbation
helps reduce the robust generalization gap; however, it also undermines the
robustness improvement. A criterion that regulates the weight perturbation is
therefore crucial for adversarial training. In this paper, we propose such a
criterion, namely Loss Stationary Condition (LSC) for constrained perturbation.
With LSC, we find that it is essential to conduct weight perturbation on
adversarial data with small classification loss to eliminate robust
overfitting. Weight perturbation on adversarial data with large classification
loss is not necessary and may even lead to poor robustness. Based on these
observations, we propose a robust perturbation strategy to constrain the extent
of weight perturbation. The perturbation strategy prevents deep networks from
overfitting while avoiding the side effect of excessive weight perturbation,
significantly improving the robustness of adversarial training. Extensive
experiments demonstrate the superiority of the proposed method over the
state-of-the-art adversarial training methods.
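As a rough PyTorch-style sketch of this strategy (assuming hypothetical names and hyperparameters, here `gamma` for the perturbation size and `c_min` for the loss threshold; the paper's exact algorithm may differ), one training step could look like:

```python
import torch
import torch.nn.functional as F

def constrained_awp_step(model, optimizer, x_adv, y, gamma=0.01, c_min=1.5):
    # Hypothetical hyperparameters: gamma scales the weight perturbation,
    # c_min is the loss threshold implementing the constraint.
    params = [p for p in model.parameters() if p.requires_grad]

    # Select adversarial examples whose classification loss is still small.
    with torch.no_grad():
        per_example = F.cross_entropy(model(x_adv), y, reduction="none")
    mask = per_example <= c_min

    # Worst-case weight perturbation computed only on the selected subset:
    # one gradient-ascent step on the loss, scaled per layer by weight norm.
    deltas = []
    if mask.any():
        sel_loss = F.cross_entropy(model(x_adv[mask]), y[mask])
        grads = torch.autograd.grad(sel_loss, params)
        with torch.no_grad():
            for p, g in zip(params, grads):
                d = gamma * p.norm() * g / (g.norm() + 1e-12)
                p.add_(d)
                deltas.append(d)

    # Train on the full adversarial batch under the perturbed weights.
    optimizer.zero_grad()
    F.cross_entropy(model(x_adv), y).backward()

    # Remove the perturbation before the optimizer updates the weights.
    with torch.no_grad():
        for p, d in zip(params, deltas):
            p.sub_(d)
    optimizer.step()
```

The design choice mirrored here is that the inner maximization over weights only sees low-loss adversarial examples, so the perturbation cannot keep growing on examples that already have large loss.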
Related papers
- Fast Adversarial Training with Smooth Convergence [51.996943482875366]
We analyze the training process of prior Fast adversarial training (FAT) work and observe that catastrophic overfitting is accompanied by the appearance of loss convergence outliers.
To obtain a smooth loss convergence process, we propose a novel oscillatory constraint (dubbed ConvergeSmooth) to limit the loss difference between adjacent epochs.
Our proposed methods are attack-agnostic and thus can improve the training stability of various FAT techniques.
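A minimal sketch of such an epoch-level constraint, assuming a hinge-style penalty and a hypothetical `bound` on the epoch-to-epoch loss change (not the paper's exact formulation):

```python
import torch

def smooth_convergence_penalty(batch_loss, prev_epoch_loss, bound=0.1):
    # Penalize loss values that deviate from the previous epoch's average
    # loss by more than `bound` (hypothetical constraint strength).
    if prev_epoch_loss is None:  # first epoch: nothing to compare against
        return batch_loss.new_zeros(())
    excess = (batch_loss - prev_epoch_loss).abs() - bound
    return torch.clamp(excess, min=0.0)

# Hypothetical usage inside a FAT training loop:
#   total = adv_loss + smooth_convergence_penalty(adv_loss, prev_epoch_loss)
#   total.backward()
```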
arXiv Detail & Related papers (2023-08-24T15:28:52Z)
- Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
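One standard way to obtain such weights (our reading, not necessarily the paper's exact objective): maximizing sum_i w_i * loss_i - tau * KL(w || uniform) over the probability simplex has the closed-form solution w = softmax(loss / tau).

```python
import torch

def kl_regularized_instance_weights(per_example_loss, tau=1.0):
    # Closed form of the KL-regularized inner maximization:
    # argmax_{w in simplex} sum_i w_i * l_i - tau * KL(w || uniform)
    # gives w_i proportional to exp(l_i / tau).
    # tau is a hypothetical temperature, not a value from the paper.
    return torch.softmax(per_example_loss.detach() / tau, dim=0)
```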
arXiv Detail & Related papers (2023-08-01T06:16:18Z)
- Understanding and Combating Robust Overfitting via Input Loss Landscape Analysis and Regularization [5.1024659285813785]
Adversarial training is prone to overfitting, and the cause is far from clear.
We find that robust overfitting results from standard training, specifically the minimization of the clean loss.
We propose a new regularizer to smooth the loss landscape by penalizing the weighted logits variation along the adversarial direction.
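A minimal sketch of a regularizer in this spirit, assuming the adversarial direction is realized by a concrete adversarial example and that per-example weights are supplied by the caller; the paper's exact weighting may differ:

```python
import torch

def logit_variation_penalty(model, x, x_adv, weights=None):
    # Squared change of the logits between a clean input and its
    # adversarial counterpart, optionally weighted per example.
    variation = model(x_adv) - model(x)
    per_example = variation.pow(2).sum(dim=1)
    if weights is not None:  # hypothetical per-example weights
        per_example = per_example * weights
    return per_example.mean()
```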
arXiv Detail & Related papers (2022-12-09T16:55:30Z)
- Boundary Adversarial Examples Against Adversarial Overfitting [4.391102490444538]
Adversarial training approaches suffer from robust overfitting, where robust accuracy decreases when models are adversarially trained for too long.
Several approaches, including early stopping, temporal ensembling, and weight memorization, have been proposed to mitigate robust overfitting.
In this paper, we investigate whether these mitigation approaches are complementary to each other in improving adversarial training performance.
arXiv Detail & Related papers (2022-11-25T13:16:53Z)
- Strength-Adaptive Adversarial Training [103.28849734224235]
Adversarial training (AT) is proven to reliably improve a network's robustness against adversarial data.
Current AT with a pre-specified perturbation budget has limitations in learning a robust network.
We propose Strength-Adaptive Adversarial Training (SAAT) to overcome these limitations.
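A toy sketch of budget adaptation, assuming the attack strength is nudged toward a target adversarial loss; the update rule and constants are our assumptions, not SAAT's schedule:

```python
def adapt_attack_budget(eps, adv_loss, target_loss=1.0,
                        step=1 / 255, eps_max=16 / 255):
    # Grow the budget while the attack is too weak (low loss),
    # shrink it when it overshoots the target training pressure.
    eps = eps + step if adv_loss < target_loss else eps - step
    return min(max(eps, 0.0), eps_max)
```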
arXiv Detail & Related papers (2022-10-04T00:22:37Z)
- Non-Singular Adversarial Robustness of Neural Networks [58.731070632586594]
Adversarial robustness has become an emerging challenge for neural networks owing to their over-sensitivity to small input perturbations.
We formalize the notion of non-singular adversarial robustness for neural networks through the lens of joint perturbations to data inputs as well as model weights.
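A one-step sketch of this joint view, evaluating the loss under simultaneous input and weight perturbations; the FGSM-style input step and the budgets `eps_x` and `gamma_w` are our simplifications:

```python
import torch
import torch.nn.functional as F

def joint_perturbation_loss(model, x, y, eps_x=8 / 255, gamma_w=0.005):
    params = [p for p in model.parameters() if p.requires_grad]
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grads = torch.autograd.grad(loss, [x] + params)

    # One ascent step on the input (FGSM-like) ...
    x_adv = (x + eps_x * grads[0].sign()).clamp(0.0, 1.0).detach()

    # ... and one norm-scaled ascent step on each weight tensor.
    deltas = []
    with torch.no_grad():
        for p, g in zip(params, grads[1:]):
            d = gamma_w * p.norm() * g / (g.norm() + 1e-12)
            p.add_(d)
            deltas.append(d)
        joint = F.cross_entropy(model(x_adv), y)
        for p, d in zip(params, deltas):  # undo the weight perturbation
            p.sub_(d)
    return joint.item()
```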
arXiv Detail & Related papers (2021-02-23T20:59:30Z)
- Adversarial Weight Perturbation Helps Robust Generalization [65.68598525492666]
Adversarial training is the most promising way to improve the robustness of deep neural networks against adversarial examples.
We study how the weight loss landscape (the change in loss with respect to the weights) behaves in adversarial training.
We propose a simple yet effective Adversarial Weight Perturbation (AWP) to explicitly regularize the flatness of weight loss landscape.
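A small probe of that flatness under simplifying assumptions (a single norm-scaled ascent step, as in one-step AWP): the gap between the loss at perturbed and unperturbed weights.

```python
import torch
import torch.nn.functional as F

def weight_landscape_gap(model, x_adv, y, gamma=0.01):
    # How much the adversarial loss rises under a one-step worst-case
    # weight perturbation; AWP trains under such perturbed weights so
    # that this gap (a flatness proxy) stays small.
    params = [p for p in model.parameters() if p.requires_grad]
    base = F.cross_entropy(model(x_adv), y)
    grads = torch.autograd.grad(base, params)
    with torch.no_grad():
        deltas = [gamma * p.norm() * g / (g.norm() + 1e-12)
                  for p, g in zip(params, grads)]
        for p, d in zip(params, deltas):
            p.add_(d)
        perturbed = F.cross_entropy(model(x_adv), y)
        for p, d in zip(params, deltas):
            p.sub_(d)
    return (perturbed - base).item()
```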
arXiv Detail & Related papers (2020-04-13T12:05:01Z)