Uncovering the Limits of Adversarial Training against Norm-Bounded
Adversarial Examples
- URL: http://arxiv.org/abs/2010.03593v3
- Date: Tue, 30 Mar 2021 08:08:12 GMT
- Title: Uncovering the Limits of Adversarial Training against Norm-Bounded
Adversarial Examples
- Authors: Sven Gowal, Chongli Qin, Jonathan Uesato, Timothy Mann, Pushmeet Kohli
- Abstract summary: We study the effect of different training losses, model sizes, activation functions, the addition of unlabeled data (through pseudo-labeling) and other factors on adversarial robustness.
We discover that it is possible to train robust models that go well beyond state-of-the-art results by combining larger models, Swish/SiLU activations and model weight averaging.
- Score: 47.27255244183513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training and its variants have become de facto standards for
learning robust deep neural networks. In this paper, we explore the landscape
around adversarial training in a bid to uncover its limits. We systematically
study the effect of different training losses, model sizes, activation
functions, the addition of unlabeled data (through pseudo-labeling) and other
factors on adversarial robustness. We discover that it is possible to train
robust models that go well beyond state-of-the-art results by combining larger
models, Swish/SiLU activations and model weight averaging. We demonstrate large
improvements on CIFAR-10 and CIFAR-100 against $\ell_\infty$ and $\ell_2$
norm-bounded perturbations of size $8/255$ and $128/255$, respectively. In the
setting with additional unlabeled data, we obtain an accuracy under attack of
65.88% against $\ell_\infty$ perturbations of size $8/255$ on CIFAR-10 (+6.35%
with respect to prior art). Without additional data, we obtain an accuracy
under attack of 57.20% (+3.46%). To test the generality of our findings and
without any additional modifications, we obtain an accuracy under attack of
80.53% (+7.62%) against $\ell_2$ perturbations of size $128/255$ on CIFAR-10,
and of 36.88% (+8.46%) against $\ell_\infty$ perturbations of size $8/255$ on
CIFAR-100. All models are available at
https://github.com/deepmind/deepmind-research/tree/master/adversarial_robustness.
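To make the ingredients named above concrete, the sketch below (PyTorch) illustrates the three components the abstract combines: an $\ell_\infty$ norm-bounded PGD attack with size 8/255, a network using Swish/SiLU activations, and model weight averaging via an exponential moving average of parameters. This is a minimal illustration, not the released DeepMind code: names such as pgd_linf, WeightAverage and train_step are placeholders, the tiny network stands in for the much larger WideResNets used in the paper, and a plain cross-entropy adversarial loss replaces the training losses compared there (e.g. TRADES).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def pgd_linf(model, x, y, eps=8/255, step=2/255, iters=10):
    """Generate L-infinity norm-bounded adversarial examples with PGD."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(iters):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + step * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return (x + delta).clamp(0.0, 1.0).detach()

class WeightAverage:
    """Exponential moving average of model parameters (model weight averaging)."""
    def __init__(self, model, decay=0.995):
        self.decay = decay
        self.shadow = {k: v.detach().clone() for k, v in model.state_dict().items()}

    @torch.no_grad()
    def update(self, model):
        for k, v in model.state_dict().items():
            if v.dtype.is_floating_point:
                self.shadow[k].mul_(self.decay).add_(v.detach(), alpha=1 - self.decay)
            else:
                self.shadow[k].copy_(v)

# Toy CIFAR-style network with SiLU (Swish) activations; illustrative only.
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.SiLU(),
    nn.Conv2d(32, 64, 3, padding=1, stride=2), nn.SiLU(),
    nn.Flatten(), nn.Linear(64 * 16 * 16, 10),
)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
ema = WeightAverage(model)

def train_step(x, y):
    x_adv = pgd_linf(model, x, y)             # attack the current model
    loss = F.cross_entropy(model(x_adv), y)   # simple adversarial-training loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    ema.update(model)                         # keep the running weight average
    return loss.item()
```

At evaluation time one would load ema.shadow into a copy of the model (e.g. via load_state_dict) so that the averaged weights, rather than the final iterate, are attacked and scored.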
Related papers
- Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies [19.100334346597982]
We analyze how model size, dataset size, and synthetic data quality affect robustness by developing the first scaling laws for adversarial training.
Our scaling laws reveal inefficiencies in prior art and provide actionable feedback to advance the field.
arXiv Detail & Related papers (2024-04-14T20:14:38Z) - Better Diffusion Models Further Improve Adversarial Training [97.44991845907708]
It has been recognized that data generated by the denoising diffusion probabilistic model (DDPM) improves adversarial training.
This paper shows that further gains are possible by employing a more recent diffusion model with higher sampling efficiency.
Our adversarially trained models achieve state-of-the-art performance on RobustBench using only generated data.
arXiv Detail & Related papers (2023-02-09T13:46:42Z) - Removing Batch Normalization Boosts Adversarial Training [83.08844497295148]
Adversarial training (AT) defends deep neural networks against adversarial attacks.
A major bottleneck is the widely used batch normalization (BN), which struggles to model the different statistics of clean and adversarial training samples in AT.
Our normalizer-free robust training (NoFrost) method extends recent advances in normalizer-free networks to AT.
arXiv Detail & Related papers (2022-07-04T01:39:37Z) - Data Augmentation Can Improve Robustness [21.485435979018256]
Adversarial training suffers from robust overfitting, a phenomenon where the robust test accuracy starts to decrease during training.
We demonstrate that, when combined with model weight averaging, data augmentation can significantly boost robust accuracy.
In particular, against $\ell_\infty$ norm-bounded perturbations of size $\epsilon = 8/255$, our model reaches 60.07% robust accuracy without using any external data.
arXiv Detail & Related papers (2021-11-09T18:57:00Z) - Improving Robustness using Generated Data [20.873767830152605]
Generative models trained solely on the original training set can be leveraged to artificially increase its size.
We show large absolute improvements in robust accuracy compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-10-18T17:00:26Z) - Robustifying $\ell_\infty$ Adversarial Training to the Union of
Perturbation Models [120.71277007016708]
We extend the capabilities of widely popular single-attack $\ell_\infty$ AT frameworks.
Our technique, referred to as Shaped Noise Augmented Processing (SNAP), exploits a well-established byproduct of single-attack AT frameworks.
SNAP prepends a given deep net with a shaped noise augmentation layer whose distribution is learned along with network parameters using any standard single-attack AT.
arXiv Detail & Related papers (2021-05-31T05:18:42Z) - Adversarial robustness against multiple $l_p$-threat models at the price
of one and how to quickly fine-tune robust models to another threat model [79.05253587566197]
Adversarial training (AT) to achieve adversarial robustness with respect to a single $l_p$-threat model has been discussed extensively.
In this paper we develop a simple and efficient training scheme to achieve adversarial robustness against the union of $l_p$-threat models.
arXiv Detail & Related papers (2021-05-26T12:20:47Z) - Fixing Data Augmentation to Improve Adversarial Robustness [21.485435979018256]
Adversarial training suffers from robust overfitting, a phenomenon where the robust test accuracy starts to decrease during training.
In this paper, we focus on both heuristics-driven and data-driven augmentations as a means to reduce robust overfitting.
We show large absolute improvements of +7.06% and +5.88% in robust accuracy compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-02T18:58:33Z) - To be Robust or to be Fair: Towards Fairness in Adversarial Training [83.42241071662897]
We find that adversarial training algorithms tend to introduce severe disparity of accuracy and robustness between different groups of data.
We propose a Fair-Robust-Learning (FRL) framework to mitigate this unfairness problem when doing adversarial defenses.
arXiv Detail & Related papers (2020-10-13T02:21:54Z)