Removing Batch Normalization Boosts Adversarial Training
- URL: http://arxiv.org/abs/2207.01156v1
- Date: Mon, 4 Jul 2022 01:39:37 GMT
- Title: Removing Batch Normalization Boosts Adversarial Training
- Authors: Haotao Wang, Aston Zhang, Shuai Zheng, Xingjian Shi, Mu Li, Zhangyang Wang
- Abstract summary: Adversarial training (AT) defends deep neural networks against adversarial attacks.
A major bottleneck is the widely used batch normalization (BN), which struggles to model the different statistics of clean and adversarial training samples in AT.
Our normalizer-free robust training (NoFrost) method extends recent advances in normalizer-free networks to AT.
- Score: 83.08844497295148
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training (AT) defends deep neural networks against adversarial
attacks. One challenge that limits its practical application is the performance
degradation on clean samples. A major bottleneck identified by previous works
is the widely used batch normalization (BN), which struggles to model the
different statistics of clean and adversarial training samples in AT. Although
the dominant approach is to extend BN to capture this mixture of distribution,
we propose to completely eliminate this bottleneck by removing all BN layers in
AT. Our normalizer-free robust training (NoFrost) method extends recent
advances in normalizer-free networks to AT for its unexplored advantage on
handling the mixture distribution challenge. We show that NoFrost achieves
adversarial robustness with only a minor sacrifice on clean sample accuracy. On
ImageNet with ResNet50, NoFrost achieves $74.06\%$ clean accuracy, which drops
merely $2.00\%$ from standard training. In contrast, BN-based AT obtains
$59.28\%$ clean accuracy, suffering a significant $16.78\%$ drop from standard
training. In addition, NoFrost achieves $23.56\%$ adversarial robustness
against PGD attacks, improving upon the $13.57\%$ robustness of BN-based AT. We
observe better model smoothness and larger decision margins from NoFrost, which
make the models less sensitive to input perturbations and thus more robust.
Moreover, when more data augmentations are incorporated, NoFrost achieves
comprehensive robustness against multiple distribution shifts. Code and
pre-trained models are publicly available at
https://github.com/amazon-research/normalizer-free-robust-training.
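
As a rough illustration of the approach described above, the sketch below pairs a BN-free convolution (scaled weight standardization, in the spirit of normalizer-free networks) with a standard PGD adversarial training step. This is a minimal sketch under stated assumptions, not the released NoFrost implementation; `WSConv2d`, `pgd_attack`, and the hyperparameters are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WSConv2d(nn.Conv2d):
    """Conv with scaled weight standardization, a common BN substitute in
    normalizer-free networks (illustrative, not NoFrost's exact block)."""
    def forward(self, x):
        w = self.weight
        mean = w.mean(dim=(1, 2, 3), keepdim=True)
        var = w.var(dim=(1, 2, 3), keepdim=True)
        fan_in = w[0].numel()
        w = (w - mean) * (var * fan_in + 1e-4).rsqrt()  # standardize + fan-in scaling
        return F.conv2d(x, w, self.bias, self.stride, self.padding,
                        self.dilation, self.groups)

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Standard L-inf PGD with random start, used to craft training inputs."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()

# One adversarial training step on a BN-free model:
#   x_adv = pgd_attack(model, x, y)
#   F.cross_entropy(model(x_adv), y).backward(); optimizer.step()
```

Because no layer keeps batch statistics, clean and adversarial samples can share the same forward path, sidestepping the mixed-statistics problem the abstract describes.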
Related papers
- Batch-in-Batch: a new adversarial training framework for initial perturbation and sample selection [9.241737058291823]
Adversarial training methods generate independent initial perturbations for adversarial samples from a simple uniform distribution.
We propose a simple yet effective training framework called Batch-in-Batch (BB) to enhance model robustness.
We show that models trained within the BB framework consistently have higher adversarial accuracy across various adversarial settings.
arXiv Detail & Related papers (2024-06-06T13:34:43Z)
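
One plausible reading of the Batch-in-Batch summary above, sketched loosely: draw several uniform initializations per sample and keep the hardest one. The `batch_in_batch_init` helper and its selection rule are assumptions for illustration, not the paper's actual algorithm.

```python
import torch
import torch.nn.functional as F

def batch_in_batch_init(model, x, y, m=4, eps=8/255):
    """Hypothetical sketch: draw m uniform initial perturbations per sample
    and keep the one with the highest loss (a 'batch of batches')."""
    b = x.size(0)
    x_rep = x.repeat(m, 1, 1, 1)             # (m*b, C, H, W), m stacked copies
    y_rep = y.repeat(m)
    delta = torch.empty_like(x_rep).uniform_(-eps, eps)
    with torch.no_grad():
        loss = F.cross_entropy(model(x_rep + delta), y_rep, reduction="none")
    pick = loss.view(m, b).argmax(dim=0)      # hardest init per sample
    idx = pick * b + torch.arange(b, device=x.device)
    return delta[idx]                         # (b, C, H, W) initial perturbation
```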
- Adaptive Batch Normalization Networks for Adversarial Robustness [33.14617293166724]
Adversarial Training (AT) has been a standard foundation of modern adversarial defense approaches.
We propose adaptive Batch Normalization Network (ABNN), inspired by the recent advances in test-time domain adaptation.
ABNN consistently improves adversarial robustness against both digital and physically realizable attacks.
arXiv Detail & Related papers (2024-05-20T00:58:53Z)
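
A hedged sketch of the test-time adaptation idea behind the ABNN entry above: re-estimate BN statistics on the incoming (possibly attacked) data before evaluation. The `adapt_bn` helper is illustrative; the actual ABNN architecture may differ.

```python
import torch
import torch.nn as nn

def adapt_bn(model, loader, device):
    """Re-estimate BN running statistics on (possibly adversarial) test data
    before evaluation; illustrative, not the ABNN architecture itself."""
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.reset_running_stats()
            m.momentum = None      # None => cumulative moving average
            m.train()              # forward passes now update the stats
    with torch.no_grad():
        for x, _ in loader:
            model(x.to(device))    # refresh statistics, discard outputs
    model.eval()
    return model
```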
- RAMP: Boosting Adversarial Robustness Against Multiple $l_p$ Perturbations for Universal Robustness [4.188296977882316]
We propose a novel training framework, RAMP, to boost robustness against multiple $l_p$ perturbations.
For training from scratch, RAMP achieves a union accuracy of $44.6\%$ and a good clean accuracy of $81.2\%$ with ResNet-18 against AutoAttack on CIFAR-10.
arXiv Detail & Related papers (2024-02-09T23:29:54Z)
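
The "union accuracy" metric the RAMP entry reports can be made concrete with a short evaluation helper: a sample counts as robust only if it survives every attack in the union. The `union_accuracy` function and its `attacks` interface are illustrative assumptions.

```python
import torch

def union_accuracy(model, x, y, attacks):
    """Union robust accuracy: a sample is correct only if the model resists
    every attack in `attacks` (e.g. l_inf, l_2, l_1 PGD). Each attack is a
    callable (model, x, y) -> x_adv; this interface is illustrative."""
    correct = torch.ones_like(y, dtype=torch.bool)
    for attack in attacks:
        x_adv = attack(model, x, y)
        with torch.no_grad():
            pred = model(x_adv).argmax(dim=1)
        correct &= pred.eq(y)      # must survive every attack
    return correct.float().mean().item()
```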
- ACT-Diffusion: Efficient Adversarial Consistency Training for One-step Diffusion Models [59.90959789767886]
We show that optimizing the consistency training loss minimizes the Wasserstein distance between the target and generated distributions.
By incorporating a discriminator into the consistency training framework, our method achieves improved FID scores on the CIFAR10, ImageNet $64\times64$, and LSUN Cat $256\times256$ datasets.
arXiv Detail & Related papers (2023-11-23T16:49:06Z)
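
A loose sketch of the discriminator idea in the ACT-Diffusion entry above: add a non-saturating GAN term on top of a consistency loss. The `act_step` helper, the loss weighting, and the interfaces are assumptions; the paper's actual objective may differ.

```python
import torch
import torch.nn.functional as F

def act_step(consistency_loss, discriminator, x_real, x_gen, lam=0.1):
    """Loose sketch of combining a consistency loss with an adversarial (GAN)
    term; lam and the loss form are illustrative assumptions."""
    # Generator side: fool the discriminator while matching the consistency target.
    g_adv = F.softplus(-discriminator(x_gen)).mean()   # non-saturating GAN loss
    g_loss = consistency_loss + lam * g_adv
    # Discriminator side: separate real samples from one-step generations.
    d_loss = F.softplus(-discriminator(x_real)).mean() + \
             F.softplus(discriminator(x_gen.detach())).mean()
    return g_loss, d_loss
```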
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach, known as adversarial training (AT), has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
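
A minimal sketch of one large-batch distributed AT step, assuming PyTorch DistributedDataParallel as the scaling mechanism (the paper's framework may differ): each worker attacks its own data shard, and DDP averages gradients during backward.

```python
import torch
import torch.nn.functional as F
from torch.nn.parallel import DistributedDataParallel as DDP

def distributed_at_step(model, x, y, optimizer, attack):
    """One adversarial training step per worker under DDP; launch and
    process-group setup are omitted. `attack` is any (model, x, y) -> x_adv."""
    x_adv = attack(model, x, y)          # per-worker adversarial examples
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()                      # DDP all-reduces gradients here
    optimizer.step()
    return loss.item()

# model = DDP(base_model.to(rank), device_ids=[rank])  # after init_process_group
```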
- Sparsity Winning Twice: Better Robust Generalization from More Efficient Training [94.92954973680914]
We introduce two alternatives for sparse adversarial training: (i) static sparsity and (ii) dynamic sparsity.
We find that both methods yield a win-win: they substantially shrink the robust generalization gap and alleviate robust overfitting.
Our approaches can be combined with existing regularizers, establishing new state-of-the-art results in adversarial training.
arXiv Detail & Related papers (2022-02-20T15:52:08Z)
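
The static-sparsity alternative above can be sketched as one-shot magnitude pruning whose mask stays fixed throughout adversarial training; `apply_static_sparsity` is an illustrative helper, not the paper's exact procedure, and the dynamic variant would periodically re-grow and re-prune connections instead.

```python
import torch

def apply_static_sparsity(model, sparsity=0.9):
    """Illustrative static sparsity: prune the smallest-magnitude weights once
    and keep the resulting mask fixed during adversarial training."""
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() < 2:                   # skip biases and scalar parameters
            continue
        k = max(1, int(p.numel() * sparsity))
        threshold = p.detach().abs().flatten().kthvalue(k).values
        masks[name] = (p.detach().abs() > threshold).float()
        p.data.mul_(masks[name])          # zero out pruned weights
    return masks                          # re-apply after each optimizer step
```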
- "BNN - BN = ?": Training Binary Neural Networks without Batch Normalization [92.23297927690149]
Batch normalization (BN) is a key facilitator and considered essential for state-of-the-art binary neural networks (BNNs).
We extend the normalizer-free framework to training BNNs, and for the first time demonstrate that BN can be completely removed from BNN training and inference.
arXiv Detail & Related papers (2021-04-16T16:46:57Z)
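
As a rough sketch of what BN-free binary networks involve: a sign-binarized convolution with a straight-through estimator and a learnable per-channel scale standing in for BN's affine rescaling. Both modules below are illustrative assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SignSTE(torch.autograd.Function):
    """Binarize weights with a straight-through estimator in the backward pass."""
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return w.sign()
    @staticmethod
    def backward(ctx, g):
        w, = ctx.saved_tensors
        return g * (w.abs() <= 1).float()   # pass gradients only inside [-1, 1]

class BinaryConvNoBN(nn.Conv2d):
    """Illustrative binary conv with a learnable per-channel scale replacing
    BN's affine rescaling; not the paper's exact block."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.scale = nn.Parameter(torch.ones(self.out_channels, 1, 1))
    def forward(self, x):
        w_bin = SignSTE.apply(self.weight)
        out = F.conv2d(x, w_bin, self.bias, self.stride, self.padding,
                       self.dilation, self.groups)
        return out * self.scale
```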
- Fast is better than free: Revisiting adversarial training [86.11788847990783]
We show that it is possible to train empirically robust models using a much weaker and cheaper adversary.
We identify a failure mode referred to as "catastrophic overfitting" which may have caused previous attempts to use FGSM adversarial training to fail.
arXiv Detail & Related papers (2020-01-12T20:30:22Z)
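
The cheap adversary in the "Fast is better than free" entry is single-step FGSM with a random start, where the random initialization helps avoid the catastrophic overfitting failure mode the paper identifies. A minimal sketch of one such training step follows; the step sizes match the paper's commonly reported eps=8/255 regime, but treat the exact values as assumptions.

```python
import torch
import torch.nn.functional as F

def fgsm_rs_step(model, x, y, optimizer, eps=8/255, alpha=10/255):
    """One FGSM adversarial training step with random start (FGSM-RS)."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    loss = F.cross_entropy(model(x + delta), y)
    grad, = torch.autograd.grad(loss, delta)
    delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach()
    optimizer.zero_grad()
    adv_loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```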
This list is automatically generated from the titles and abstracts of the papers on this site.