Using Single-Step Adversarial Training to Defend Iterative Adversarial Examples
- URL: http://arxiv.org/abs/2002.09632v2
- Date: Thu, 27 Feb 2020 17:24:24 GMT
- Title: Using Single-Step Adversarial Training to Defend Iterative Adversarial Examples
- Authors: Guanxiong Liu, Issa Khalil, Abdallah Khreishah
- Abstract summary: We propose a novel single-step adversarial training method which can defend against both single-step and iterative adversarial examples.
Compared with ATDA on CIFAR10, our proposed method achieves a 35.67% improvement in test accuracy and a 19.14% reduction in training time.
- Score: 6.609200722223488
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial examples have become one of the largest challenges that machine learning models, especially neural network classifiers, face. These adversarial examples break the attack-free assumption and fool state-of-the-art (SOTA) classifiers with perturbations that are imperceptible to humans. So far, researchers have made great progress in utilizing adversarial training as a defense. However, its overwhelming computational cost degrades its applicability, and little has been done to overcome this issue. Single-step adversarial training methods have been proposed as computationally viable solutions; however, they still fail to defend against iterative adversarial examples. In this work, we first experimentally analyze several different SOTA defense methods against adversarial examples. Then, based on observations from these experiments, we propose a novel single-step adversarial training method that can defend against both single-step and iterative adversarial examples. Lastly, through extensive evaluations, we demonstrate that our proposed method outperforms SOTA single-step and iterative adversarial training defenses. Compared with ATDA (a single-step method) on the CIFAR10 dataset, our proposed method achieves a 35.67% improvement in test accuracy and a 19.14% reduction in training time. When compared with methods that use BIM or Madry examples (iterative methods) on the CIFAR10 dataset, it saves up to 76.03% in training time with less than 3.78% degradation in test accuracy.
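The gap between single-step and iterative adversarial examples is the crux of the abstract above. As a minimal illustrative sketch (not the paper's proposed method), the following PyTorch code contrasts FGSM-style single-step generation with BIM/PGD-style iterative generation under an ℓ∞ budget; model, loss_fn, eps, alpha, and steps are assumed placeholders.

```python
import torch

def fgsm_example(model, loss_fn, x, y, eps):
    # Single-step: one extra forward/backward pass, one signed gradient step.
    x_adv = x.clone().detach().requires_grad_(True)
    loss_fn(model(x_adv), y).backward()
    return (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()

def bim_example(model, loss_fn, x, y, eps, alpha, steps):
    # Iterative (BIM/PGD-style): repeated small steps, each projected back
    # into the ell_inf ball of radius eps around the clean input.
    x_clean = x.clone().detach()
    x_adv = x_clean.clone()
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss_fn(model(x_adv), y).backward()
        with torch.no_grad():
            x_adv = x_adv + alpha * x_adv.grad.sign()
            x_adv = x_clean + (x_adv - x_clean).clamp(-eps, eps)
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```

The cost gap the abstract describes follows directly: adversarial training on FGSM examples pays one extra forward/backward pass per batch, while training on BIM or Madry (PGD) examples pays `steps` of them.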
Related papers
- Purify Unlearnable Examples via Rate-Constrained Variational Autoencoders [101.42201747763178]
Unlearnable examples (UEs) seek to maximize testing error by making subtle modifications to training examples that are correctly labeled.
Our work provides a novel disentanglement mechanism to build an efficient pre-training purification method.
arXiv Detail & Related papers (2024-05-02T16:49:25Z)
- Fast Propagation is Better: Accelerating Single-Step Adversarial Training via Sampling Subnetworks [69.54774045493227]
A drawback of adversarial training is the computational overhead introduced by the generation of adversarial examples.
We propose to exploit the interior building blocks of the model to improve efficiency.
Compared with previous methods, our method not only reduces the training cost but also achieves better model robustness.
arXiv Detail & Related papers (2023-10-24T01:36:20Z)
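As an illustrative reading of the subnetwork-sampling idea above (a sketch under assumptions, not the paper's actual algorithm), the attack's forward and backward passes could be routed through a randomly thinned residual stack, so each adversarial example costs only a fraction of a full propagation; SkippableBlocks and keep_prob are hypothetical names.

```python
import torch
import torch.nn as nn

class SkippableBlocks(nn.Module):
    # Toy residual stack whose blocks can be randomly skipped, making
    # the forward/backward passes used for attack generation cheaper.
    def __init__(self, blocks: nn.ModuleList):
        super().__init__()
        self.blocks = blocks

    def forward(self, x, keep_prob=1.0):
        for block in self.blocks:
            if self.training and float(torch.rand(())) > keep_prob:
                continue  # skip this residual block on the sampled path
            x = x + block(x)
        return x
```

The weight update would still use the full network; only the example-generation passes run on the sampled subnetwork.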
- A Data-Centric Approach for Improving Adversarial Training Through the Lens of Out-of-Distribution Detection [0.4893345190925178]
We propose detecting and removing hard samples directly from the training procedure rather than applying complicated algorithms to mitigate their effects.
Our results on the SVHN and CIFAR-10 datasets show the effectiveness of this method in improving adversarial training without adding much computational cost.
arXiv Detail & Related papers (2023-01-25T08:13:50Z)
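A minimal sketch of the hard-sample-removal idea above, assuming "hardness" is scored by per-sample clean loss (the paper's actual criterion is based on out-of-distribution detection and may differ); model, loss_fn, and drop_frac are assumed placeholders, and loss_fn must use reduction='none'.

```python
import torch

@torch.no_grad()
def prune_hard_samples(model, loss_fn, x, y, drop_frac=0.1):
    # Score each sample by its clean loss and drop the hardest fraction
    # before the expensive adversarial-example generation step.
    losses = loss_fn(model(x), y)            # per-sample losses, shape [N]
    n_keep = int(len(x) * (1.0 - drop_frac))
    keep = losses.argsort()[:n_keep]         # indices of the easiest samples
    return x[keep], y[keep]
```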
- The Enemy of My Enemy is My Friend: Exploring Inverse Adversaries for Improving Adversarial Training [72.39526433794707]
Adversarial training and its variants have been shown to be the most effective approaches to defend against adversarial examples.
We propose a novel adversarial training scheme that encourages the model to produce similar outputs for an adversarial example and its "inverse adversarial" counterpart.
Our training method achieves state-of-the-art robustness as well as natural accuracy.
arXiv Detail & Related papers (2022-11-01T15:24:26Z)
- Two Heads are Better than One: Robust Learning Meets Multi-branch Models [14.72099568017039]
We propose Branch Orthogonality adveRsarial Training (BORT) to obtain state-of-the-art performance with solely the original dataset for adversarial training.
We evaluate our approach on CIFAR-10, CIFAR-100, and SVHN against ℓ∞ norm-bounded perturbations of size ε = 8/255.
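For context on this threat model, here is a hedged sketch of how robust accuracy under an ℓ∞ attack with ε = 8/255 is typically measured; it is an illustrative harness, not BORT itself, and reuses the bim_example attack sketched after the abstract above.

```python
import torch

def robust_accuracy(model, loss_fn, loader, eps=8/255, alpha=2/255, steps=10):
    # Fraction of test points still classified correctly after an
    # iterative ell_inf attack of radius eps (here 8/255).
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        x_adv = bim_example(model, loss_fn, x, y, eps, alpha, steps)
        with torch.no_grad():
            correct += (model(x_adv).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total
```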
arXiv Detail & Related papers (2022-08-17T05:42:59Z)
- Efficient Adversarial Training With Data Pruning [26.842714298874192]
We show that data pruning leads to improvements in convergence and reliability of adversarial training.
In some settings, data pruning brings benefits from both worlds: it improves both adversarial accuracy and training time.
arXiv Detail & Related papers (2022-07-01T23:54:46Z)
- TREATED: Towards Universal Defense against Textual Adversarial Attacks [28.454310179377302]
We propose TREATED, a universal adversarial detection method that can defend against attacks of various perturbation levels without making any assumptions.
Extensive experiments on three competitive neural networks and two widely used datasets show that our method achieves better detection performance than baselines.
arXiv Detail & Related papers (2021-09-13T03:31:20Z)
- Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution [83.84968082791444]
Deep neural networks are vulnerable to intentionally crafted adversarial examples.
Various methods have been proposed to defend against adversarial word-substitution attacks for neural NLP models.
arXiv Detail & Related papers (2021-08-29T08:11:36Z)
- Bag of Tricks for Adversarial Training [50.53525358778331]
Adversarial training (AT) is one of the most effective strategies for promoting model robustness.
Recent benchmarks show that most of the proposed improvements on AT are less effective than simply early stopping the training procedure.
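A minimal sketch of the early-stopping baseline that the benchmark above finds so competitive, assuming robust validation accuracy is tracked once per epoch; train_epoch and evaluate_robust are hypothetical callables (the latter could be the robust_accuracy harness sketched earlier).

```python
def train_with_early_stopping(train_epoch, evaluate_robust,
                              max_epochs=200, patience=10):
    # Stop adversarial training once robust validation accuracy has not
    # improved for `patience` consecutive epochs; keep the best checkpoint.
    best_acc, best_state, stale = 0.0, None, 0
    for epoch in range(max_epochs):
        state = train_epoch(epoch)  # one epoch of adversarial training,
                                    # returning a model checkpoint
        acc = evaluate_robust()
        if acc > best_acc:
            best_acc, best_state, stale = acc, state, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_state, best_acc
```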
arXiv Detail & Related papers (2020-10-01T15:03:51Z)
- Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to the unreliable robustness against other unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
arXiv Detail & Related papers (2020-02-14T12:36:59Z)