Regularizers for Single-step Adversarial Training
- URL: http://arxiv.org/abs/2002.00614v1
- Date: Mon, 3 Feb 2020 09:21:04 GMT
- Title: Regularizers for Single-step Adversarial Training
- Authors: B.S. Vivek, R. Venkatesh Babu
- Abstract summary: We propose three types of regularizers that help to learn robust models using single-step adversarial training methods.
The regularizers mitigate the effect of gradient masking by harnessing properties that differentiate a robust model from a pseudo-robust model.
- Score: 49.65499307547198
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The progress in the last decade has enabled machine learning models to
achieve impressive performance across a wide range of tasks in Computer Vision.
However, a plethora of works have demonstrated the susceptibility of these
models to adversarial samples. The adversarial training procedure has been proposed
to defend against such adversarial attacks. Adversarial training methods
augment mini-batches with adversarial samples, and typically single-step
(non-iterative) methods are used for generating these adversarial samples.
However, models trained using single-step adversarial training converge to
degenerative minima where the model merely appears to be robust. The pseudo
robustness of these models is due to the gradient masking effect. Although
multi-step adversarial training helps to learn robust models, it is hard to
scale due to the use of iterative methods for generating adversarial samples.
To address these issues, we propose three different types of regularizers that
help to learn robust models using single-step adversarial training methods. The
proposed regularizers mitigate the effect of gradient masking by harnessing
properties that differentiate a robust model from a pseudo-robust model. The
performance of models trained using the proposed regularizers is on par
with models trained using computationally expensive multi-step adversarial
training methods.
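For concreteness, the following is a minimal PyTorch sketch of the single-step (FGSM-style) adversarial training loop the abstract refers to; `model`, `loader`, `optimizer`, and the epsilon value are placeholders, and the paper's three proposed regularizers are not reproduced here.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon):
    """Single-step (FGSM) adversary: one gradient-sign step on the input."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    # move each pixel one epsilon-sized step in the loss-increasing direction
    return (x_adv + epsilon * grad.sign()).clamp(0, 1).detach()

def train_epoch(model, loader, optimizer, epsilon=8 / 255):
    model.train()
    for x, y in loader:
        # augment the mini-batch with single-step adversarial counterparts
        x_adv = fgsm_perturb(model, x, y, epsilon)
        optimizer.zero_grad()
        loss = F.cross_entropy(model(torch.cat([x, x_adv])),
                               torch.cat([y, y]))
        # the paper's regularizers would add an extra term to `loss` here;
        # without one, this plain loop is prone to gradient masking
        loss.backward()
        optimizer.step()
```

Without an added regularization term, this is exactly the setting in which models can converge to the pseudo-robust minima described above.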
Related papers
- Fast Propagation is Better: Accelerating Single-step Adversarial Training via Sampling Subnetworks [69.54774045493227]
A drawback of adversarial training is the computational overhead introduced by the generation of adversarial examples.
We propose to exploit the interior building blocks of the model to improve efficiency.
Compared with previous methods, our method not only reduces the training cost but also achieves better model robustness.
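The summary above does not give details, but the general idea of crafting the adversarial example through a randomly sampled subnetwork can be sketched as follows; `TinyResMLP`, `subnet_fgsm`, and `keep_prob` are illustrative names under assumed details, not the paper's API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyResMLP(nn.Module):
    """Toy residual stack whose blocks can be skipped at attack time."""
    def __init__(self, dim=64, depth=8, num_classes=10):
        super().__init__()
        self.stem = nn.Linear(3 * 32 * 32, dim)
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            for _ in range(depth))
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x, keep_prob=1.0):
        h = self.stem(x.flatten(1))
        for block in self.blocks:
            # the identity shortcut makes it safe to drop a block entirely
            if keep_prob >= 1.0 or torch.rand(()) < keep_prob:
                h = h + block(h)
        return self.head(h)

def subnet_fgsm(model, x, y, epsilon=8 / 255, keep_prob=0.5):
    """Craft the adversarial example through a sampled subnetwork: with
    fewer active blocks, the forward/backward pass of the attack is cheaper.
    The same sampled subnetwork is used for both passes, since the gradient
    is taken through the single sampled forward computation."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv, keep_prob=keep_prob), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    return (x_adv + epsilon * grad.sign()).clamp(0, 1).detach()
```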
arXiv Detail & Related papers (2023-10-24T01:36:20Z) - The Enemy of My Enemy is My Friend: Exploring Inverse Adversaries for Improving Adversarial Training [72.39526433794707]
Adversarial training and its variants have been shown to be the most effective approaches to defend against adversarial examples.
We propose a novel adversarial training scheme that encourages the model to produce similar outputs for an adversarial example and its "inverse adversarial" counterpart.
Our training method achieves state-of-the-art robustness as well as natural accuracy.
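As a hedged sketch of the idea (not necessarily the paper's exact loss), an "inverse adversarial" example can be obtained by flipping the sign of the usual FGSM step, with a consistency term tying the two predictions together; `signed_step`, `inverse_consistency_loss`, and `lam` are illustrative names.

```python
import torch
import torch.nn.functional as F

def signed_step(model, x, y, epsilon, maximize=True):
    """One gradient-sign step; flipping the sign turns the usual adversary
    (loss up) into an 'inverse adversary' (loss down)."""
    x_ = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_), y)
    grad = torch.autograd.grad(loss, x_)[0]
    step = epsilon * grad.sign()
    x_new = x_ + step if maximize else x_ - step
    return x_new.clamp(0, 1).detach()

def inverse_consistency_loss(model, x, y, epsilon=8 / 255, lam=1.0):
    x_adv = signed_step(model, x, y, epsilon, maximize=True)
    x_inv = signed_step(model, x, y, epsilon, maximize=False)
    # pull the predictive distributions on the two examples together
    logits_adv = model(x_adv)
    log_p_adv = F.log_softmax(logits_adv, dim=1)
    p_inv = F.softmax(model(x_inv), dim=1).detach()
    consistency = F.kl_div(log_p_adv, p_inv, reduction="batchmean")
    return F.cross_entropy(logits_adv, y) + lam * consistency
```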
arXiv Detail & Related papers (2022-11-01T15:24:26Z) - Enhancing Targeted Attack Transferability via Diversified Weight Pruning [0.3222802562733786]
Malicious attackers can generate targeted adversarial examples by imposing human-imperceptible noise on images.
With cross-model transferable adversarial examples, the vulnerability of neural networks remains even if the model information is kept secret from the attacker.
Recent studies have shown the effectiveness of ensemble-based methods in generating transferable adversarial examples.
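This entry's contribution (diversifying the surrogates via weight pruning) is not reproduced here, but the ensemble-based baseline it builds on can be sketched as a targeted one-step attack on averaged logits; `ensemble_targeted_fgsm` is an illustrative name.

```python
import torch
import torch.nn.functional as F

def ensemble_targeted_fgsm(models, x, target, epsilon=8 / 255):
    """Targeted one-step attack on the averaged logits of several surrogate
    models. Averaging over surrogates tends to yield perturbations that
    transfer to unseen models better than attacking a single surrogate."""
    x_adv = x.clone().detach().requires_grad_(True)
    logits = torch.stack([m(x_adv) for m in models]).mean(dim=0)
    loss = F.cross_entropy(logits, target)
    grad = torch.autograd.grad(loss, x_adv)[0]
    # targeted attack: step *down* the loss toward the chosen target class
    return (x_adv - epsilon * grad.sign()).clamp(0, 1).detach()
```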
arXiv Detail & Related papers (2022-08-18T07:25:48Z) - Adversarial Fine-tune with Dynamically Regulated Adversary [27.034257769448914]
In many real-world applications, such as health diagnosis and autonomous surgical robotics, standard performance is valued more highly than robustness to such extremely malicious attacks.
This work proposes a simple yet effective transfer learning-based adversarial training strategy that disentangles the negative effects of adversarial samples on the model's standard performance.
In addition, we introduce a training-friendly adversarial attack algorithm, which improves adversarial robustness without introducing significant training complexity.
arXiv Detail & Related papers (2022-04-28T00:07:15Z) - Improving Non-autoregressive Generation with Mixup Training [51.61038444990301]
We present a non-autoregressive generation model based on pre-trained transformer models.
We propose a simple and effective iterative training method called MIx Source and pseudo Target.
Our experiments on three generation benchmarks, including question generation, summarization, and paraphrase generation, show that the proposed framework achieves new state-of-the-art results.
arXiv Detail & Related papers (2021-10-21T13:04:21Z) - Robustness and Generalization via Generative Adversarial Training [21.946687274313177]
We present Generative Adversarial Training, an approach that simultaneously improves the model's generalization to the test set and to out-of-domain samples.
We show that our approach not only improves performance of the model on clean images and out-of-domain samples but also makes it robust against unforeseen attacks.
arXiv Detail & Related papers (2021-09-06T22:34:04Z) - Training Meta-Surrogate Model for Transferable Adversarial Attack [98.13178217557193]
We consider adversarial attacks on a black-box model when no queries are allowed.
In this setting, many methods directly attack surrogate models and transfer the obtained adversarial examples to fool the target model.
We show that we can obtain a Meta-Surrogate Model (MSM) such that attacks on this model transfer more easily to other models.
arXiv Detail & Related papers (2021-09-05T03:27:46Z) - Multi-stage Optimization based Adversarial Training [16.295921205749934]
We propose a Multi-stage Optimization based Adversarial Training (MOAT) method that periodically trains the model on mixed benign and adversarial examples, stage by stage.
Under a similar amount of training overhead, the proposed MOAT exhibits better robustness than either single-step or multi-step adversarial training methods.
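A minimal sketch of one plausible reading of the staged schedule, assuming the stages cycle between benign, single-step, and multi-step batches (the paper's actual schedule may differ); `moat_batch` and `period` are illustrative names.

```python
import torch
import torch.nn.functional as F

def pgd_perturb(model, x, y, epsilon=8 / 255, alpha=2 / 255, steps=7):
    """Multi-step (PGD-style) adversary with an epsilon-ball projection."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = (x_adv + alpha * grad.sign()).detach()
        # project back into the epsilon-ball around x and the valid range
        x_adv = (x + (x_adv - x).clamp(-epsilon, epsilon)).clamp(0, 1)
    return x_adv.detach()

def moat_batch(model, x, y, epoch, period=3, epsilon=8 / 255):
    """Pick the training input for this batch by cycling through benign,
    single-step, and multi-step stages (the schedule here is a guess)."""
    stage = (epoch // period) % 3
    if stage == 0:
        return x                                  # benign stage
    if stage == 1:
        # single-step stage: one full-size step is equivalent to FGSM
        return pgd_perturb(model, x, y, epsilon, alpha=epsilon, steps=1)
    return pgd_perturb(model, x, y, epsilon)      # multi-step stage
```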
arXiv Detail & Related papers (2021-06-26T07:59:52Z) - Single-step Adversarial training with Dropout Scheduling [59.50324605982158]
We show that models trained using the single-step adversarial training method learn to prevent the generation of single-step adversaries.
Models trained using the proposed single-step adversarial training method are robust against both single-step and multi-step adversarial attacks.
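The summary does not spell out the schedule, but a common reading is that the dropout rate decays over training; a minimal sketch, assuming a linear decay, where `dropout_schedule`, `set_dropout`, and `p_init` are illustrative names.

```python
import torch.nn as nn

def dropout_schedule(epoch, total_epochs, p_init=0.5):
    """Linearly decay the dropout probability from p_init to 0 over training
    (the paper's exact schedule may differ)."""
    return p_init * max(0.0, 1.0 - epoch / total_epochs)

def set_dropout(model, p):
    """Update every dropout layer in the model before the next epoch, so
    dropout stays active while single-step adversaries are generated."""
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.p = p
```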
arXiv Detail & Related papers (2020-04-18T14:14:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.