DropAttack: A Masked Weight Adversarial Training Method to Improve
Generalization of Neural Networks
- URL: http://arxiv.org/abs/2108.12805v1
- Date: Sun, 29 Aug 2021 10:09:43 GMT
- Title: DropAttack: A Masked Weight Adversarial Training Method to Improve
Generalization of Neural Networks
- Authors: Shiwen Ni, Jiawen Li and Hung-Yu Kao
- Abstract summary: We propose a novel masked weight adversarial training method called DropAttack.
DropAttack enhances the generalization of a model by adding intentionally worst-case adversarial perturbations to both the input and hidden layers.
We compare the proposed method with other adversarial training and regularization methods, and it achieves state-of-the-art results on all datasets.
- Score: 7.519872646378836
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training has been proven to be a powerful regularization method
to improve the generalization of models. However, current adversarial training
methods only attack the original input sample or the embedding vectors, and
their attacks lack coverage and diversity. To further enhance the breadth and
depth of attack, we propose a novel masked weight adversarial training method
called DropAttack, which enhances the generalization of a model by adding
intentionally worst-case adversarial perturbations to both the input and hidden
layers in different dimensions and minimizing the adversarial risks generated by
each layer. DropAttack is a general technique and can be applied to a wide
variety of neural networks with different architectures. To validate the
effectiveness of the proposed method, we used five public datasets in the
fields of natural language processing (NLP) and computer vision (CV) for
experimental evaluation. We compare the proposed method with other adversarial
training and regularization methods, and our method achieves state-of-the-art
results on all datasets. In addition, DropAttack can achieve the same
performance using only half the training data required by standard training
methods. Theoretical analysis reveals that DropAttack can perform gradient
regularization at random on some of the input and weight parameters of the
model. Further visualization experiments show that DropAttack can push the
minimum risk of the model to a lower and flatter loss landscape. Our source
code is publicly available at https://github.com/nishiwen1214/DropAttack.
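As a concrete illustration of the mechanism described above, the following is a minimal PyTorch sketch of a DropAttack-style training step. It assumes one-step (FGSM-style) perturbations, and the names eps_x, eps_w, and p_attack are illustrative rather than the paper's notation; the authors' reference implementation is in the repository linked above.

```python
import torch
import torch.nn.functional as F

def dropattack_step(model, x, y, optimizer,
                    eps_x=0.01, eps_w=0.01, p_attack=0.7):
    """Sketch of one masked-weight adversarial training step.

    Assumptions (not from the paper): one-step sign-gradient
    perturbations, a cross-entropy task, and a uniform Bernoulli
    mask probability p_attack for both inputs and weights.
    """
    model.train()
    x = x.detach().requires_grad_(True)

    # 1) Clean forward/backward: gradients w.r.t. the input and the weights.
    optimizer.zero_grad()
    clean_loss = F.cross_entropy(model(x), y)
    clean_loss.backward()

    # 2) Random Bernoulli masks decide which input entries and which
    #    weight entries are attacked (the "masked" part of DropAttack).
    with torch.no_grad():
        mask_x = torch.bernoulli(torch.full_like(x, p_attack))
        delta_x = eps_x * mask_x * x.grad.sign()

        deltas_w = []
        for p in model.parameters():
            if p.grad is None:
                deltas_w.append(None)
                continue
            mask_w = torch.bernoulli(torch.full_like(p, p_attack))
            d = eps_w * mask_w * p.grad.sign()
            p.add_(d)            # perturb the weights in place
            deltas_w.append(d)

    # 3) Adversarial forward/backward on the perturbed input and weights;
    #    gradients accumulate on top of the clean gradients, so the update
    #    minimizes both the clean and the adversarial risk.
    adv_loss = F.cross_entropy(model(x.detach() + delta_x), y)
    adv_loss.backward()

    # 4) Restore the original weights, then apply the parameter update.
    with torch.no_grad():
        for p, d in zip(model.parameters(), deltas_w):
            if d is not None:
                p.sub_(d)
    optimizer.step()
    return clean_loss.item(), adv_loss.item()
```

The weight perturbations are removed before optimizer.step(), so the attack only shapes the gradients rather than permanently corrupting the parameters; accumulating the clean and adversarial gradients in one update mirrors the abstract's goal of minimizing the adversarial risk generated by each perturbed layer.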
Related papers
- Fast Propagation is Better: Accelerating Single-Step Adversarial Training via Sampling Subnetworks [69.54774045493227]
A drawback of adversarial training is the computational overhead introduced by the generation of adversarial examples.
We propose to exploit the interior building blocks of the model to improve efficiency.
Compared with previous methods, our method not only reduces the training cost but also achieves better model robustness.
arXiv Detail & Related papers (2023-10-24T01:36:20Z)
- Enhancing Targeted Attack Transferability via Diversified Weight Pruning [0.3222802562733786]
Malicious attackers can generate targeted adversarial examples by imposing human-imperceptible noise on images.
With cross-model transferable adversarial examples, the vulnerability of neural networks remains even if the model information is kept secret from the attacker.
Recent studies have shown the effectiveness of ensemble-based methods in generating transferable adversarial examples.
arXiv Detail & Related papers (2022-08-18T07:25:48Z)
- One-shot Neural Backdoor Erasing via Adversarial Weight Masking [8.345632941376673]
Adversarial Weight Masking (AWM) is a novel method capable of erasing neural backdoors even in the one-shot setting.
AWM largely improves the purifying effect over other state-of-the-art methods across various training-dataset sizes.
arXiv Detail & Related papers (2022-07-10T16:18:39Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach known as adversarial training (AT) has been shown to mitigate their impact.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
- Learning to Learn Transferable Attack [77.67399621530052]
Transfer adversarial attack is a non-trivial black-box attack that crafts adversarial perturbations on a surrogate model and then applies those perturbations to the victim model (a minimal sketch of this surrogate-to-victim protocol appears after this list).
We propose a Learning to Learn Transferable Attack (LLTA) method, which makes the adversarial perturbations more generalizable by learning from both data and model augmentation.
Empirical results on a widely used dataset demonstrate the effectiveness of the attack, with a 12.85% higher transfer-attack success rate than state-of-the-art methods.
arXiv Detail & Related papers (2021-12-10T07:24:21Z)
- Delving into Data: Effectively Substitute Training for Black-box Attack [84.85798059317963]
We propose a novel perspective on substitute training that focuses on designing the distribution of data used in the knowledge-stealing process.
The combination of these two modules further boosts the consistency between the substitute model and the target model, which greatly improves the effectiveness of the adversarial attack.
arXiv Detail & Related papers (2021-04-26T07:26:29Z)
- Advanced Dropout: A Model-free Methodology for Bayesian Dropout Optimization [62.8384110757689]
Overfitting ubiquitously exists in real-world applications of deep neural networks (DNNs).
The advanced dropout technique applies a model-free and easily implemented distribution with a parametric prior, and adaptively adjusts the dropout rate.
We evaluate the effectiveness of advanced dropout against nine dropout techniques on seven computer vision datasets.
arXiv Detail & Related papers (2020-10-11T13:19:58Z)
- Single-step Adversarial training with Dropout Scheduling [59.50324605982158]
We show that models trained using the single-step adversarial training method learn to prevent the generation of single-step adversaries.
Models trained with the proposed single-step adversarial training method are robust against both single-step and multi-step adversarial attacks.
arXiv Detail & Related papers (2020-04-18T14:14:00Z)
- Applying Tensor Decomposition to image for Robustness against Adversarial Attack [3.347059384111439]
Deep learning models can easily be fooled by adding small perturbations to the input.
In this paper, we suggest using tensor decomposition to defend the model against adversarial examples.
arXiv Detail & Related papers (2020-02-28T18:30:22Z)
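To complement the transfer-attack entry above, here is a minimal sketch of the surrogate-to-victim protocol. It uses a plain one-step FGSM attack rather than the LLTA method itself, and the model handles and the eps value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def transfer_attack(surrogate, victim, x, y, eps=8 / 255):
    """Sketch of a black-box transfer attack: perturbations are
    crafted with white-box access to a surrogate model (plain FGSM
    here, not LLTA) and then evaluated against the victim model."""
    x_adv = x.detach().requires_grad_(True)

    # Craft the perturbation using gradients of the surrogate only.
    loss = F.cross_entropy(surrogate(x_adv), y)
    loss.backward()
    with torch.no_grad():
        x_adv = (x + eps * x_adv.grad.sign()).clamp(0.0, 1.0)

    # Apply the same perturbed inputs to the victim (no gradient access).
    with torch.no_grad():
        fooled = (victim(x_adv).argmax(dim=1) != y).float().mean()
    return x_adv, fooled.item()
```

The returned fooled rate measures cross-model transferability: the fraction of perturbed inputs that the victim misclassifies even though the attacker never accessed its gradients.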
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.