Reducing Adversarial Training Cost with Gradient Approximation
- URL: http://arxiv.org/abs/2309.09464v3
- Date: Tue, 10 Oct 2023 05:59:43 GMT
- Title: Reducing Adversarial Training Cost with Gradient Approximation
- Authors: Huihui Gong
- Abstract summary: We propose a new, efficient adversarial training method, adversarial training with gradient approximation (GAAT), to reduce the cost of building robust models.
Our proposed method saves up to 60% of training time with comparable model test accuracy on the MNIST, CIFAR-10, and CIFAR-100 datasets.
- Score: 0.3916094706589679
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning models have achieved state-of-the-art performance in various domains, yet they remain vulnerable to inputs with well-crafted but small perturbations, known as adversarial examples (AEs). Among the many strategies for improving model robustness against AEs, Projected Gradient Descent (PGD) based adversarial training is one of the most effective. Unfortunately, the prohibitive computational overhead of generating sufficiently strong AEs, caused by the inner maximization of the loss function, can make regular PGD adversarial training impractical for larger and more complicated models. In this paper, we propose approximating the adversarial loss by a partial sum of its Taylor series. Building on this, we approximate the gradient of the adversarial loss and propose a new, efficient adversarial training method, adversarial training with gradient approximation (GAAT), to reduce the cost of building robust models. Extensive experiments demonstrate that this efficiency gain comes with little or no loss in accuracy on natural and adversarial examples: our method saves up to 60% of the training time while achieving comparable test accuracy on the MNIST, CIFAR-10, and CIFAR-100 datasets.
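The abstract leaves the expansion order unspecified, so the following is only a hedged sketch of how a first-order (assumed) Taylor surrogate could replace PGD's inner maximization. Here $f_\theta$ denotes the model, $L$ the loss, and $\epsilon$ the $L_\infty$ perturbation budget:

```latex
% Sketch: first-order Taylor surrogate for the adversarial loss.
% The paper states only that a partial sum of the Taylor series is used;
% the first-order truncation below is an illustrative assumption.
\[
  \max_{\|\delta\|_\infty \le \epsilon} L\big(f_\theta(x+\delta),\, y\big)
  \;\approx\;
  \max_{\|\delta\|_\infty \le \epsilon}
    \Big[\, L\big(f_\theta(x),\, y\big)
      + \delta^{\top} \nabla_x L\big(f_\theta(x),\, y\big) \Big]
  \;=\;
  L\big(f_\theta(x),\, y\big)
  + \epsilon \,\big\|\nabla_x L\big(f_\theta(x),\, y\big)\big\|_1 .
\]
```

The closed form on the right follows because a linear function of $\delta$ over an $L_\infty$ ball is maximized at $\delta = \epsilon\,\mathrm{sign}(\nabla_x L)$, so the multi-step attack is replaced by a single gradient computation.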
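A minimal PyTorch sketch of training against this surrogate follows; `approx_adv_loss`, `train_epoch`, and the `eps` value are illustrative names and settings, not the paper's actual GAAT implementation:

```python
import torch
import torch.nn.functional as F

def approx_adv_loss(model, x, y, eps):
    """First-order surrogate for the L_inf adversarial loss (sketch).

    Uses L(x + delta) ~= L(x) + delta^T grad_x L(x); maximizing the
    linear term over ||delta||_inf <= eps gives
    L(x) + eps * ||grad_x L(x)||_1, so no PGD iterations are needed.
    """
    x = x.clone().requires_grad_(True)
    clean_loss = F.cross_entropy(model(x), y)
    # create_graph=True keeps the surrogate differentiable w.r.t. the
    # model parameters, at the cost of one double-backward pass.
    (grad_x,) = torch.autograd.grad(clean_loss, x, create_graph=True)
    l1_norm = grad_x.abs().flatten(start_dim=1).sum(dim=1).mean()
    return clean_loss + eps * l1_norm

def train_epoch(model, loader, optimizer, eps=8 / 255):
    """One epoch of adversarial training against the surrogate loss."""
    model.train()
    for x, y in loader:
        optimizer.zero_grad()
        loss = approx_adv_loss(model, x, y, eps)
        loss.backward()
        optimizer.step()
```

Relative to k-step PGD training, each batch costs roughly one extra backward pass instead of k forward/backward passes for attack generation, which is consistent in spirit with the reported time savings, though the paper's exact gradient approximation may differ.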
Related papers
- Perturbation-Invariant Adversarial Training for Neural Ranking Models: Improving the Effectiveness-Robustness Trade-Off [107.35833747750446]
Adversarial examples can be crafted by adding imperceptible perturbations to legitimate documents.
This vulnerability raises significant concerns about the reliability of neural ranking models (NRMs) and hinders their widespread deployment.
In this study, we establish theoretical guarantees regarding the effectiveness-robustness trade-off in NRMs.
arXiv Detail & Related papers (2023-12-16T05:38:39Z)
- Class Incremental Learning for Adversarial Robustness [17.06592851567578]
Adversarial training integrates adversarial examples during model training to enhance robustness.
We observe that combining incremental learning with naive adversarial training easily leads to a loss of robustness.
We propose the Flatness Preserving Distillation (FPD) loss that leverages the output difference between adversarial and clean examples.
arXiv Detail & Related papers (2023-12-06T04:38:02Z)
- Mutual Adversarial Training: Learning together is better than going alone [82.78852509965547]
We study how interactions among models affect robustness via knowledge distillation.
We propose mutual adversarial training (MAT) in which multiple models are trained together.
MAT can effectively improve model robustness and outperform state-of-the-art methods under white-box attacks.
arXiv Detail & Related papers (2021-12-09T15:59:42Z)
- Adversarial Feature Stacking for Accurate and Robust Predictions [4.208059346198116]
The Adversarial Feature Stacking (AFS) model can jointly take advantage of features with varied levels of robustness and accuracy.
We evaluate the AFS model on CIFAR-10 and CIFAR-100 datasets with strong adaptive attack methods.
arXiv Detail & Related papers (2021-03-24T12:01:24Z)
- Robust Single-step Adversarial Training with Regularizer [11.35007968593652]
We propose a novel Fast Gradient Sign Method with PGD Regularization (FGSMPR) to boost the efficiency of adversarial training without catastrophic overfitting.
Experiments demonstrate that our proposed method can train a robust deep network for $L_\infty$-perturbations with FGSM adversarial training.
arXiv Detail & Related papers (2021-02-05T19:07:10Z)
- A Simple Fine-tuning Is All You Need: Towards Robust Deep Learning Via Adversarial Fine-tuning [90.44219200633286]
We propose a simple yet very effective adversarial fine-tuning approach based on a "slow start, fast decay" learning rate scheduling strategy.
Experimental results show that the proposed adversarial fine-tuning approach outperforms the state-of-the-art methods on CIFAR-10, CIFAR-100 and ImageNet datasets.
arXiv Detail & Related papers (2020-12-25T20:50:15Z)
- Boosting Adversarial Training with Hypersphere Embedding [53.75693100495097]
Adversarial training is one of the most effective defenses against adversarial attacks for deep learning models.
In this work, we advocate incorporating the hypersphere embedding mechanism into the adversarial training (AT) procedure.
We validate our methods under a wide range of adversarial attacks on the CIFAR-10 and ImageNet datasets.
arXiv Detail & Related papers (2020-02-20T08:42:29Z)
- Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to unreliable robustness against other, unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
arXiv Detail & Related papers (2020-02-14T12:36:59Z)
- Improving the affordability of robustness training for DNNs [11.971637253035107]
We show that the initial phase of adversarial training is redundant and can be replaced with natural training, which significantly improves computational efficiency.
We show that our proposed method can reduce the training time by a factor of up to 2.5 with comparable or better model test accuracy and generalization on various strengths of adversarial attacks.
arXiv Detail & Related papers (2020-02-11T07:29:45Z)
- Efficient Adversarial Training with Transferable Adversarial Examples [58.62766224452761]
We show that there is high transferability between models from neighboring epochs in the same training process.
We propose a novel method, Adversarial Training with Transferable Adversarial Examples (ATTA) that can enhance the robustness of trained models.
arXiv Detail & Related papers (2019-12-27T03:05:05Z)