Reliably fast adversarial training via latent adversarial perturbation
- URL: http://arxiv.org/abs/2104.01575v1
- Date: Sun, 4 Apr 2021 09:47:38 GMT
- Title: Reliably fast adversarial training via latent adversarial perturbation
- Authors: Geon Yeong Park, Sang Wan Lee
- Abstract summary: A single-step latent adversarial training method is proposed to mitigate the computational overhead of multi-step adversarial training.
Despite its structural simplicity, the proposed method outperforms state-of-the-art accelerated adversarial training methods.
- Score: 5.444459446244819
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While multi-step adversarial training is widely used as an
effective defense against strong adversarial attacks, its computational cost
is notoriously high compared to standard training. Several single-step
adversarial training methods have been proposed to mitigate this overhead;
however, their performance is not sufficiently reliable, depending on the
optimization setting. To overcome such limitations, we deviate from the
existing input-space-based adversarial training regime and propose a
single-step latent adversarial training method (SLAT), which leverages the
gradients of the latent representation as the latent adversarial perturbation.
We demonstrate that the L1 norm of the feature gradients is implicitly
regularized through the adopted latent perturbation, thereby recovering local
linearity and ensuring reliable performance compared to existing single-step
adversarial training methods. Because the latent perturbation is based on the
gradients of the latent representations, which can be obtained for free while
computing the input gradients, the proposed method takes roughly the same time
as the fast gradient sign method. Experimental results demonstrate that the
proposed method, despite its structural simplicity, outperforms
state-of-the-art accelerated adversarial training methods.
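The abstract gives enough of the mechanics for a rough illustration: the same backward pass that produces the input gradients also produces gradients with respect to the intermediate (latent) activations, and SLAT uses the sign of those latent gradients as a perturbation on the latent representation. The PyTorch sketch below is an interpretation of the abstract only, not the authors' released code; the encoder/head split, the single perturbed layer, and the step sizes eps_x and eps_h are assumptions.

```python
# Sketch of a single-step latent adversarial training step (PyTorch).
# Based only on the abstract: the encoder/head split, the single perturbed
# layer, eps_x, and eps_h are illustrative assumptions, not the paper's values.
import torch
import torch.nn.functional as F

def slat_step(encoder, head, x, y, optimizer, eps_x=8/255, eps_h=0.05):
    # One forward/backward pass yields both input and latent gradients.
    x = x.clone().requires_grad_(True)
    h = encoder(x)            # latent representation
    h.retain_grad()           # keep the gradient of this non-leaf tensor
    F.cross_entropy(head(h), y).backward()

    # Sign-based perturbations in input space (FGSM-style) and latent space.
    x_adv = (x + eps_x * x.grad.sign()).clamp(0, 1).detach()  # assumes inputs in [0, 1]
    delta_h = (eps_h * h.grad.sign()).detach()

    # Train on the perturbed input with the latent perturbation injected.
    optimizer.zero_grad()
    loss_adv = F.cross_entropy(head(encoder(x_adv) + delta_h), y)
    loss_adv.backward()
    optimizer.step()
    return loss_adv.item()
```

A full implementation might apply the latent perturbation at several layers and fold the clean-loss pass into the adversarial pass; the point of the sketch is only that h.grad falls out of the same backward pass as x.grad, which is why the claimed cost stays close to FGSM.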
Related papers
- Achieving $\widetilde{\mathcal{O}}(\sqrt{T})$ Regret in Average-Reward POMDPs with Known Observation Models [56.92178753201331]
We tackle average-reward infinite-horizon POMDPs with an unknown transition model.
We present a novel and simple estimator that overcomes this barrier.
arXiv Detail & Related papers (2025-01-30T22:29:41Z) - Conflict-Aware Adversarial Training [29.804312958830636]
We argue that the weighted-average method does not provide the best tradeoff for the standard performance and adversarial robustness.
We propose a new trade-off paradigm for adversarial training with a conflict-aware factor for the convex combination of standard and adversarial loss, named Conflict-Aware Adversarial Training (CA-AT).
arXiv Detail & Related papers (2024-10-21T23:44:03Z) - Revisiting and Exploring Efficient Fast Adversarial Training via LAW:
Lipschitz Regularization and Auto Weight Averaging [73.78965374696608]
We study over 10 fast adversarial training methods in terms of adversarial robustness and training costs.
We revisit the effectiveness and efficiency of fast adversarial training techniques in preventing Catastrophic Overfitting.
We propose an FGSM-based fast adversarial training method equipped with Lipschitz regularization and Auto Weight Averaging.
arXiv Detail & Related papers (2023-08-22T13:50:49Z) - Improving Fast Adversarial Training with Prior-Guided Knowledge [80.52575209189365]
We investigate the relationship between adversarial example quality and catastrophic overfitting by comparing the training processes of standard adversarial training and fast adversarial training.
We find that catastrophic overfitting occurs when the attack success rate of the adversarial examples drops.
arXiv Detail & Related papers (2023-04-01T02:18:12Z) - Robust Upper Bounds for Adversarial Training [4.971729553254843]
We introduce a new approach to adversarial training by minimizing an upper bound of the adversarial loss.
This bound is based on a holistic expansion of the network instead of separate bounds for each layer.
We derive two new methods with the proposed approach.
arXiv Detail & Related papers (2021-12-17T01:52:35Z) - Contrastive Learning for Fair Representations [50.95604482330149]
Trained classification models can unintentionally lead to biased representations and predictions.
Existing debiasing methods for classification models, such as adversarial training, are often expensive to train and difficult to optimise.
We propose a method for mitigating bias by incorporating contrastive learning, in which instances sharing the same class label are encouraged to have similar representations.
arXiv Detail & Related papers (2021-09-22T10:47:51Z) - Adaptive perturbation adversarial training: based on reinforcement
learning [9.563820241076103]
One of the shortcomings of adversarial training is that it will reduce the recognition accuracy of normal samples.
Adaptive adversarial training is proposed to alleviate this problem.
It performs adversarial training on marginal adversarial samples that are close to, but do not cross, the decision boundary.
arXiv Detail & Related papers (2021-08-30T13:49:55Z) - Robust Single-step Adversarial Training with Regularizer [11.35007968593652]
We propose a novel Fast Gradient Sign Method with PGD Regularization (FGSMPR) to boost the efficiency of adversarial training without catastrophic overfitting.
Experiments demonstrate that our proposed method can train a robust deep network against $L_\infty$-perturbations with FGSM adversarial training (a minimal sketch of the plain FGSM training step appears after this list).
arXiv Detail & Related papers (2021-02-05T19:07:10Z) - Efficient Robust Training via Backward Smoothing [125.91185167854262]
Adversarial training is the most effective strategy in defending against adversarial examples.
It suffers from high computational costs due to the iterative adversarial attacks in each training step.
Recent studies show that it is possible to achieve fast adversarial training by performing a single-step attack.
arXiv Detail & Related papers (2020-10-03T04:37:33Z) - Single-step Adversarial training with Dropout Scheduling [59.50324605982158]
We show that models trained using single-step adversarial training learn to prevent the generation of single-step adversaries.
Models trained using the proposed single-step adversarial training method are robust against both single-step and multi-step adversarial attacks.
arXiv Detail & Related papers (2020-04-18T14:14:00Z)
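Several of the entries above (FGSMPR, backward smoothing, dropout scheduling) build on, or regularize, the same single-step baseline: plain FGSM adversarial training. For reference, a minimal PyTorch sketch of that baseline step follows; the eps value and the input range are placeholder assumptions, and the additional regularization terms those papers introduce are not shown.

```python
# Reference sketch of a plain single-step (FGSM) adversarial training step
# in PyTorch. eps and the [0, 1] input range are placeholder assumptions;
# the extra regularizers proposed in the papers above are not shown.
import torch
import torch.nn.functional as F

def fgsm_step(model, x, y, optimizer, eps=8/255):
    # Craft the one-step adversarial example from the input gradient sign.
    x_req = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x_req), y)
    grad, = torch.autograd.grad(loss, x_req)
    x_adv = (x + eps * grad.sign()).clamp(0, 1).detach()

    # Update the model on the adversarial example.
    optimizer.zero_grad()
    adv_loss = F.cross_entropy(model(x_adv), y)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```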