Boundary Adversarial Examples Against Adversarial Overfitting
- URL: http://arxiv.org/abs/2211.14088v1
- Date: Fri, 25 Nov 2022 13:16:53 GMT
- Title: Boundary Adversarial Examples Against Adversarial Overfitting
- Authors: Muhammad Zaid Hameed, Beat Buesser
- Abstract summary: Adversarial training approaches suffer from robust overfitting, where robust accuracy decreases when models are adversarially trained for too long.
Several mitigation approaches, including early stopping, temporal ensembling and weight perturbations, have been proposed to mitigate the effect of robust overfitting.
In this paper, we investigate whether these mitigation approaches are complementary to each other in improving adversarial training performance.
- Score: 4.391102490444538
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Standard adversarial training approaches suffer from robust overfitting where
the robust accuracy decreases when models are adversarially trained for too
long. The origin of this problem is still unclear, and conflicting explanations
have been reported, e.g., memorization effects induced by large-loss data versus
small-loss data, and growing differences in the loss distribution of training
samples as adversarial training progresses. Consequently, several
mitigation approaches including early stopping, temporal ensembling and weight
perturbations on small loss data have been proposed to mitigate the effect of
robust overfitting. However, a side effect of these strategies is a larger
reduction in clean accuracy compared to standard adversarial training. In this
paper, we investigate whether these mitigation approaches are complementary to each
other in improving adversarial training performance. We further propose the use
of helper adversarial examples that can be obtained with minimal cost in the
adversarial example generation, and show how they increase the clean accuracy
in the existing approaches without compromising the robust accuracy.
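One way to read "helper adversarial examples obtained with minimal cost" is to reuse the intermediate iterates that a PGD attack already computes on its way to the final adversarial example. The following is a minimal NumPy sketch on a logistic model; the function name, model, and step sizes are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def pgd_with_helpers(x, y, w, b, eps=0.3, alpha=0.1, steps=5):
    """PGD attack on a logistic model p = sigmoid(w.x + b).

    Returns the final adversarial example plus the intermediate
    iterates, which can serve as 'helper' examples at no extra cost
    since PGD computes them anyway."""
    x_adv = x.copy()
    iterates = []
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w @ x_adv + b)))
        grad = (p - y) * w                         # d(BCE)/dx for this model
        x_adv = x_adv + alpha * np.sign(grad)      # ascend the loss
        x_adv = x + np.clip(x_adv - x, -eps, eps)  # project into the eps-ball
        iterates.append(x_adv.copy())
    return x_adv, iterates[:-1]  # all but the last iterate are helpers

x_adv, helpers = pgd_with_helpers(np.array([0.5, 0.5]), 1.0,
                                  np.array([1.0, -2.0]), 0.0)
```

The helpers lie between the clean input and the final adversarial example, so training on them can pull the model back toward clean accuracy without discarding the robust objective.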
Related papers
- Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z) - DSRM: Boost Textual Adversarial Training with Distribution Shift Risk Minimization [36.10642858867033]
Adversarial training is one of the best-performing methods in improving the robustness of deep language models.
We introduce a novel, effective procedure for instead adversarial training with only clean data.
Our approach requires zero adversarial samples for training and reduces time consumption by up to 70% compared to current best-performing adversarial training methods.
arXiv Detail & Related papers (2023-06-27T02:46:08Z) - Understanding and Combating Robust Overfitting via Input Loss Landscape Analysis and Regularization [5.1024659285813785]
Adversarial training is prone to overfitting, and the cause is far from clear.
We find that robust overfitting results from standard training, specifically the minimization of the clean loss.
We propose a new regularizer to smooth the loss landscape by penalizing the weighted logits variation along the adversarial direction.
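A regularizer of the kind described can be sketched as a per-sample weighted penalty on how much the logits change along the adversarial direction. The linear "model" and the weighting below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def logit_variation_penalty(X, X_adv, W, weights):
    """Mean per-sample weighted squared change in logits between the
    clean inputs X and their adversarial counterparts X_adv."""
    dz = X_adv @ W - X @ W          # logit variation along the adv. direction
    return np.mean(weights * np.sum(dz ** 2, axis=1))

penalty = logit_variation_penalty(np.array([[1.0, 0.0]]),
                                  np.array([[1.1, 0.0]]),
                                  np.eye(2),
                                  np.array([2.0]))
```

Adding such a term to the training loss penalizes sharp changes in the model's output near training points, i.e., it smooths the input loss landscape.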
arXiv Detail & Related papers (2022-12-09T16:55:30Z) - Efficient Adversarial Training With Data Pruning [26.842714298874192]
We show that data pruning leads to improvements in convergence and reliability of adversarial training.
In some settings data pruning brings benefits from both worlds: it improves both adversarial accuracy and training time.
arXiv Detail & Related papers (2022-07-01T23:54:46Z) - Understanding Robust Overfitting of Adversarial Training and Beyond [103.37117541210348]
We show that robust overfitting widely exists in adversarial training of deep networks.
We propose minimum-loss constrained adversarial training (MLCAT).
In a minibatch, we learn large-loss data as usual, and adopt additional measures to increase the loss of the small-loss data.
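The minibatch scheme described above can be sketched as follows: samples whose loss is already small get an extra loss-increasing perturbation step, while large-loss samples are left alone. The logistic model, threshold, and step size are illustrative assumptions; MLCAT's actual mechanism differs in detail.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(p, y):
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def lift_small_loss(X, y, w, b, tau=0.5, alpha=0.2):
    """Give samples whose loss is below tau one extra loss-increasing
    perturbation step; leave large-loss samples untouched."""
    p = sigmoid(X @ w + b)
    loss = bce(p, y)
    small = loss < tau                    # mask of small-loss samples
    grad = (p - y)[:, None] * w[None, :]  # d(BCE)/dX, one row per sample
    X_out = X.copy()
    X_out[small] += alpha * np.sign(grad[small])  # lift their loss
    return X_out, small
```

The intent is that small-loss data keep contributing learning signal instead of being memorized, which is the failure mode the paper associates with robust overfitting.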
arXiv Detail & Related papers (2022-06-17T10:25:17Z) - Mixing between the Cross Entropy and the Expectation Loss Terms [89.30385901335323]
Cross entropy loss tends to focus on hard-to-classify samples during training.
We show that adding the expectation loss to the optimization goal helps the network achieve better accuracy.
Our experiments show that the new training protocol improves performance across a diverse set of classification domains.
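One common reading of the expectation loss is the expected 0-1 error under the softmax distribution, i.e. 1 - p_y; the mixed objective below is a hedged sketch under that assumption, with the mixing weight `lam` as an illustrative parameter.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def mixed_loss(logits, y, lam=0.5):
    """lam * cross-entropy + (1 - lam) * expectation loss.

    Cross entropy grows without bound on hard samples, while the
    expectation loss 1 - p_y is bounded in [0, 1), so mixing the two
    down-weights very hard-to-classify samples."""
    p = softmax(logits)
    return lam * (-np.log(p[y])) + (1.0 - lam) * (1.0 - p[y])
```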
arXiv Detail & Related papers (2021-09-12T23:14:06Z) - Adaptive perturbation adversarial training: based on reinforcement learning [9.563820241076103]
One of the shortcomings of adversarial training is that it will reduce the recognition accuracy of normal samples.
Adaptive adversarial training is proposed to alleviate this problem.
It uses marginal adversarial samples, which are close to the decision boundary but do not cross it, for adversarial training.
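A minimal sketch of such marginal samples: take loss-increasing steps but keep only iterates the model still classifies correctly, so the sample approaches the decision boundary without crossing it. The logistic model and step sizes are assumptions for illustration (the paper itself tunes the perturbation with reinforcement learning).

```python
import numpy as np

def marginal_example(x, y, w, b, alpha=0.05, steps=20):
    """Move x toward higher loss on a logistic model p = sigmoid(w.x + b),
    stopping just before the prediction flips."""
    x_adv = x.copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w @ x_adv + b)))
        grad = (p - y) * w
        candidate = x_adv + alpha * np.sign(grad)
        pred = 1.0 if (w @ candidate + b) > 0 else 0.0
        if pred != y:        # next step would cross the boundary: stop
            break
        x_adv = candidate    # still on the correct side: accept
    return x_adv

x_m = marginal_example(np.array([1.0, 0.0]), 1.0, np.array([1.0, 0.0]), 0.0)
```

Training on such samples pushes the boundary away from the data without forcing the model to fit mislabeled-looking points on the wrong side, which is how it aims to preserve clean accuracy.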
arXiv Detail & Related papers (2021-08-30T13:49:55Z) - Improving White-box Robustness of Pre-processing Defenses via Joint Adversarial Training [106.34722726264522]
A range of adversarial defense techniques have been proposed to mitigate the interference of adversarial noise.
Pre-processing methods may suffer from the robustness degradation effect.
A potential cause of this negative effect is that adversarial training examples are static and independent of the pre-processing model.
We propose a method called Joint Adversarial Training based Pre-processing (JATP) defense.
arXiv Detail & Related papers (2021-06-10T01:45:32Z) - Understanding Catastrophic Overfitting in Single-step Adversarial Training [9.560980936110234]
"catastrophic overfitting" is a phenomenon in which the robust accuracy against projected gradient descent suddenly decreases to 0% after a few epochs.
We propose a simple method that not only prevents catastrophic overfitting, but also challenges the belief that it is difficult to prevent multi-step adversarial attacks with single-step adversarial training.
arXiv Detail & Related papers (2020-10-05T06:13:35Z) - On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them [57.957466608543676]
We analyze the influence of adversarial training on the loss landscape of machine learning models.
We show that the adversarial loss landscape is less favorable to optimization, due to increased curvature and more scattered gradients.
arXiv Detail & Related papers (2020-06-15T13:50:23Z) - Precise Tradeoffs in Adversarial Training for Linear Regression [55.764306209771405]
We provide a precise and comprehensive understanding of the role of adversarial training in the context of linear regression with Gaussian features.
We precisely characterize the standard/robust accuracy and the corresponding tradeoff achieved by a contemporary mini-max adversarial training approach.
Our theory for adversarial training algorithms also facilitates the rigorous study of how a variety of factors (size and quality of training data, model overparametrization etc.) affect the tradeoff between these two competing accuracies.
arXiv Detail & Related papers (2020-02-24T19:01:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.