Boundary Adversarial Examples Against Adversarial Overfitting
- URL: http://arxiv.org/abs/2211.14088v1
- Date: Fri, 25 Nov 2022 13:16:53 GMT
- Title: Boundary Adversarial Examples Against Adversarial Overfitting
- Authors: Muhammad Zaid Hameed, Beat Buesser
- Abstract summary: Adversarial training approaches suffer from robust overfitting, where robust accuracy decreases when models are adversarially trained for too long.
Several mitigation approaches, including early stopping, temporal ensembling and weight perturbations, have been proposed to mitigate the effect of robust overfitting.
In this paper, we investigate whether these mitigation approaches are complementary to each other in improving adversarial training performance.
- Score: 4.391102490444538
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Standard adversarial training approaches suffer from robust overfitting where
the robust accuracy decreases when models are adversarially trained for too
long. The origin of this problem is still unclear, and conflicting explanations
have been reported, e.g., memorization effects induced by large-loss data versus
small-loss data, and growing differences in the loss distribution of training
samples as adversarial training progresses. Consequently, several
mitigation approaches including early stopping, temporal ensembling and weight
perturbations on small loss data have been proposed to mitigate the effect of
robust overfitting. However, a side effect of these strategies is a larger
reduction in clean accuracy compared to standard adversarial training. In this
paper, we investigate whether these mitigation approaches are complementary to each
other in improving adversarial training performance. We further propose the use
of helper adversarial examples that can be obtained with minimal cost in the
adversarial example generation, and show how they increase the clean accuracy
in the existing approaches without compromising the robust accuracy.
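One way to read "helper adversarial examples obtained with minimal cost" is to reuse the intermediate iterates that a PGD attack already computes on its way to the final adversarial example. The following is a minimal NumPy sketch on a logistic model; the function name, model, and step sizes are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def pgd_with_helpers(x, y, w, b, eps=0.3, alpha=0.1, steps=5):
    """PGD attack on a logistic model p = sigmoid(w.x + b).

    Returns the final adversarial example plus the intermediate
    iterates, which can serve as 'helper' examples at no extra cost
    since PGD computes them anyway."""
    x_adv = x.copy()
    iterates = []
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w @ x_adv + b)))
        grad = (p - y) * w                         # d(BCE)/dx for this model
        x_adv = x_adv + alpha * np.sign(grad)      # ascend the loss
        x_adv = x + np.clip(x_adv - x, -eps, eps)  # project into the eps-ball
        iterates.append(x_adv.copy())
    return x_adv, iterates[:-1]  # all but the last iterate are helpers

x_adv, helpers = pgd_with_helpers(np.array([0.5, 0.5]), 1.0,
                                  np.array([1.0, -2.0]), 0.0)
```

The helpers lie between the clean input and the final adversarial example, so training on them can pull the model back toward clean accuracy without discarding the robust objective.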
Related papers
- Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z) - DSRM: Boost Textual Adversarial Training with Distribution Shift Risk Minimization [36.10642858867033]
Adversarial training is one of the best-performing methods in improving the robustness of deep language models.
We introduce a novel, effective procedure for instead adversarial training with only clean data.
Our approach requires zero adversarial samples for training and reduces time consumption by up to 70% compared to current best-performing adversarial training methods.
arXiv Detail & Related papers (2023-06-27T02:46:08Z) - Understanding and Combating Robust Overfitting via Input Loss Landscape Analysis and Regularization [5.1024659285813785]
Adversarial training is prone to overfitting, and the cause is far from clear.
We find that robust overfitting results from standard training, specifically the minimization of the clean loss.
We propose a new regularizer to smooth the loss landscape by penalizing the weighted logits variation along the adversarial direction.
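A regularizer of the kind described can be sketched as a per-sample weighted penalty on how much the logits change along the adversarial direction. The linear "model" and the weighting below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def logit_variation_penalty(X, X_adv, W, weights):
    """Mean per-sample weighted squared change in logits between the
    clean inputs X and their adversarial counterparts X_adv."""
    dz = X_adv @ W - X @ W          # logit variation along the adv. direction
    return np.mean(weights * np.sum(dz ** 2, axis=1))

penalty = logit_variation_penalty(np.array([[1.0, 0.0]]),
                                  np.array([[1.1, 0.0]]),
                                  np.eye(2),
                                  np.array([2.0]))
```

Adding such a term to the training loss penalizes sharp changes in the model's output near training points, i.e., it smooths the input loss landscape.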
arXiv Detail & Related papers (2022-12-09T16:55:30Z) - Efficient Adversarial Training With Data Pruning [26.842714298874192]
We show that data pruning leads to improvements in convergence and reliability of adversarial training.
In some settings data pruning brings benefits from both worlds: it improves both adversarial accuracy and training time.
arXiv Detail & Related papers (2022-07-01T23:54:46Z) - Understanding Robust Overfitting of Adversarial Training and Beyond [103.37117541210348]
We show that robust overfitting widely exists in adversarial training of deep networks.
We propose minimum-loss constrained adversarial training (MLCAT).
In a minibatch, we learn large-loss data as usual, and adopt additional measures to increase the loss of the small-loss data.
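The minibatch scheme described above can be sketched as follows: samples whose loss is already small get an extra loss-increasing perturbation step, while large-loss samples are left alone. The logistic model, threshold, and step size are illustrative assumptions; MLCAT's actual mechanism differs in detail.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(p, y):
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def lift_small_loss(X, y, w, b, tau=0.5, alpha=0.2):
    """Give samples whose loss is below tau one extra loss-increasing
    perturbation step; leave large-loss samples untouched."""
    p = sigmoid(X @ w + b)
    loss = bce(p, y)
    small = loss < tau                    # mask of small-loss samples
    grad = (p - y)[:, None] * w[None, :]  # d(BCE)/dX, one row per sample
    X_out = X.copy()
    X_out[small] += alpha * np.sign(grad[small])  # lift their loss
    return X_out, small
```

The intent is that small-loss data keep contributing learning signal instead of being memorized, which is the failure mode the paper associates with robust overfitting.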
arXiv Detail & Related papers (2022-06-17T10:25:17Z) - Mixing between the Cross Entropy and the Expectation Loss Terms [89.30385901335323]
Cross entropy loss tends to focus on hard-to-classify samples during training.
We show that adding the expectation loss to the optimization goal helps the network achieve better accuracy.
Our experiments show that the new training protocol improves performance across a diverse set of classification domains.
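One common reading of the expectation loss is the expected 0-1 error under the softmax distribution, i.e. 1 - p_y; the mixed objective below is a hedged sketch under that assumption, with the mixing weight `lam` as an illustrative parameter.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def mixed_loss(logits, y, lam=0.5):
    """lam * cross-entropy + (1 - lam) * expectation loss.

    Cross entropy grows without bound on hard samples, while the
    expectation loss 1 - p_y is bounded in [0, 1), so mixing the two
    down-weights very hard-to-classify samples."""
    p = softmax(logits)
    return lam * (-np.log(p[y])) + (1.0 - lam) * (1.0 - p[y])
```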
arXiv Detail & Related papers (2021-09-12T23:14:06Z) - Adaptive perturbation adversarial training: based on reinforcement learning [9.563820241076103]
One of the shortcomings of adversarial training is that it will reduce the recognition accuracy of normal samples.
Adaptive adversarial training is proposed to alleviate this problem.
It uses marginal adversarial samples, which are close to the decision boundary but do not cross it, for adversarial training.
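A minimal sketch of such marginal samples: take loss-increasing steps but keep only iterates the model still classifies correctly, so the sample approaches the decision boundary without crossing it. The logistic model and step sizes are assumptions for illustration (the paper itself tunes the perturbation with reinforcement learning).

```python
import numpy as np

def marginal_example(x, y, w, b, alpha=0.05, steps=20):
    """Move x toward higher loss on a logistic model p = sigmoid(w.x + b),
    stopping just before the prediction flips."""
    x_adv = x.copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w @ x_adv + b)))
        grad = (p - y) * w
        candidate = x_adv + alpha * np.sign(grad)
        pred = 1.0 if (w @ candidate + b) > 0 else 0.0
        if pred != y:        # next step would cross the boundary: stop
            break
        x_adv = candidate    # still on the correct side: accept
    return x_adv

x_m = marginal_example(np.array([1.0, 0.0]), 1.0, np.array([1.0, 0.0]), 0.0)
```

Training on such samples pushes the boundary away from the data without forcing the model to fit mislabeled-looking points on the wrong side, which is how it aims to preserve clean accuracy.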
arXiv Detail & Related papers (2021-08-30T13:49:55Z) - Improving White-box Robustness of Pre-processing Defenses via Joint Adversarial Training [106.34722726264522]
A range of adversarial defense techniques have been proposed to mitigate the interference of adversarial noise.
Pre-processing methods may suffer from the robustness degradation effect.
A potential cause of this negative effect is that adversarial training examples are static and independent of the pre-processing model.
We propose a method called Joint Adversarial Training based Pre-processing (JATP) defense.
arXiv Detail & Related papers (2021-06-10T01:45:32Z) - Understanding Catastrophic Overfitting in Single-step Adversarial Training [9.560980936110234]
"catastrophic overfitting" is a phenomenon in which the robust accuracy against projected gradient descent suddenly decreases to 0% after a few epochs.
We propose a simple method that not only prevents catastrophic overfitting, but also challenges the belief that it is difficult to prevent multi-step adversarial attacks with single-step adversarial training.
arXiv Detail & Related papers (2020-10-05T06:13:35Z) - On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them [57.957466608543676]
We analyze the influence of adversarial training on the loss landscape of machine learning models.
We show that the adversarial loss landscape is less favorable to optimization, due to increased curvature and more scattered gradients.
arXiv Detail & Related papers (2020-06-15T13:50:23Z) - Precise Tradeoffs in Adversarial Training for Linear Regression [55.764306209771405]
We provide a precise and comprehensive understanding of the role of adversarial training in the context of linear regression with Gaussian features.
We precisely characterize the standard/robust accuracy and the corresponding tradeoff achieved by a contemporary mini-max adversarial training approach.
Our theory for adversarial training algorithms also facilitates the rigorous study of how a variety of factors (size and quality of training data, model overparametrization etc.) affect the tradeoff between these two competing accuracies.
arXiv Detail & Related papers (2020-02-24T19:01:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.