A3T: Accuracy Aware Adversarial Training
- URL: http://arxiv.org/abs/2211.16316v1
- Date: Tue, 29 Nov 2022 15:56:43 GMT
- Title: A3T: Accuracy Aware Adversarial Training
- Authors: Enes Altinisik, Safa Messaoud, Husrev Taha Sencar, Sanjay Chawla
- Abstract summary: We identify one cause of overfitting related to current practices of generating adversarial samples from misclassified samples.
We show that our approach achieves better generalization while having comparable robustness to state-of-the-art adversarial training methods.
- Score: 22.42867682734154
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training has been empirically shown to be more prone to overfitting than standard training. The exact underlying reasons are not yet fully understood. In this paper, we identify one cause of overfitting related to the current practice of generating adversarial samples from misclassified samples. To address this, we propose an alternative approach that leverages the misclassified samples to mitigate the overfitting problem. We show that our approach achieves better generalization while maintaining robustness comparable to state-of-the-art adversarial training methods on a wide range of computer vision, natural language processing, and tabular tasks.
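For context, the baseline such work builds on is PGD-based adversarial training. Below is a minimal PyTorch sketch of that baseline; the hyperparameters (eps, alpha, steps) and the training loop are illustrative assumptions, not the authors' A3T procedure.

```python
# Minimal PGD-based adversarial training loop (illustrative baseline, not
# the authors' A3T method). Assumes a PyTorch classifier with inputs in [0, 1].
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Generate L-inf adversarial examples with projected gradient descent."""
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)   # random start
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()           # ascend the loss
            x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)
    return x_adv.detach()

def train_epoch(model, loader, optimizer):
    model.train()
    for x, y in loader:
        x_adv = pgd_attack(model, x, y)                   # attack every sample
        optimizer.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()
        optimizer.step()
```

Note that this standard loop generates adversarial samples from every training point, including currently misclassified ones, which is the practice the abstract identifies as a source of overfitting.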
Related papers
- Generalizing to any diverse distribution: uniformity, gentle finetuning and rebalancing [55.791818510796645]
We aim to develop models that generalize well to any diverse test distribution, even if the latter deviates significantly from the training data.
Various approaches like domain adaptation, domain generalization, and robust optimization attempt to address the out-of-distribution challenge.
We adopt a more conservative perspective by accounting for the worst-case error across all sufficiently diverse test distributions within a known domain.
arXiv Detail & Related papers (2024-10-08T12:26:48Z)
- Understanding prompt engineering may not require rethinking generalization [56.38207873589642]
We show that the discrete nature of prompts, combined with a PAC-Bayes prior given by a language model, results in generalization bounds that are remarkably tight by the standards of the literature.
This work provides a possible justification for the widespread practice of prompt engineering.
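For intuition, here is a hedged sketch of the kind of bound involved: treating a discrete prompt w as a deterministic hypothesis with prior mass P_LM(w) under the language model, the classical Occam/PAC-Bayes argument for countable hypothesis classes gives, with probability at least 1 - delta over n i.i.d. samples:

```latex
% Standard discrete-hypothesis (Occam) bound; the paper's exact statement may differ.
R(w) \;\le\; \widehat{R}(w) \;+\; \sqrt{\frac{\log\frac{1}{P_{\mathrm{LM}}(w)} + \log\frac{1}{\delta}}{2n}}
```

Prompts to which the language model assigns high probability incur a small complexity term, which is why the resulting bounds can be tight.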
arXiv Detail & Related papers (2023-10-06T00:52:48Z)
- Confidence-aware Training of Smoothed Classifiers for Certified Robustness [75.95332266383417]
We use "accuracy under Gaussian noise" as an easy-to-compute proxy of adversarial robustness for an input.
Our experiments show that the proposed method consistently exhibits improved certified robustness over state-of-the-art training methods.
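A minimal sketch of how such a per-input proxy could be computed is below; the noise level and sample count are assumptions, not the paper's values.

```python
# Per-input "accuracy under Gaussian noise": the fraction of noisy copies of
# an input that the model still assigns to the true label. Noise level and
# sample count are assumed here, not taken from the paper.
import torch

@torch.no_grad()
def noise_accuracy(model, x, y, sigma=0.25, n_samples=32):
    """x: (B, ...) inputs, y: (B,) labels; returns a (B,) proxy in [0, 1]."""
    hits = torch.zeros(x.size(0), device=x.device)
    for _ in range(n_samples):
        preds = model(x + sigma * torch.randn_like(x)).argmax(dim=1)
        hits += (preds == y).float()
    return hits / n_samples
```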
arXiv Detail & Related papers (2022-12-18T03:57:12Z)
- The Enemy of My Enemy is My Friend: Exploring Inverse Adversaries for Improving Adversarial Training [72.39526433794707]
Adversarial training and its variants have been shown to be the most effective approaches to defend against adversarial examples.
We propose a novel adversarial training scheme that encourages the model to produce similar outputs for an adversarial example and its "inverse adversarial" counterpart.
Our training method achieves state-of-the-art robustness as well as natural accuracy.
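A hedged sketch of what an inverse adversarial example could look like: the mirror image of a PGD attack, found by descending rather than ascending the loss. Step sizes and budget below are illustrative assumptions.

```python
# Sketch of an "inverse adversarial" example: a bounded perturbation found by
# *descending* the loss, moving the input toward the high-confidence region
# for its true label. Hyperparameters are illustrative assumptions.
import torch
import torch.nn.functional as F

def inverse_adversary(model, x, y, eps=8/255, alpha=2/255, steps=10):
    x_inv = x.clone()
    for _ in range(steps):
        x_inv = x_inv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_inv), y)
        grad = torch.autograd.grad(loss, x_inv)[0]
        with torch.no_grad():
            x_inv = x_inv - alpha * grad.sign()           # descend the loss
            x_inv = (x + (x_inv - x).clamp(-eps, eps)).clamp(0, 1)
    return x_inv.detach()
```

A consistency term, for example a KL divergence between the model's outputs on the adversarial example and on this inverse counterpart, could then be added to the training loss to match the "similar outputs" objective described above.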
arXiv Detail & Related papers (2022-11-01T15:24:26Z)
- Adversarial Coreset Selection for Efficient Robust Training [11.510009152620666]
We show how selecting a small subset of training data provides a principled approach to reducing the time complexity of robust training.
We conduct extensive experiments to demonstrate that our approach speeds up adversarial training by 2-3 times.
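A crude, assumed illustration of the coreset idea: score each training example by its loss under a precomputed adversarial perturbation and keep only the hardest fraction. The paper uses a more principled greedy selection; this sketch is only a simplified stand-in.

```python
# Simplified stand-in for adversarial coreset selection: keep the examples
# with the highest adversarial loss for subsequent training epochs. Not the
# paper's greedy, gradient-based algorithm.
import torch
import torch.nn.functional as F

@torch.no_grad()
def select_coreset(model, x_adv, y, keep_frac=0.4):
    """x_adv: adversarial versions of the training inputs, y: labels.
    Returns indices of the keep_frac hardest examples."""
    losses = F.cross_entropy(model(x_adv), y, reduction="none")
    k = max(1, int(keep_frac * len(y)))
    return losses.topk(k).indices
```

In principle, training only on the selected subset is where the speedup comes from, at the cost of the selection overhead.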
arXiv Detail & Related papers (2022-09-13T07:37:53Z)
- Robust Textual Embedding against Word-level Adversarial Attacks [15.235449552083043]
We propose a novel robust training method, termed Fast Triplet Metric Learning (FTML).
We show that FTML significantly improves model robustness against a variety of advanced adversarial attacks.
Our work shows the great potential of improving textual robustness through robust word embeddings.
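The core ingredient is a triplet-style metric loss on the embedding space. The sketch below shows a generic triplet margin objective; how FTML actually mines triplets and schedules training is in the paper, and the names here are assumptions.

```python
# Generic triplet-margin objective on word embeddings: pull a word toward a
# semantically equivalent (e.g. adversarially substitutable) word and push it
# away from an unrelated one. Only illustrates the loss, not FTML itself.
import torch
import torch.nn.functional as F

def triplet_loss(emb, anchor_ids, pos_ids, neg_ids, margin=0.5):
    """emb: nn.Embedding; *_ids: LongTensors of word indices."""
    a, p, n = emb(anchor_ids), emb(pos_ids), emb(neg_ids)
    d_pos = (a - p).norm(dim=1)   # distance to synonym: should be small
    d_neg = (a - n).norm(dim=1)   # distance to unrelated word: should be large
    return F.relu(d_pos - d_neg + margin).mean()
```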
arXiv Detail & Related papers (2022-02-28T14:25:00Z)
- On the Impact of Hard Adversarial Instances on Overfitting in Adversarial Training [72.95029777394186]
Adversarial training is a popular method to robustify models against adversarial attacks.
We investigate this phenomenon from the perspective of training instances.
We show that the decay in generalization performance of adversarial training is a result of the model's attempt to fit hard adversarial instances.
arXiv Detail & Related papers (2021-12-14T12:19:24Z)
- Adaptive perturbation adversarial training: based on reinforcement learning [9.563820241076103]
One shortcoming of adversarial training is that it reduces the recognition accuracy of normal samples.
Adaptive adversarial training is proposed to alleviate this problem.
It trains on marginal adversarial samples that are close to the decision boundary but do not cross it.
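One simple way to obtain such marginal samples is sketched below: run PGD but freeze any example whose next step would cross the decision boundary. The paper instead adapts the perturbation with reinforcement learning, so this early-stopping rule is only an assumed illustration.

```python
# Sketch of "marginal" adversarial samples: PGD steps are applied only while
# the perturbed input is still classified correctly, so each sample ends up
# near but not across the decision boundary. Simpler stand-in for the paper's
# reinforcement-learning-based perturbation adaptation.
import torch
import torch.nn.functional as F

def marginal_pgd(model, x, y, eps=8/255, alpha=2/255, steps=10):
    x_adv = x.clone()
    for _ in range(steps):
        x_tmp = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_tmp), y)
        grad = torch.autograd.grad(loss, x_tmp)[0]
        with torch.no_grad():
            step = (x + (x_tmp + alpha * grad.sign() - x).clamp(-eps, eps)).clamp(0, 1)
            still_correct = model(step).argmax(dim=1) == y
            keep = still_correct.view(-1, *([1] * (x.dim() - 1)))
            x_adv = torch.where(keep, step, x_adv)        # freeze crossed samples
    return x_adv
```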
arXiv Detail & Related papers (2021-08-30T13:49:55Z)
- Exploring Memorization in Adversarial Training [58.38336773082818]
We investigate the memorization effect in adversarial training (AT) for promoting a deeper understanding of capacity, convergence, generalization, and especially robust overfitting.
We propose a new mitigation algorithm motivated by detailed memorization analyses.
arXiv Detail & Related papers (2021-06-03T05:39:57Z)