Related papers: New Paradigm of Adversarial Training: Releasing Accuracy-Robustness Trade-Off via Dummy Class

New Paradigm of Adversarial Training: Releasing Accuracy-Robustness Trade-Off via Dummy Class

URL: http://arxiv.org/abs/2410.12671v2
Date: Tue, 27 May 2025 02:55:37 GMT
Title: New Paradigm of Adversarial Training: Releasing Accuracy-Robustness Trade-Off via Dummy Class
Authors: Yanyun Wang, Li Liu, Zi Liang, Yi R., Fung, Qingqing Ye, Haibo Hu,
Abstract summary: Adversarial Training (AT) is one of the most effective methods to enhance the robustness of Deep Neural Networks (DNNs)<n>Existing AT methods suffer from an inherent accuracy-robustness trade-off.<n>We propose a new AT paradigm by introducing an additional dummy class for each original class.
Score: 10.937432407005124
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Adversarial Training (AT) is one of the most effective methods to enhance the robustness of Deep Neural Networks (DNNs). However, existing AT methods suffer from an inherent accuracy-robustness trade-off. Previous works have studied this issue under the current AT paradigm, but still face over 10% accuracy reduction without significant robustness improvement over simple baselines such as PGD-AT. This inherent trade-off raises a question: Whether the current AT paradigm, which assumes to learn corresponding benign and adversarial samples as the same class, inappropriately mixes clean and robust objectives that may be essentially inconsistent. In fact, our empirical results show that up to 40% of CIFAR-10 adversarial samples always fail to satisfy such an assumption across various AT methods and robust models, explicitly indicating the room for improvement of the current AT paradigm. To relax from this overstrict assumption and the tension between clean and robust learning, in this work, we propose a new AT paradigm by introducing an additional dummy class for each original class, aiming to accommodate hard adversarial samples with shifted distribution after perturbation. The robustness w.r.t. these adversarial samples can be achieved by runtime recovery from the predicted dummy classes to the corresponding original ones, without conflicting with the clean objective on accuracy of benign samples. Finally, based on our new paradigm, we propose a novel DUmmy Classes-based Adversarial Training (DUCAT) method that concurrently improves accuracy and robustness in a plug-and-play manner only relevant to logits, loss, and a proposed two-hot soft label-based supervised signal. Our method outperforms state-of-the-art (SOTA) benchmarks, effectively releasing the current trade-off. The code is available at https://github.com/FlaAI/DUCAT.

Related papers

Purify Unlearnable Examples via Rate-Constrained Variational Autoencoders [101.42201747763178]
Unlearnable examples (UEs) seek to maximize testing error by making subtle modifications to training examples that are correctly labeled. Our work provides a novel disentanglement mechanism to build an efficient pre-training purification method.
arXiv Detail & Related papers (2024-05-02T16:49:25Z)
FACTUAL: A Novel Framework for Contrastive Learning Based Robust SAR Image Classification [10.911464455072391]
FACTUAL is a Contrastive Learning framework for Adversarial Training and robust SAR classification. Our model achieves 99.7% accuracy on clean samples, and 89.6% on perturbed samples, both outperforming previous state-of-the-art methods.
arXiv Detail & Related papers (2024-04-04T06:20:22Z)
Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits approach (LORT) without the requirement of prior knowledge of the number of samples per class. Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
Adversarial Feature Alignment: Balancing Robustness and Accuracy in Deep Learning via Adversarial Training [10.099179580467737]
Adversarial training is used to mitigate this problem by increasing robustness against adversarial attacks. This approach typically reduces a model's standard accuracy on clean, non-adversarial samples. This paper proposes a novel adversarial training method called Adversarial Feature Alignment (AFA) to address these problems.
arXiv Detail & Related papers (2024-02-19T14:51:20Z)
Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious in practice. We introduce a novel noisy correspondence learning framework, namely textbfSelf-textbfReinforcing textbfErrors textbfMitigation (SREM)
arXiv Detail & Related papers (2023-12-27T09:03:43Z)
Perturbation-Invariant Adversarial Training for Neural Ranking Models: Improving the Effectiveness-Robustness Trade-Off [107.35833747750446]
adversarial examples can be crafted by adding imperceptible perturbations to legitimate documents. This vulnerability raises significant concerns about their reliability and hinders the widespread deployment of NRMs. In this study, we establish theoretical guarantees regarding the effectiveness-robustness trade-off in NRMs.
arXiv Detail & Related papers (2023-12-16T05:38:39Z)
Advancing Adversarial Robustness Through Adversarial Logit Update [10.041289551532804]
Adversarial training and adversarial purification are among the most widely recognized defense strategies. We propose a new principle, namely Adversarial Logit Update (ALU), to infer adversarial sample's labels. Our solution achieves superior performance compared to state-of-the-art methods against a wide range of adversarial attacks.
arXiv Detail & Related papers (2023-08-29T07:13:31Z)
Carefully Blending Adversarial Training and Purification Improves Adversarial Robustness [1.2289361708127877]
CARSO is able to defend itself against adaptive end-to-end white-box attacks devised for defences. Our method improves by a significant margin the state-of-the-art for CIFAR-10, CIFAR-100, and TinyImageNet-200.
arXiv Detail & Related papers (2023-05-25T09:04:31Z)
Confidence-aware Training of Smoothed Classifiers for Certified Robustness [75.95332266383417]
We use "accuracy under Gaussian noise" as an easy-to-compute proxy of adversarial robustness for an input. Our experiments show that the proposed method consistently exhibits improved certified robustness upon state-of-the-art training methods.
arXiv Detail & Related papers (2022-12-18T03:57:12Z)
Alleviating Robust Overfitting of Adversarial Training With Consistency Regularization [9.686724616328874]
Adversarial training (AT) has proven to be one of the most effective ways to defend Deep Neural Networks (DNNs) against adversarial attacks. robustness will drop sharply at a certain stage, always exists during AT. consistency regularization, a popular technique in semi-supervised learning, has a similar goal as AT and can be used to alleviate robust overfitting.
arXiv Detail & Related papers (2022-05-24T03:18:43Z)
A Unified Wasserstein Distributional Robustness Framework for Adversarial Training [24.411703133156394]
This paper presents a unified framework that connects Wasserstein distributional robustness with current state-of-the-art AT methods. We introduce a new Wasserstein cost function and a new series of risk functions, with which we show that standard AT methods are special cases of their counterparts in our framework. This connection leads to an intuitive relaxation and generalization of existing AT methods and facilitates the development of a new family of distributional robustness AT-based algorithms.
arXiv Detail & Related papers (2022-02-27T19:40:29Z)
SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness [61.212486108346695]
We propose a training scheme, coined SmoothMix, to control the robustness of smoothed classifiers via self-mixup. The proposed procedure effectively identifies over-confident, near off-class samples as a cause of limited robustness. Our experimental results demonstrate that the proposed method can significantly improve the certified $ell$-robustness of smoothed classifiers.
arXiv Detail & Related papers (2021-11-17T18:20:59Z)
Adversarial Training with Rectified Rejection [114.83821848791206]
We propose to use true confidence (T-Con) as a certainty oracle, and learn to predict T-Con by rectifying confidence. We prove that under mild conditions, a rectified confidence (R-Con) rejector and a confidence rejector can be coupled to distinguish any wrongly classified input from correctly classified ones.
arXiv Detail & Related papers (2021-05-31T08:24:53Z)
Robust Pre-Training by Adversarial Contrastive Learning [120.33706897927391]
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness. We improve robustness-aware self-supervised pre-training by learning representations consistent under both data augmentations and adversarial perturbations.
arXiv Detail & Related papers (2020-10-26T04:44:43Z)
Once-for-All Adversarial Training: In-Situ Tradeoff between Robustness and Accuracy for Free [115.81899803240758]
Adversarial training and its many variants substantially improve deep network robustness, yet at the cost of compromising standard accuracy. This paper asks how to quickly calibrate a trained model in-situ, to examine the achievable trade-offs between its standard and robust accuracies. Our proposed framework, Once-for-all Adversarial Training (OAT), is built on an innovative model-conditional training framework.
arXiv Detail & Related papers (2020-10-22T16:06:34Z)
Boosting Adversarial Training with Hypersphere Embedding [53.75693100495097]
Adversarial training is one of the most effective defenses against adversarial attacks for deep learning models. In this work, we advocate incorporating the hypersphere embedding mechanism into the AT procedure. We validate our methods under a wide range of adversarial attacks on the CIFAR-10 and ImageNet datasets.
arXiv Detail & Related papers (2020-02-20T08:42:29Z)

This list is automatically generated from the titles and abstracts of the papers in this site.