Label Smoothing and Adversarial Robustness
- URL: http://arxiv.org/abs/2009.08233v1
- Date: Thu, 17 Sep 2020 12:36:35 GMT
- Title: Label Smoothing and Adversarial Robustness
- Authors: Chaohao Fu, Hongbin Chen, Na Ruan, Weijia Jia
- Abstract summary: We find that training a model with label smoothing can easily achieve strikingly high accuracy under most gradient-based attacks.
Our study prompts the research community to rethink how to evaluate a model's robustness appropriately.
- Score: 16.804200102767208
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies indicate that current adversarial attack methods are flawed
and prone to failure when they encounter a deliberately designed defense.
Sometimes even a slight modification of the model details invalidates the
attack. We find that training a model with label smoothing can easily achieve
strikingly high accuracy under most gradient-based attacks. For instance, the
robust accuracy of a WideResNet model trained with label smoothing on CIFAR-10
reaches up to 75% under PGD attack. To understand the reason behind this
subtle robustness, we investigate the relationship between label smoothing and
adversarial robustness, through theoretical analysis of the characteristics
of networks trained with label smoothing and experimental verification of
their performance under various attacks. We demonstrate that the robustness
produced by label smoothing is incomplete: its defense effect is volatile,
and it cannot defend against attacks transferred from a naturally trained
model. Our study prompts the research community to rethink how to evaluate
a model's robustness appropriately.
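To make the setup concrete, here is a minimal PyTorch sketch of the two ingredients the abstract names: a label-smoothed cross-entropy loss and a standard L-infinity PGD attack. The smoothing factor eps=0.1 and the attack budget (8/255, 20 steps) are common CIFAR-10 defaults, not values taken from the paper.

```python
import torch
import torch.nn.functional as F

def label_smoothing_loss(logits, labels, eps=0.1):
    """Cross-entropy against smoothed targets: (1 - eps) on the true class,
    eps spread uniformly over all classes (the same convention as PyTorch's
    CrossEntropyLoss(label_smoothing=eps))."""
    num_classes = logits.size(-1)
    log_probs = F.log_softmax(logits, dim=-1)
    one_hot = F.one_hot(labels, num_classes).float()
    targets = one_hot * (1.0 - eps) + eps / num_classes
    return -(targets * log_probs).sum(dim=-1).mean()

def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=20):
    """Standard L-infinity PGD: random start inside the eps-ball, then
    iterated signed-gradient ascent on the cross-entropy loss, projecting
    back into the ball and the valid pixel range after every step."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1).detach()
    return x_adv
```

Robust accuracy is then the fraction of pgd_attack outputs that the model still classifies correctly. The paper's caveat is that a high number here can reflect gradient masking rather than genuine robustness: the same label-smoothed model still falls to adversarial examples transferred from a naturally trained copy, which this white-box loop never measures.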
Related papers
- FACTUAL: A Novel Framework for Contrastive Learning Based Robust SAR Image Classification [10.911464455072391]
FACTUAL is a contrastive learning framework for adversarial training and robust SAR image classification.
Our model achieves 99.7% accuracy on clean samples, and 89.6% on perturbed samples, both outperforming previous state-of-the-art methods.
arXiv Detail & Related papers (2024-04-04T06:20:22Z)
- In and Out-of-Domain Text Adversarial Robustness via Label Smoothing [64.66809713499576]
We study the adversarial robustness provided by various label smoothing strategies in foundational models for diverse NLP tasks.
Our experiments show that label smoothing significantly improves adversarial robustness in pre-trained models like BERT, against various popular attacks.
We also analyze the relationship between prediction confidence and robustness, showing that label smoothing reduces over-confident errors on adversarial examples.
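As a concrete illustration of this strategy (the paper itself is linked just below), here is a minimal sketch of label smoothing applied to a BERT classification loss. The model name "bert-base-uncased" and eps=0.1 are illustrative assumptions, not the paper's exact setup; Hugging Face's Trainer exposes the same knob as TrainingArguments(label_smoothing_factor=...).

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Illustrative model choice; the paper studies several pre-trained models.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

batch = tok(["a clean sentence", "a perturbed sentence"],
            padding=True, return_tensors="pt")
labels = torch.tensor([0, 1])
logits = model(**batch).logits

# Label smoothing has been built into PyTorch's cross-entropy since v1.10.
loss = torch.nn.CrossEntropyLoss(label_smoothing=0.1)(logits, labels)
loss.backward()
```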
arXiv Detail & Related papers (2022-12-20T14:06:50Z)
- Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
A standard approach to adversarial robustness assumes a framework for defending against samples crafted by minimally perturbing a clean sample.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariance-based perturbations in our framework improves defense against both invariance and sensitivity attacks.
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
- FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning [66.56240101249803]
We study how hardening benign clients can affect the global model (and the malicious clients).
We propose a trigger reverse-engineering based defense and show that our method achieves robustness improvements with provable guarantees.
Our results on eight competing SOTA defense methods show the empirical superiority of our method on both single-shot and continuous FL backdoor attacks.
arXiv Detail & Related papers (2022-10-23T22:24:03Z)
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA) to generate features under arbitrary attack strengths.
Our method is trained to automatically align features across attack strengths.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- Evaluating the Robustness of Geometry-Aware Instance-Reweighted Adversarial Training [9.351384969104771]
We evaluate the robustness of a method called "Geometry-aware Instance-reweighted Adversarial Training".
We find that this method biases the model towards certain samples by re-scaling the loss.
arXiv Detail & Related papers (2021-03-02T18:15:42Z)
- How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
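For context on what this attack degrades, here is a minimal sketch of the standard randomized smoothing prediction rule (the certified defense being targeted): classify many Gaussian-noised copies of the input and take a majority vote. The sigma and sample count are illustrative, and the statistical test used for certification is omitted.

```python
import torch

def smoothed_predict(base_model, x, sigma=0.25, n_samples=100, num_classes=10):
    """Predict with the smoothed classifier: majority vote of the base model
    over Gaussian perturbations of a single input x of shape (C, H, W)."""
    counts = torch.zeros(num_classes, dtype=torch.long)
    with torch.no_grad():
        for _ in range(n_samples):
            noisy = x + sigma * torch.randn_like(x)
            pred = base_model(noisy.unsqueeze(0)).argmax(dim=-1).item()
            counts[pred] += 1
    return counts.argmax().item()
```

The certified radius grows with the vote margin of the top class, which is exactly the quantity a training-set poisoning attack can shrink without touching test-time inputs.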
arXiv Detail & Related papers (2020-12-02T15:30:21Z)
- Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions.
We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples.
We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
arXiv Detail & Related papers (2020-06-13T08:24:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.