On visual self-supervision and its effect on model robustness
- URL: http://arxiv.org/abs/2112.04367v1
- Date: Wed, 8 Dec 2021 16:22:02 GMT
- Title: On visual self-supervision and its effect on model robustness
- Authors: Michal Kucer, Diane Oyen, Garrett Kenyon
- Abstract summary: Self-supervision can indeed improve model robustness; however, it turns out the devil is in the details.
Although self-supervised pre-training yields benefits in improving adversarial training, we observe no benefit in model robustness or accuracy if self-supervision is incorporated into adversarial training.
- Score: 9.313899406300644
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent self-supervision methods have found success in learning feature
representations that can rival those learned with full supervision, and have been
shown to benefit models in several ways: for example, by improving model robustness
and out-of-distribution detection. In this paper, we conduct an empirical study to
understand more precisely how self-supervised learning, whether as a pre-training
technique or as part of adversarial training, affects model robustness to $l_2$ and
$l_{\infty}$ adversarial perturbations and natural image corruptions. Self-supervision
can indeed improve model robustness; however, it turns out the devil is in the
details. If one simply adds a self-supervision loss in tandem with adversarial
training, one sees improved model accuracy under adversarial perturbations smaller
than or comparable to the $\epsilon_{train}$ with which the robust model is trained.
However, for $\epsilon_{test} \ge \epsilon_{train}$ the model accuracy drops, and in
fact the larger the weight of the self-supervision loss, the larger the drop in
performance, i.e. the more the robustness of the model is harmed. We identify the
primary ways in which self-supervision can be added to adversarial training, and
observe that using a self-supervised loss both to optimize the network parameters and
to find adversarial examples leads to the strongest improvement in model robustness,
as this can be viewed as a form of ensemble adversarial training. Although
self-supervised pre-training yields benefits for adversarial training compared to
random weight initialization, we observe no benefit in model robustness or accuracy
when self-supervision is instead incorporated into the adversarial training itself.
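To make the two ways of adding self-supervision to adversarial training concrete, below is a minimal PyTorch-style sketch; it illustrates the setup the abstract describes and is not the authors' code. Rotation prediction stands in for the pretext task, and backbone, cls_head, rot_head, lam, and attack_lam are hypothetical names. Setting attack_lam = 0 gives the "in tandem" variant (self-supervision only in the parameter updates); attack_lam > 0 also uses the self-supervised loss to find the adversarial examples, the variant the abstract likens to ensemble adversarial training.

```python
import torch
import torch.nn.functional as F

def rotate_batch(x):
    # Rotate each image by a random multiple of 90 degrees; the rotation
    # index serves as the label of the pretext task.
    k = torch.randint(0, 4, (x.size(0),), device=x.device)
    xr = torch.stack([torch.rot90(img, int(r), dims=(1, 2))
                      for img, r in zip(x, k)])
    return xr, k

def combined_loss(backbone, cls_head, rot_head, x, y, lam):
    # Supervised classification loss plus a weighted self-supervision loss.
    loss = F.cross_entropy(cls_head(backbone(x)), y)
    xr, yr = rotate_batch(x)
    return loss + lam * F.cross_entropy(rot_head(backbone(xr)), yr)

def pgd_attack(backbone, cls_head, rot_head, x, y, eps, alpha, steps, lam):
    # l_inf PGD; with lam > 0 the adversarial examples are found against
    # the classification and self-supervision losses jointly.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = combined_loss(backbone, cls_head, rot_head, x_adv, y, lam)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)
        x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()

def train_step(backbone, cls_head, rot_head, opt, x, y,
               eps=8 / 255, alpha=2 / 255, steps=7, lam=0.5, attack_lam=0.5):
    x_adv = pgd_attack(backbone, cls_head, rot_head, x, y,
                       eps, alpha, steps, attack_lam)
    opt.zero_grad()
    loss = combined_loss(backbone, cls_head, rot_head, x_adv, y, lam)
    loss.backward()
    opt.step()
    return loss.item()
```

On the abstract's account, the attack_lam > 0 setting gave the strongest robustness gains, while in the tandem setting a larger lam enlarged the accuracy drop at $\epsilon_{test} \ge \epsilon_{train}$.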
Related papers
- Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study [61.65123150513683]
Multimodal foundation models, such as CLIP, produce state-of-the-art zero-shot results.
It is reported that these models close the robustness gap by matching the performance of supervised models trained on ImageNet.
We show that CLIP leads to a significant robustness drop compared to supervised ImageNet models on our benchmark.
arXiv Detail & Related papers (2024-03-15T17:33:49Z)
- Robustness-Congruent Adversarial Training for Secure Machine Learning Model Updates [13.911586916369108]
We show that updating machine-learning models can cause previously well-classified samples to be misclassified, degrading robustness to adversarial examples.
We propose a technique, named robustness-congruent adversarial training, to address this issue.
We show that our algorithm and, more generally, learning with non-regression constraints, provide a theoretically-grounded framework to train consistent estimators.
arXiv Detail & Related papers (2024-02-27T10:37:13Z)
- Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness [52.9493817508055]
We propose Pre-trained Model Guided Adversarial Fine-Tuning (PMG-AFT) to enhance the model's zero-shot adversarial robustness.
Our approach consistently improves clean accuracy by an average of 8.72%.
arXiv Detail & Related papers (2024-01-09T04:33:03Z)
- Effective Robustness against Natural Distribution Shifts for Models with Different Training Data [113.21868839569]
"Effective robustness" measures the extra out-of-distribution robustness beyond what can be predicted from the in-distribution (ID) performance.
We propose a new evaluation metric to evaluate and compare the effective robustness of models trained on different data.
arXiv Detail & Related papers (2023-02-02T19:28:41Z)
- Explicit Tradeoffs between Adversarial and Natural Distributional Robustness [48.44639585732391]
In practice, models need both adversarial and natural distributional robustness to be reliable.
In this work, we show that in fact, explicit tradeoffs exist between adversarial and natural distributional robustness.
arXiv Detail & Related papers (2022-09-15T19:58:01Z)
- Self-Ensemble Adversarial Training for Improved Robustness [14.244311026737666]
Among all sorts of defense methods, adversarial training is the strongest strategy against various adversarial attacks.
Recent works mainly focus on developing new loss functions or regularizers, attempting to find the unique optimal point in the weight space.
We devise a simple but powerful Self-Ensemble Adversarial Training (SEAT) method that yields a robust classifier by averaging the weights of history models.
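A minimal sketch of this weight-averaging idea, assuming a standard exponential moving average over training checkpoints (SEAT's exact averaging scheme may differ):

```python
import copy
import torch

@torch.no_grad()
def update_avg(avg_model, model, decay=0.999):
    # Fold the current weights into a running average of history models.
    for p_avg, p in zip(avg_model.parameters(), model.parameters()):
        p_avg.mul_(decay).add_(p, alpha=1.0 - decay)
    for b_avg, b in zip(avg_model.buffers(), model.buffers()):
        b_avg.copy_(b)  # keep, e.g., BatchNorm statistics current

# Illustrative usage inside an adversarial training loop:
#   avg_model = copy.deepcopy(model).requires_grad_(False)
#   after every optimizer step: update_avg(avg_model, model)
#   evaluate and deploy avg_model rather than model
```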
arXiv Detail & Related papers (2022-03-18T01:12:18Z)
- Mutual Adversarial Training: Learning together is better than going alone [82.78852509965547]
We study how interactions among models affect robustness via knowledge distillation.
We propose mutual adversarial training (MAT) in which multiple models are trained together.
MAT can effectively improve model robustness and outperform state-of-the-art methods under white-box attacks.
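One plausible reading of this setup, sketched below under our own assumptions: each model trains on its own adversarial examples while distilling from its peers' soft predictions. The names models, opts, beta, and the attack helper make_adv (e.g., a PGD routine returning detached adversarial inputs) are illustrative, not the authors' API.

```python
import torch
import torch.nn.functional as F

def mat_step(models, opts, x, y, make_adv, beta=1.0):
    # Each model's logits on its own adversarial examples.
    adv_logits = [m(make_adv(m, x, y)) for m in models]
    for i, opt in enumerate(opts):
        # Supervised loss on adversarial inputs ...
        loss = F.cross_entropy(adv_logits[i], y)
        # ... plus distillation toward each peer's (detached) predictions.
        for j, peer in enumerate(adv_logits):
            if j != i:
                loss = loss + beta * F.kl_div(
                    F.log_softmax(adv_logits[i], dim=1),
                    F.softmax(peer.detach(), dim=1),
                    reduction="batchmean",
                )
        opt.zero_grad()
        loss.backward()
        opt.step()
```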
arXiv Detail & Related papers (2021-12-09T15:59:42Z)
- Understanding the Logit Distributions of Adversarially-Trained Deep Neural Networks [6.439477789066243]
Adversarial defenses train deep neural networks to be invariant to the input perturbations from adversarial attacks.
Although adversarial training is successful at mitigating adversarial attacks, the behavioral differences between adversarially-trained (AT) models and standard models are still poorly understood.
We identify three logit characteristics essential to learning adversarial robustness.
arXiv Detail & Related papers (2021-08-26T19:09:15Z)
- Robust Pre-Training by Adversarial Contrastive Learning [120.33706897927391]
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness.
We improve robustness-aware self-supervised pre-training by learning representations consistent under both data augmentations and adversarial perturbations.
arXiv Detail & Related papers (2020-10-26T04:44:43Z)
- On the Generalization Properties of Adversarial Training [21.79888306754263]
This paper studies the generalization performance of a generic adversarial training algorithm.
A series of numerical studies demonstrates how smoothness and $L_1$ penalization help improve the adversarial robustness of models.
arXiv Detail & Related papers (2020-08-15T02:32:09Z)