Pretrained Transformers Do not Always Improve Robustness
- URL: http://arxiv.org/abs/2210.07663v1
- Date: Fri, 14 Oct 2022 09:30:36 GMT
- Title: Pretrained Transformers Do not Always Improve Robustness
- Authors: Swaroop Mishra, Bhavdeep Singh Sachdeva, Chitta Baral
- Abstract summary: We show that PT provide less robust representations than traditional models on exposure to noisy data.
We augment PT with an adversarial filtering mechanism that has been shown to improve OOD generalization.
However, an increase in generalization does not necessarily increase robustness, as we find that noisy data fools the AF method powered by PT.
- Score: 23.227505403565903
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pretrained Transformers (PT) have been shown to improve Out-of-Distribution
(OOD) robustness over traditional models such as Bag of Words (BOW), LSTMs, and
Convolutional Neural Networks (CNN) powered by Word2Vec and GloVe embeddings.
How does this robustness comparison hold in a real-world setting where some part
of the dataset can be noisy? Do PT also provide more robust representations than
traditional models on exposure to noisy data? We perform a comparative study on
10 models and find empirical evidence that PT provide less robust
representations than traditional models on exposure to noisy data. We
investigate further and augment PT with an adversarial filtering (AF) mechanism
that has been shown to improve OOD generalization. However, an increase in
generalization does not necessarily increase robustness, as we find that noisy
data fools the AF method powered by PT.
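As a rough illustration of the evaluation setup described here (the noise model and the `model.predict` classifier API below are assumptions, not the authors' exact protocol), a noisy-data robustness probe might look like:
```python
import random

def add_char_noise(text: str, rate: float = 0.1, seed: int = 0) -> str:
    """Swap adjacent characters at random; one illustrative noise model."""
    rng = random.Random(seed)
    chars = list(text)
    for i in range(len(chars) - 1):
        if rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def robustness_drop(model, texts, labels, rate=0.1):
    """Clean accuracy minus noisy accuracy; `model.predict` is a hypothetical
    classifier API standing in for each of the compared models."""
    noisy = [add_char_noise(t, rate, seed=i) for i, t in enumerate(texts)]
    def acc(xs):
        preds = model.predict(xs)
        return sum(p == y for p, y in zip(preds, labels)) / len(labels)
    return acc(texts) - acc(noisy)
```
A smaller drop under increasing noise rates would indicate a more robust representation in this sketch's terms.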
Related papers
- Robust VAEs via Generating Process of Noise Augmented Data [9.366139389037489]
This paper introduces a novel framework that enhances robustness by regularizing the latent space divergence between original and noise-augmented data.
Our empirical evaluations demonstrate that this approach, termed Robust Augmented Variational Auto-ENcoder (RAVEN), yields superior performance in resisting adversarial inputs.
arXiv Detail & Related papers (2024-07-26T09:55:34Z)
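A minimal sketch of the latent-divergence regularizer this entry describes, assuming a Gaussian VAE encoder that returns (mu, logvar); the function names and weighting are illustrative, not RAVEN's published formulation:
```python
import torch
import torch.nn.functional as F

def raven_style_loss(encoder, decoder, x, noise_std=0.1, beta=1.0):
    """VAE reconstruction loss plus a penalty on the divergence between the
    latent posteriors of clean and noise-augmented inputs (illustrative)."""
    x_noisy = x + noise_std * torch.randn_like(x)
    mu, logvar = encoder(x)            # assumed Gaussian posterior parameters
    mu_n, logvar_n = encoder(x_noisy)
    z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterize
    recon = F.mse_loss(decoder(z), x)
    # KL( q(z|x) || q(z|x_noisy) ) for diagonal Gaussians
    kl = 0.5 * (logvar_n - logvar
                + (logvar.exp() + (mu - mu_n) ** 2) / logvar_n.exp()
                - 1).sum(dim=-1).mean()
    return recon + beta * kl
```
- PUMA: margin-based data pruning [51.12154122266251]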
We focus on data pruning, where some training samples are removed based on their distance to the model's classification boundary (i.e., the margin).
We propose PUMA, a new data pruning strategy that computes the margin using DeepFool.
We show that PUMA can be used on top of the current state-of-the-art robustness methodology and, unlike existing data pruning strategies, significantly improves model performance.
arXiv Detail & Related papers (2024-05-10T08:02:20Z)
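A hedged sketch of margin-based pruning, using the gap between the top two logits as a cheap stand-in for the DeepFool-computed margin PUMA actually uses:
```python
import numpy as np

def margin_prune(logits: np.ndarray, keep_fraction: float = 0.9) -> np.ndarray:
    """Rank samples by the gap between their two largest logits, a cheap
    proxy for distance to the decision boundary, and keep the smallest-margin
    (hardest) ones. PUMA computes true margins with DeepFool instead."""
    top2 = np.sort(logits, axis=1)[:, -2:]
    margins = top2[:, 1] - top2[:, 0]
    keep = int(len(margins) * keep_fraction)
    return np.argsort(margins)[:keep]       # indices of retained samples
```
- Did Translation Models Get More Robust Without Anyone Even Noticing? [11.342084260983668]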
We show that multilingual MT models and large language models (LLMs) are far more robust to many kinds of noise than previous models.
Similar trends hold for social media translation experiments: LLMs are more robust to social media text.
arXiv Detail & Related papers (2024-03-06T18:33:51Z)
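One illustrative source-side noise model of the kind such robustness studies apply before translating (the specific perturbations and rates here are assumptions):
```python
import random

def perturb_source(sentence: str, p: float = 0.05, seed: int = 0) -> str:
    """Apply character-level deletions, duplications, and case flips to a
    source sentence before translation (rates and operations are assumed)."""
    rng = random.Random(seed)
    out = []
    for ch in sentence:
        r = rng.random()
        if r < p:                 # delete the character
            continue
        elif r < 2 * p:           # duplicate it
            out.append(ch)
        elif r < 3 * p:           # flip its case
            ch = ch.swapcase()
        out.append(ch)
    return "".join(out)
```
- Fourier-basis Functions to Bridge Augmentation Gap: Rethinking Frequency Augmentation in Image Classification [3.129187821625805]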
Auxiliary Fourier-basis Augmentation (AFA) is a technique targeting augmentation in the frequency domain and filling the augmentation gap left by visual augmentations.
Our results show that AFA benefits the robustness of models against common corruptions, OOD generalization, and the consistency of model performance under increasing perturbations, with negligible deficit to standard performance.
arXiv Detail & Related papers (2024-03-04T11:30:02Z)
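A minimal sketch of frequency-domain augmentation, adding a single random Fourier basis function to an image; this is an assumption-laden illustration, not AFA's exact recipe:
```python
import numpy as np

def fourier_basis_augment(img: np.ndarray, strength: float = 0.1,
                          seed: int = 0) -> np.ndarray:
    """Add one random low-frequency planar sinusoid to an H x W x C float
    image in [0, 1]; a single Fourier basis function, chosen for simplicity."""
    rng = np.random.default_rng(seed)
    h, w = img.shape[:2]
    fy, fx = rng.integers(1, 8, size=2)     # random low spatial frequencies
    phase = rng.uniform(0, 2 * np.pi)
    yy, xx = np.mgrid[0:h, 0:w]
    wave = np.sin(2 * np.pi * (fy * yy / h + fx * xx / w) + phase)
    return np.clip(img + strength * wave[..., None], 0.0, 1.0)
```
- A Curious Case of Searching for the Correlation between Training Data and Adversarial Robustness of Transformer Textual Models [11.938237087895649]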
Existing works have shown that fine-tuned textual transformer models achieve state-of-the-art prediction performances but are also vulnerable to adversarial text perturbations.
In this paper, we show that there is also a strong correlation between training data and model robustness.
We extract 13 different features representing a wide range of input fine-tuning corpora properties and use them to predict the adversarial robustness of the fine-tuned models.
arXiv Detail & Related papers (2024-02-18T05:58:25Z)
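A sketch of the prediction setup this entry describes, with placeholder features and scores standing in for the paper's 13 corpus properties and measured robustness (random values here exist purely to make the sketch runnable):
```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

# Hypothetical setup: one row per fine-tuned model, 13 columns of corpus
# statistics (the actual 13 features are defined in the paper, not here);
# y holds each model's measured robustness under attack.
rng = np.random.default_rng(0)
X = rng.random((40, 13))                 # placeholder corpus features
y = rng.random(40)                       # placeholder robustness scores
scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=5, scoring="r2")
print(f"cross-validated R^2 of the robustness predictor: {scores.mean():.3f}")
```
- Certified Adversarial Defenses Meet Out-of-Distribution Corruptions: Benchmarking Robustness and Simple Baselines [65.0803400763215]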
This work critically examines how adversarial robustness guarantees change when state-of-the-art certifiably robust models encounter out-of-distribution data.
We propose a novel data augmentation scheme, FourierMix, that produces augmentations to improve the spectral coverage of the training data.
We find that FourierMix augmentations help eliminate the spectral bias of certifiably robust models, enabling them to achieve significantly better robustness guarantees on a range of OOD benchmarks.
arXiv Detail & Related papers (2021-12-01T17:11:22Z)
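One way to broaden the spectral coverage of training data is to blend the amplitude spectra of two images; this is offered as a sketch under stated assumptions, not FourierMix's published scheme:
```python
import numpy as np

def spectral_mix(img_a: np.ndarray, img_b: np.ndarray,
                 lam: float = 0.5) -> np.ndarray:
    """Blend the Fourier amplitude spectra of two images while keeping the
    phase of img_a, yielding a sample with broader spectral content."""
    fa = np.fft.fft2(img_a, axes=(0, 1))
    fb = np.fft.fft2(img_b, axes=(0, 1))
    amp = lam * np.abs(fa) + (1 - lam) * np.abs(fb)
    mixed = amp * np.exp(1j * np.angle(fa))
    return np.real(np.fft.ifft2(mixed, axes=(0, 1)))
```
- Enhancing Adversarial Robustness via Test-time Transformation Ensembling [51.51139269928358]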
We show how equipping models with Test-time Transformation Ensembling (TTE) can work as a reliable defense against adversarial attacks.
We show that TTE consistently improves model robustness against a variety of powerful attacks without any need for re-training.
arXiv Detail & Related papers (2021-07-29T15:32:35Z)
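A minimal sketch of test-time transformation ensembling, averaging softmax outputs over a small, assumed set of input views (TTE's actual transformation set is larger):
```python
import torch

@torch.no_grad()
def tte_predict(model, x: torch.Tensor) -> torch.Tensor:
    """Average softmax outputs over input views (identity and horizontal
    flip here); the defended prediction is the ensemble mean."""
    views = [x, torch.flip(x, dims=[-1])]   # NCHW input, flip along width
    probs = [model(v).softmax(dim=-1) for v in views]
    return torch.stack(probs).mean(dim=0)
```
Because no retraining is involved, this kind of ensembling can wrap any existing classifier.
- On the Adversarial Robustness of Visual Transformers [129.29523847765952]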
This work provides the first comprehensive study on the robustness of vision transformers (ViTs) against adversarial perturbations.
Tested in various white-box and transfer attack settings, we find that ViTs possess better adversarial robustness compared with convolutional neural networks (CNNs).
arXiv Detail & Related papers (2021-03-29T14:48:24Z)
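As an illustration of the kind of white-box robustness probe such comparative studies run (the paper itself uses stronger attacks than this one-step FGSM sketch):
```python
import torch
import torch.nn.functional as F

def fgsm_accuracy(model, x, y, eps=4 / 255):
    """Accuracy under a one-step FGSM perturbation of size eps; inputs are
    assumed to be images scaled to [0, 1]."""
    x = x.clone().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    x_adv = (x + eps * x.grad.sign()).clamp(0, 1).detach()
    with torch.no_grad():
        return (model(x_adv).argmax(dim=-1) == y).float().mean().item()
```
- From Sound Representation to Model Robustness [82.21746840893658]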
We investigate the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial attack robustness of a victim residual convolutional neural network.
Averaged over various experiments on three environmental sound datasets, we found the ResNet-18 model outperforms other deep learning architectures.
arXiv Detail & Related papers (2020-07-27T17:30:49Z)
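A sketch of one standard spectrogram representation such studies compare; the window, hop, and scaling choices here are assumptions:
```python
import numpy as np

def log_spectrogram(wave: np.ndarray, n_fft: int = 512,
                    hop: int = 128) -> np.ndarray:
    """Hann-windowed STFT magnitude in decibels, one common spectrogram
    variant (mel and MFCC representations are typical alternatives)."""
    window = np.hanning(n_fft)
    frames = [wave[i:i + n_fft] * window
              for i in range(0, len(wave) - n_fft, hop)]
    spec = np.abs(np.fft.rfft(np.stack(frames), axis=-1))
    return 20 * np.log10(spec + 1e-8)
```
- Pretrained Transformers Improve Out-of-Distribution Robustness [72.38747394482247]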
We measure out-of-distribution generalization for seven NLP datasets.
We show that pretrained Transformers' performance declines are substantially smaller.
We examine which factors affect robustness, finding that larger models are not necessarily more robust.
arXiv Detail & Related papers (2020-04-13T17:58:56Z)
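A minimal sketch of the per-dataset measurement this entry reports, assuming a hypothetical `model.predict` API over (text, label) pairs:
```python
def ood_gap(model, id_data, ood_data):
    """In-distribution accuracy minus out-of-distribution accuracy; a smaller
    gap means a smaller performance decline, i.e., better OOD robustness."""
    def accuracy(pairs):
        texts, labels = zip(*pairs)
        preds = model.predict(list(texts))  # hypothetical classifier API
        return sum(p == y for p, y in zip(preds, labels)) / len(labels)
    return accuracy(id_data) - accuracy(ood_data)
```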
This list is automatically generated from the titles and abstracts of the papers on this site.