How Does Adversarial Fine-Tuning Benefit BERT?
- URL: http://arxiv.org/abs/2108.13602v1
- Date: Tue, 31 Aug 2021 03:39:06 GMT
- Title: How Does Adversarial Fine-Tuning Benefit BERT?
- Authors: Javid Ebrahimi, Hao Yang, Wei Zhang
- Abstract summary: Adversarial training is one of the most reliable methods for defending against adversarial attacks in machine learning.
We show that adversarially fine-tuned models remain more faithful to BERT's language modeling behavior and are more sensitive to the word order.
Our analysis demonstrates that vanilla fine-tuning oversimplifies the sentence representation by focusing heavily on one or a few label-indicative words.
- Score: 16.57274211257757
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adversarial training (AT) is one of the most reliable methods for defending
against adversarial attacks in machine learning. Variants of this method have
been used as regularization mechanisms to achieve SOTA results on NLP
benchmarks, and they have been found to be useful for transfer learning and
continual learning. We search for the reasons for the effectiveness of AT by
contrasting vanilla and adversarially fine-tuned BERT models. We identify
partial preservation of BERT's syntactic abilities during fine-tuning as the
key to the success of AT. We observe that adversarially fine-tuned models
remain more faithful to BERT's language modeling behavior and are more
sensitive to the word order. As concrete examples of syntactic abilities, an
adversarially fine-tuned model could have an advantage of up to 38% on anaphora
agreement and up to 11% on dependency parsing. Our analysis demonstrates that
vanilla fine-tuning oversimplifies the sentence representation by focusing
heavily on one or a few label-indicative words. AT, however, moderates the
effect of these influential words and encourages representational diversity.
This allows for a more hierarchical representation of a sentence and leads to
the mitigation of BERT's loss of syntactic abilities.
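The paper contrasts vanilla and adversarially fine-tuned BERT rather than prescribing one AT recipe, so the sketch below is illustrative only: it applies an FGM-style perturbation to BERT's word embeddings during fine-tuning. The model name, perturbation size, and training loop are assumptions for the sketch, not the paper's exact setup.

```python
# Minimal sketch of embedding-space adversarial fine-tuning (FGM-style).
# Assumed for illustration: bert-base-uncased, a 2-class head, epsilon = 1e-2;
# the paper's exact AT variant may differ.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
epsilon = 1e-2  # perturbation size (assumed)


def train_step(texts, labels):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    labels = torch.tensor(labels)

    # 1) clean forward/backward to obtain gradients on the word embeddings
    optimizer.zero_grad()
    clean_loss = model(**batch, labels=labels).loss
    clean_loss.backward()

    # 2) perturb the embedding matrix in the direction of its gradient
    emb = model.bert.embeddings.word_embeddings.weight
    grad = emb.grad.detach()
    delta = epsilon * grad / (grad.norm() + 1e-12)
    emb.data.add_(delta)

    # 3) adversarial forward/backward, then restore the original embeddings
    adv_loss = model(**batch, labels=labels).loss
    adv_loss.backward()
    emb.data.sub_(delta)

    optimizer.step()  # gradients from both passes are accumulated
    return clean_loss.item(), adv_loss.item()
```

Both passes contribute gradients before the optimizer step, so the adversarial term acts as a regularizer on top of ordinary fine-tuning rather than replacing it.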
Related papers
- In and Out-of-Domain Text Adversarial Robustness via Label Smoothing [64.66809713499576]
We study the adversarial robustness provided by various label smoothing strategies in foundational models for diverse NLP tasks.
Our experiments show that label smoothing significantly improves adversarial robustness in pre-trained models like BERT, against various popular attacks.
We also analyze the relationship between prediction confidence and robustness, showing that label smoothing reduces over-confident errors on adversarial examples.
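Label smoothing itself is a one-line change in most frameworks; a minimal PyTorch illustration with an assumed smoothing value of 0.1:

```python
import torch

# Cross-entropy with smoothed targets: the gold class keeps 1 - alpha of the
# probability mass and the remainder is spread over the other classes.
criterion = torch.nn.CrossEntropyLoss(label_smoothing=0.1)  # alpha = 0.1 (assumed)

logits = torch.randn(4, 3)           # classifier outputs for a batch of 4
labels = torch.tensor([0, 2, 1, 0])  # gold labels
loss = criterion(logits, labels)
```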
arXiv Detail & Related papers (2022-12-20T14:06:50Z)
- Subject Verb Agreement Error Patterns in Meaningless Sentences: Humans vs. BERT [64.40111510974957]
We test whether meaning interferes with subject-verb number agreement in English.
We generate semantically well-formed and nonsensical items.
We find that BERT and humans are both sensitive to our semantic manipulation.
arXiv Detail & Related papers (2022-09-21T17:57:23Z)
- Does BERT really agree ? Fine-grained Analysis of Lexical Dependence on a Syntactic Task [70.29624135819884]
We study the extent to which BERT is able to perform lexically-independent subject-verb number agreement (NA) on targeted syntactic templates.
Our results on nonce sentences suggest that the model generalizes well for simple templates, but fails to perform lexically-independent syntactic generalization when as little as one attractor is present.
arXiv Detail & Related papers (2022-04-14T11:33:15Z)
- How Should Pre-Trained Language Models Be Fine-Tuned Towards Adversarial Robustness? [121.57551065856164]
We propose Robust Informative Fine-Tuning (RIFT) as a novel adversarial fine-tuning method from an information-theoretical perspective.
RIFT encourages an objective model to retain the features learned from the pre-trained model throughout the entire fine-tuning process.
Experimental results show that RIFT consistently outperforms state-of-the-art methods on two popular NLP tasks.
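RIFT's information-theoretical objective is not spelled out in this summary and is not reproduced below; as a loose stand-in for the idea of retaining pre-trained features, the sketch adds a simple cosine penalty that keeps the fine-tuned [CLS] representation close to that of a frozen pre-trained copy. The penalty and its weight are assumptions, not RIFT's actual loss.

```python
# Illustrative feature-retention regularizer (NOT RIFT's actual objective):
# keep the fine-tuned [CLS] vector close to a frozen pre-trained copy.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
finetuned = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
pretrained = AutoModel.from_pretrained("bert-base-uncased")  # frozen reference
for p in pretrained.parameters():
    p.requires_grad_(False)

lam = 0.1  # regularization weight (assumed)


def loss_with_retention(texts, labels):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    labels = torch.tensor(labels)

    out = finetuned(**batch, labels=labels, output_hidden_states=True)
    cls_ft = out.hidden_states[-1][:, 0]  # fine-tuned [CLS] vector
    with torch.no_grad():
        cls_pt = pretrained(**batch).last_hidden_state[:, 0]  # frozen [CLS] vector

    # penalize drift away from the pre-trained representation
    retention = 1.0 - F.cosine_similarity(cls_ft, cls_pt, dim=-1).mean()
    return out.loss + lam * retention
```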
arXiv Detail & Related papers (2021-12-22T05:04:41Z)
- Self-Supervised Contrastive Learning with Adversarial Perturbations for Robust Pretrained Language Models [18.726529370845256]
This paper improves the robustness of the pretrained language model BERT against word substitution-based adversarial attacks.
We also create an adversarial attack for word-level adversarial training on BERT.
arXiv Detail & Related papers (2021-07-15T21:03:34Z)
- Self-Guided Contrastive Learning for BERT Sentence Representations [19.205754738851546]
We propose a contrastive learning method that utilizes self-guidance for improving the quality of BERT sentence representations.
Our method fine-tunes BERT in a self-supervised fashion, does not rely on data augmentation, and enables the usual [CLS] token embeddings to function as sentence vectors.
arXiv Detail & Related papers (2021-06-03T05:52:43Z)
- Syntactic Structure Distillation Pretraining For Bidirectional Encoders [49.483357228441434]
We introduce a knowledge distillation strategy for injecting syntactic biases into BERT pretraining.
We distill the approximate marginal distribution over words in context from the syntactic LM.
Our findings demonstrate the benefits of syntactic biases, even in representation learners that exploit large amounts of data.
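The summary does not specify the teacher LM or the exact distillation pipeline; a generic sketch of the distillation term, a KL divergence pushing the student's distributions over masked words toward the teacher's, with an assumed temperature:

```python
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """KL term pushing the student's word distributions toward the teacher's.

    Both tensors have shape (num_masked_positions, vocab_size); the temperature
    and the direction of the KL are assumptions made for this illustration.
    """
    t = temperature
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    student_logp = F.log_softmax(student_logits / t, dim=-1)
    return F.kl_div(student_logp, teacher_probs, reduction="batchmean") * (t * t)


# toy usage with random logits over a 30k-word vocabulary
student = torch.randn(8, 30000)
teacher = torch.randn(8, 30000)
loss = distillation_loss(student, teacher)
```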
arXiv Detail & Related papers (2020-05-27T16:44:01Z)
- BERT-ATTACK: Adversarial Attack Against BERT Using BERT [77.82947768158132]
Adversarial attacks for discrete data (such as text) are more challenging than those for continuous data (such as images).
We propose BERT-Attack, a high-quality and effective method to generate adversarial samples.
Our method outperforms state-of-the-art attack strategies in both success rate and perturb percentage.
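The core ingredient, using BERT's masked LM to propose context-aware substitutes for a chosen word, can be sketched as follows; BERT-Attack's victim-model importance ranking and candidate filtering are omitted, and the model name and top-k value are assumptions.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")


def candidate_substitutes(words, position, top_k=8):
    """Mask one word and let BERT's MLM head propose replacements for it."""
    masked = list(words)
    masked[position] = tokenizer.mask_token
    batch = tokenizer(" ".join(masked), return_tensors="pt")
    with torch.no_grad():
        logits = mlm(**batch).logits
    # position of the [MASK] token in the encoded sequence
    mask_idx = (batch["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0].item()
    top_ids = logits[0, mask_idx].topk(top_k).indices.tolist()
    return tokenizer.convert_ids_to_tokens(top_ids)


# e.g. propose context-aware substitutes for "good"
print(candidate_substitutes("the movie was good".split(), position=3))
```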
arXiv Detail & Related papers (2020-04-21T13:30:02Z)
- Are Natural Language Inference Models IMPPRESsive? Learning IMPlicature and PRESupposition [17.642255516887968]
Natural language inference (NLI) is an increasingly important task for natural language understanding.
The ability of NLI models to make pragmatic inferences remains understudied.
We evaluate whether BERT, InferSent, and BOW NLI models trained on MultiNLI learn to make pragmatic inferences.
arXiv Detail & Related papers (2020-04-07T01:20:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.