Toward Adversarial Training on Contextualized Language Representation
- URL: http://arxiv.org/abs/2305.04557v1
- Date: Mon, 8 May 2023 08:56:51 GMT
- Title: Toward Adversarial Training on Contextualized Language Representation
- Authors: Hongqiu Wu, Yongxiang Liu, Hanwen Shi, Hai Zhao, Min Zhang
- Abstract summary: This paper investigates adversarial training (AT) from the perspective of the contextualized language representation output by PLM encoders.
We propose Contextualized representation-Adversarial Training (CreAT), in which the attack is explicitly optimized to deviate the contextualized representation of the encoder.
CreAT produces consistent performance gains on a wider range of tasks and is proven to be more effective for language pre-training where only the encoder part is kept for downstream tasks.
- Score: 78.39805974043321
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the recent success of adversarial training (AT) in the text
domain on top of pre-trained language models (PLMs), our empirical study shows
that AT yields inconsistent gains on some tasks, e.g. commonsense reasoning
and named entity recognition. This paper investigates AT from the perspective
of the contextualized language representation output by PLM encoders. We find
that current AT attacks tend to generate sub-optimal adversarial examples that
fool the decoder part but have only a minor effect on the encoder; however,
effectively deviating the latter is necessary for AT to yield gains. Based on
this observation, we propose the simple yet effective
\textit{Contextualized representation-Adversarial Training} (CreAT), in which
the attack is explicitly optimized to deviate the contextualized
representation of the encoder. This allows a global optimization of
adversarial examples that can fool the entire model. We also find that CreAT
provides a better direction for optimizing adversarial examples, making them
less sensitive to hyperparameters. Compared to AT, CreAT produces consistent
performance gains on a wider range of tasks and proves more effective for
language pre-training, where only the encoder is kept for downstream tasks. We
achieve new state-of-the-art performance on a series of challenging
benchmarks, e.g. AdvGLUE (59.1 $\rightarrow$ 61.1), HellaSWAG
(93.0 $\rightarrow$ 94.9), ANLI (68.1 $\rightarrow$ 69.3).
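To make the CreAT objective concrete, below is a minimal sketch of one training step, assuming a HuggingFace-style base model (encoder, accepting inputs_embeds and returning last_hidden_state) and a hypothetical classification head; the cosine-distance deviation measure, perturbation radius, step size, and iteration count are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of one CreAT training step (PyTorch / HuggingFace-style).
# Assumptions not taken from the paper's code: `encoder` is a base model
# (e.g. transformers.AutoModel) accepting `inputs_embeds` and returning
# `last_hidden_state`; `head` is a hypothetical classification layer; the
# deviation measure and attack hyperparameters are illustrative choices.
import torch
import torch.nn.functional as F

def creat_loss(encoder, head, input_ids, attention_mask, labels,
               eps=1e-2, alpha=1e-2, attack_iters=2):
    embeds = encoder.get_input_embeddings()(input_ids)
    with torch.no_grad():  # clean contextualized representation (held fixed)
        clean = encoder(inputs_embeds=embeds,
                        attention_mask=attention_mask).last_hidden_state

    # Inner maximization: optimize the perturbation to deviate the
    # encoder's contextualized representation, not the task loss.
    delta = torch.zeros_like(embeds).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(attack_iters):
        adv = encoder(inputs_embeds=embeds + delta,
                      attention_mask=attention_mask).last_hidden_state
        deviation = 1.0 - F.cosine_similarity(adv, clean, dim=-1).mean()
        (grad,) = torch.autograd.grad(deviation, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)

    # Outer minimization: train the full model on the perturbed input.
    adv = encoder(inputs_embeds=embeds + delta.detach(),
                  attention_mask=attention_mask).last_hidden_state
    logits = head(adv[:, 0])  # classify from the first ([CLS]) token
    return F.cross_entropy(logits, labels)
```

A training loop would call loss = creat_loss(...) followed by loss.backward() and an optimizer step. Standard AT differs only in the inner objective: it would maximize the task loss rather than the representation deviation, which is exactly the attack the abstract argues mainly fools the decoder part.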
Related papers
- A Constraint-Enforcing Reward for Adversarial Attacks on Text Classifiers [10.063169009242682]
We train an encoder-decoder paraphrase model to generate adversarial examples.
We adopt a reinforcement learning algorithm and propose a constraint-enforcing reward.
We show how key design choices impact the generated examples and discuss the strengths and weaknesses of the proposed approach.
arXiv Detail & Related papers (2024-05-20T09:33:43Z) - $\textit{LinkPrompt}$: Natural and Universal Adversarial Attacks on Prompt-based Language Models [13.416624729344477]
Prompt-based learning is a new language model training paradigm that adapts Pre-trained Language Models (PLMs) to downstream tasks.
In this work, we develop $\textit{LinkPrompt}$, an adversarial attack algorithm to generate adversarial triggers.
arXiv Detail & Related papers (2024-03-25T05:27:35Z) - TVTSv2: Learning Out-of-the-box Spatiotemporal Visual Representations at Scale [59.01246141215051]
We analyze the factor that leads to degradation from the perspective of language supervision.
We propose a tuning-free pre-training strategy to retain the generalization ability of the text encoder.
We produce a series of models, dubbed TVTSv2, with up to one billion parameters.
arXiv Detail & Related papers (2023-05-23T15:44:56Z) - Alleviating Over-smoothing for Unsupervised Sentence Representation [96.19497378628594]
We present a simple method named Self-Contrastive Learning (SSCL) to alleviate the over-smoothing issue.
Our proposed method is quite simple and can be easily extended to various state-of-the-art models for performance boosting.
arXiv Detail & Related papers (2023-05-09T11:00:02Z) - Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE [93.98660272309974]
This report briefly describes our submission Vega v1 on the General Language Understanding Evaluation leaderboard.
GLUE is a collection of nine natural language understanding tasks, including question answering, linguistic acceptability, sentiment analysis, text similarity, paraphrase detection, and natural language inference.
With our optimized pretraining and fine-tuning strategies, our 1.3-billion-parameter model sets a new state of the art on 4 of the 9 tasks, achieving the best average score of 91.3.
arXiv Detail & Related papers (2023-02-18T09:26:35Z) - Language-Driven Anchors for Zero-Shot Adversarial Robustness [25.160195547250655]
We propose a Language-driven, Anchor-based Adversarial Training (LAAT) strategy.
By leveraging the semantic consistency of the text encoders, LAAT aims to enhance the adversarial robustness of the image model.
We show that LAAT significantly improves zero-shot adversarial robustness over state-of-the-art methods.
arXiv Detail & Related papers (2023-01-30T17:34:43Z) - Bridge the Gap Between CV and NLP! A Gradient-based Textual Adversarial Attack Framework [17.17479625646699]
We propose a unified framework to craft textual adversarial samples.
In this paper, we instantiate our framework with an attack algorithm named Textual Projected Gradient Descent (T-PGD); a hedged sketch of this style of attack is given after this list.
arXiv Detail & Related papers (2021-10-28T17:31:51Z) - Exploring Unsupervised Pretraining Objectives for Machine Translation [99.5441395624651]
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT).
Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder.
We compare masking with alternative objectives that produce inputs resembling real (full) sentences, by reordering and replacing words based on their context.
arXiv Detail & Related papers (2021-06-10T10:18:23Z) - Defending Pre-trained Language Models from Adversarial Word Substitutions Without Performance Sacrifice [42.490810188180546]
Adversarial word substitution is one of the most challenging textual adversarial attack methods.
This paper presents a compact and performance-preserving framework, Anomaly Detection with Frequency-Aware Randomization (ADFAR).
We show that ADFAR significantly outperforms recently proposed defense methods on various tasks, with much higher inference speed.
arXiv Detail & Related papers (2021-05-30T14:24:53Z) - Towards Variable-Length Textual Adversarial Attacks [68.27995111870712]
It is non-trivial to conduct textual adversarial attacks on natural language processing tasks due to the discreteness of data.
In this paper, we propose variable-length textual adversarial attacks (VL-Attack).
Our method achieves a BLEU score of $33.18$ on IWSLT14 German-English translation, an improvement of $1.47$ over the baseline model.
arXiv Detail & Related papers (2021-04-16T14:37:27Z)
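Following up on the T-PGD entry above, here is a hedged sketch of a gradient-based textual attack of that flavor: PGD in the continuous embedding space of a HuggingFace-style classifier (a model accepting inputs_embeds), with a nearest-neighbor projection back onto the embedding table as the decoding step. The projection is a common simplification, not the T-PGD paper's exact decoding procedure (which additionally involves a language model), so treat this as an illustration of the gradient-based idea only.

```python
# Hedged sketch of a gradient-based textual attack in the spirit of T-PGD.
# Assumptions: `model` is a sequence classifier (e.g. transformers.
# AutoModelForSequenceClassification) accepting `inputs_embeds`; the
# nearest-neighbor decoding is a simplification, not the paper's method.
import torch
import torch.nn.functional as F

def embedding_pgd_attack(model, input_ids, labels, eps=0.5, alpha=0.1, steps=10):
    emb_table = model.get_input_embeddings().weight          # [vocab, dim]
    embeds = emb_table[input_ids].detach()
    delta = torch.zeros_like(embeds, requires_grad=True)
    for _ in range(steps):
        logits = model(inputs_embeds=embeds + delta).logits
        loss = F.cross_entropy(logits, labels)               # attack maximizes the task loss
        (grad,) = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    # Project each perturbed embedding back to its nearest vocabulary entry.
    adv = embeds + delta.detach()
    table = emb_table.unsqueeze(0).expand(adv.size(0), -1, -1).contiguous()
    dists = torch.cdist(adv, table)                          # [batch, seq, vocab]
    return dists.argmin(dim=-1)                              # adversarial token ids
```

Decoding the returned token ids with the tokenizer would yield the adversarial text; fluency and semantic-similarity constraints, which the paper's framework addresses, are deliberately omitted here.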