Toward Adversarial Training on Contextualized Language Representation
- URL: http://arxiv.org/abs/2305.04557v1
- Date: Mon, 8 May 2023 08:56:51 GMT
- Title: Toward Adversarial Training on Contextualized Language Representation
- Authors: Hongqiu Wu, Yongxiang Liu, Hanwen Shi, Hai Zhao, Min Zhang
- Abstract summary: This paper investigates adversarial training (AT) from the perspective of the contextualized language representation output by PLM encoders.
We propose \textit{Contextualized representation-Adversarial Training} (CreAT), in which the attack is explicitly optimized to deviate the contextualized representation of the encoder.
CreAT produces consistent performance gains on a wider range of tasks and is proven to be more effective for language pre-training where only the encoder part is kept for downstream tasks.
- Score: 78.39805974043321
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Beyond the recent success story of adversarial training (AT) in the
text domain on top of pre-trained language models (PLMs), our empirical study
shows that AT yields inconsistent gains on some tasks, e.g., commonsense
reasoning and named entity recognition. This paper investigates AT from the
perspective of the contextualized language representation output by PLM
encoders. We find that current AT attacks tend to generate sub-optimal
adversarial examples that can fool the decoder part but have only a minor
effect on the encoder, yet effectively deviating the latter turns out to be
necessary for AT to yield gains. Based on this observation, we propose the
simple yet effective \textit{Contextualized representation-Adversarial
Training} (CreAT), in which the attack is explicitly optimized to deviate the
contextualized representation of the encoder. This allows a global
optimization of adversarial examples that can fool the entire model. We also
find that CreAT provides a better direction for optimizing the adversarial
examples, making them less sensitive to hyperparameters. Compared to AT, CreAT
produces consistent performance gains on a wider range of tasks and proves
more effective for language pre-training, where only the encoder part is kept
for downstream tasks. We achieve new state-of-the-art performance on a series
of challenging benchmarks, e.g., AdvGLUE (59.1 $\rightarrow$ 61.1), HellaSWAG
(93.0 $\rightarrow$ 94.9), and ANLI (68.1 $\rightarrow$ 69.3).
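To make the mechanism concrete, here is a minimal PyTorch sketch of a CreAT-style training step, written against a HuggingFace-style encoder classifier. The function name `creat_step` and all hyperparameter values are illustrative assumptions, not the authors' released implementation: the inner loop optimizes an embedding perturbation to maximize the deviation of the encoder's contextualized representation from its clean value, and the outer step trains on the perturbed input.

```python
import torch
import torch.nn.functional as F

def creat_step(model, input_ids, attention_mask, labels,
               epsilon=1e-3, attack_lr=1e-3, attack_iters=1):
    """One CreAT-style step (sketch): the attack deviates the encoder's
    contextualized representation; the model then trains on the result."""
    embeds = model.get_input_embeddings()(input_ids)

    # Clean contextualized representation, used as the anchor to deviate from.
    with torch.no_grad():
        clean = model(inputs_embeds=embeds, attention_mask=attention_mask,
                      output_hidden_states=True).hidden_states[-1]

    # Inner maximization: push the encoder output away from the clean anchor.
    delta = torch.zeros_like(embeds).uniform_(-epsilon, epsilon)
    delta.requires_grad_(True)
    for _ in range(attack_iters):
        adv = model(inputs_embeds=embeds + delta,
                    attention_mask=attention_mask,
                    output_hidden_states=True).hidden_states[-1]
        deviation = F.mse_loss(adv, clean)  # distance between encoder outputs
        grad, = torch.autograd.grad(deviation, delta)
        delta = (delta + attack_lr * grad.sign()).clamp(-epsilon, epsilon)
        delta = delta.detach().requires_grad_(True)

    # Outer minimization: ordinary task loss on the adversarial embeddings.
    return model(inputs_embeds=embeds + delta.detach(),
                 attention_mask=attention_mask, labels=labels).loss
```

The deviation objective is what distinguishes this from standard AT, whose inner loop would maximize the task loss instead; the abstract's claim is precisely that the latter tends to fool only the decoder while leaving the encoder representation largely intact.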
Related papers
- NeKo: Toward Post Recognition Generative Correction Large Language Models with Task-Oriented Experts [57.53692236201343]
We propose a Multi-Task Correction MoE, where we train the experts to become an "expert" of speech-to-text, language-to-text, and vision-to-text datasets.
NeKo performs competitively on grammar and post-OCR correction as a multi-task model.
arXiv Detail & Related papers (2024-11-08T20:11:24Z)
- A Constraint-Enforcing Reward for Adversarial Attacks on Text Classifiers [10.063169009242682]
We train an encoder-decoder paraphrase model to generate adversarial examples.
We adopt a reinforcement learning algorithm and propose a constraint-enforcing reward.
We show how key design choices impact the generated examples and discuss the strengths and weaknesses of the proposed approach.
arXiv Detail & Related papers (2024-05-20T09:33:43Z)
- $\textit{LinkPrompt}$: Natural and Universal Adversarial Attacks on Prompt-based Language Models [13.416624729344477]
Prompt-based learning is a new language model training paradigm that adapts the Pre-trained Language Models (PLMs) to downstream tasks.
In this work, we develop $\textit{LinkPrompt}$, an adversarial attack algorithm to generate adversarial triggers.
arXiv Detail & Related papers (2024-03-25T05:27:35Z)
- TVTSv2: Learning Out-of-the-box Spatiotemporal Visual Representations at Scale [59.01246141215051]
We analyze the factor that leads to degradation from the perspective of language supervision.
We propose a tuning-free pre-training strategy to retain the generalization ability of the text encoder.
We produce a series of models, dubbed TVTSv2, with up to one billion parameters.
arXiv Detail & Related papers (2023-05-23T15:44:56Z)
- Alleviating Over-smoothing for Unsupervised Sentence Representation [96.19497378628594]
We present a simple method named Self-Contrastive Learning (SSCL) to alleviate this issue.
Our proposed method is quite simple and can be easily extended to various state-of-the-art models for performance boosting.
arXiv Detail & Related papers (2023-05-09T11:00:02Z)
- Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE [93.98660272309974]
This report briefly describes our submission Vega v1 on the General Language Understanding Evaluation leaderboard.
GLUE is a collection of nine natural language understanding tasks, including question answering, linguistic acceptability, sentiment analysis, text similarity, paraphrase detection, and natural language inference.
With our optimized pretraining and fine-tuning strategies, our 1.3-billion-parameter model sets a new state of the art on 4 of the 9 tasks, achieving the best average score of 91.3.
arXiv Detail & Related papers (2023-02-18T09:26:35Z)
- Language-Driven Anchors for Zero-Shot Adversarial Robustness [25.160195547250655]
We propose a Language-driven, Anchor-based Adversarial Training (LAAT) strategy.
By leveraging the semantic consistency of the text encoders, LAAT aims to enhance the adversarial robustness of the image model.
We show that LAAT significantly improves zero-shot adversarial robustness over state-of-the-art methods.
arXiv Detail & Related papers (2023-01-30T17:34:43Z)
- Bridge the Gap Between CV and NLP! A Gradient-based Textual Adversarial Attack Framework [17.17479625646699]
We propose a unified framework to craft textual adversarial samples.
In this paper, we instantiate the framework with an attack algorithm named Textual Projected Gradient Descent (T-PGD); a sketch of the underlying PGD-in-embedding-space primitive is given after this list.
arXiv Detail & Related papers (2021-10-28T17:31:51Z)
- Defending Pre-trained Language Models from Adversarial Word Substitutions Without Performance Sacrifice [42.490810188180546]
Adversarial word substitution is one of the most challenging textual adversarial attack methods.
This paper presents a compact and performance-preserving framework, Anomaly Detection with Frequency-Aware Randomization (ADFAR).
We show that ADFAR significantly outperforms those newly proposed defense methods over various tasks with much higher inference speed.
arXiv Detail & Related papers (2021-05-30T14:24:53Z)
- Towards Variable-Length Textual Adversarial Attacks [68.27995111870712]
It is non-trivial to conduct textual adversarial attacks on natural language processing tasks due to the discreteness of data.
In this paper, we propose variable-length textual adversarial attacks (VL-Attack).
Our method achieves a $33.18$ BLEU score on IWSLT14 German-English translation, an improvement of $1.47$ over the baseline model.
arXiv Detail & Related papers (2021-04-16T14:37:27Z)
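As referenced in the T-PGD entry above, the generic primitive behind gradient-based textual attacks is projected gradient descent in the continuous embedding space. The following is a minimal sketch of that primitive only; it assumes the same HuggingFace-style interface as the earlier snippet and deliberately omits T-PGD's own step of decoding the perturbed embeddings back into fluent text.

```python
import torch

def pgd_embedding_attack(model, input_ids, attention_mask, labels,
                         epsilon=0.01, step_size=0.002, steps=5):
    """Projected gradient descent on input embeddings (sketch): take signed
    gradient steps that increase the task loss, then project the perturbation
    back into an L-infinity ball of radius epsilon."""
    embeds = model.get_input_embeddings()(input_ids).detach()
    delta = torch.zeros_like(embeds, requires_grad=True)
    for _ in range(steps):
        loss = model(inputs_embeds=embeds + delta,
                     attention_mask=attention_mask,
                     labels=labels).loss  # attacker maximizes the task loss
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + step_size * grad.sign()).clamp(-epsilon, epsilon)
        delta = delta.detach().requires_grad_(True)
    return (embeds + delta).detach()
```

A full textual attack would still need to map the perturbed embeddings back to discrete tokens, which is where methods such as T-PGD and VL-Attack differ.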