Virtual Data Augmentation: A Robust and General Framework for
Fine-tuning Pre-trained Models
- URL: http://arxiv.org/abs/2109.05793v1
- Date: Mon, 13 Sep 2021 09:15:28 GMT
- Authors: Kun Zhou, Wayne Xin Zhao, Sirui Wang, Fuzheng Zhang, Wei Wu and
Ji-Rong Wen
- Abstract summary: Powerful pre-trained language models (PLMs) can be fooled by small perturbations or intentional attacks.
We present Virtual Data Augmentation (VDA), a general framework for robustly fine-tuning PLMs.
Our approach is able to improve the robustness of PLMs and alleviate the performance degradation under adversarial attacks.
- Score: 51.46732511844122
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work has shown that powerful pre-trained language models (PLMs) can
be fooled by small perturbations or intentional attacks. To address this issue,
various data augmentation techniques have been proposed to improve the robustness of
PLMs. However, it is still challenging to augment semantically relevant
examples with sufficient diversity. In this work, we present Virtual Data
Augmentation (VDA), a general framework for robustly fine-tuning PLMs. Based on
the original token embeddings, we construct a multinomial mixture for
augmenting virtual data embeddings, where a masked language model guarantees
the semantic relevance and the Gaussian noise provides the augmentation
diversity. Furthermore, a regularized training strategy is proposed to balance
the two aspects. Extensive experiments on six datasets show that our approach
is able to improve the robustness of PLMs and alleviate the performance
degradation under adversarial attacks. Our code and data are publicly
available at https://github.com/RUCAIBox/VDA.
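The mixture described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice to add the Gaussian noise to the masked-language-model logits before the softmax, and the use of a symmetric KL term as the regularizer are all assumptions made for illustration, since the abstract does not give these details. The sketch uses only the Python standard library and plain lists in place of tensors.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def virtual_embedding(mlm_logits, embedding_table, noise_std=1.0, rng=None):
    """Mix vocabulary embeddings into one "virtual" embedding for a token.

    mlm_logits: hypothetical masked-LM scores, one per vocabulary entry.
    embedding_table: one embedding vector (list of floats) per vocabulary entry.
    noise_std: standard deviation of the Gaussian noise (assumed to be added
        to the logits; the abstract does not say where the noise enters).
    """
    rng = rng or random.Random()
    # Gaussian noise supplies diversity across augmented samples.
    noisy = [logit + rng.gauss(0.0, noise_std) for logit in mlm_logits]
    # The softmax yields multinomial mixture weights that stay close to
    # the masked-LM prediction, which keeps the mixture semantically relevant.
    weights = softmax(noisy)
    dim = len(embedding_table[0])
    mixed = [
        sum(w * row[d] for w, row in zip(weights, embedding_table))
        for d in range(dim)
    ]
    return weights, mixed


def symmetric_kl(p, q, eps=1e-12):
    """Symmetric KL divergence between two discrete distributions.

    Used here as an assumed consistency regularizer between the model's
    predictions on the original input and on the augmented input.
    """
    kl_pq = sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
    kl_qp = sum(qi * math.log((qi + eps) / (pi + eps)) for pi, qi in zip(p, q))
    return kl_pq + kl_qp
```

In this sketch, a larger `noise_std` moves the mixture weights further from the masked-language-model prediction (more diversity, less semantic fidelity), and a regularizer such as `symmetric_kl` would penalize predictions on the augmented input that drift from those on the original input.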
Related papers
- Advancing the Robustness of Large Language Models through Self-Denoised Smoothing [50.54276872204319]
Large language models (LLMs) have achieved significant success, but their vulnerability to adversarial perturbations has raised considerable concerns.
We propose to leverage the multitasking nature of LLMs to first denoise the noisy inputs and then to make predictions based on these denoised versions.
Unlike previous denoised smoothing techniques in computer vision, which require training a separate model to enhance the robustness of LLMs, our method offers significantly better efficiency and flexibility.
arXiv Detail & Related papers (2024-04-18T15:47:00Z)
- RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content [62.685566387625975]
Current mitigation strategies, while effective, are not resilient under adversarial attacks.
This paper introduces Resilient Guardrails for Large Language Models (RigorLLM), a novel framework designed to efficiently moderate harmful and unsafe inputs.
arXiv Detail & Related papers (2024-03-19T07:25:02Z)
- DiffClass: Diffusion-Based Class Incremental Learning [30.514281721324853]
Class Incremental Learning (CIL) is challenging due to catastrophic forgetting.
Recent exemplar-free CIL methods attempt to mitigate catastrophic forgetting by synthesizing previous task data.
We propose a novel exemplar-free CIL method to overcome these issues.
arXiv Detail & Related papers (2024-03-08T03:34:18Z)
- Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data.
For the first time, we reveal two major challenges hindering their practical deployment: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC).
arXiv Detail & Related papers (2023-11-23T15:46:54Z)
- Towards General Visual-Linguistic Face Forgery Detection [95.73987327101143]
Deepfakes are realistic face manipulations that can pose serious threats to security, privacy, and trust.
Existing methods mostly treat this task as binary classification, which uses digital labels or mask signals to train the detection model.
We propose a novel paradigm named Visual-Linguistic Face Forgery Detection (VLFFD), which uses fine-grained sentence-level prompts as the annotation.
arXiv Detail & Related papers (2023-07-31T10:22:33Z)
- Fine-Tuning Pre-Trained Language Models Effectively by Optimizing Subnetworks Adaptively [32.001304911395756]
We propose a Dynamic Parameter Selection (DPS) algorithm for large-scale pre-trained models during fine-tuning.
Experiments on the GLUE benchmark show that DPS outperforms previous fine-tuning methods in terms of overall performance and stability.
arXiv Detail & Related papers (2022-11-03T08:32:12Z)
- Discrete Auto-regressive Variational Attention Models for Text Modeling [53.38382932162732]
Variational autoencoders (VAEs) have been widely applied for text modeling.
They are troubled by two challenges: information underrepresentation and posterior collapse.
We propose Discrete Auto-regressive Variational Attention Model (DAVAM) to address the challenges.
arXiv Detail & Related papers (2021-06-16T06:36:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.