RoAST: Robustifying Language Models via Adversarial Perturbation with
Selective Training
- URL: http://arxiv.org/abs/2312.04032v1
- Date: Thu, 7 Dec 2023 04:23:36 GMT
- Title: RoAST: Robustifying Language Models via Adversarial Perturbation with
Selective Training
- Authors: Jaehyung Kim, Yuning Mao, Rui Hou, Hanchao Yu, Davis Liang, Pascale
Fung, Qifan Wang, Fuli Feng, Lifu Huang, Madian Khabsa
- Abstract summary: We propose Robustifying LMs via Adversarial perturbation with Selective Training (RoAST).
RoAST incorporates two important sources of model robustness: robustness to perturbed inputs and generalizable knowledge in pre-trained LMs.
We demonstrate the effectiveness of RoAST compared to state-of-the-art fine-tuning methods on six different types of LMs.
- Score: 105.02614392553198
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Fine-tuning pre-trained language models (LMs) has become the de facto
standard in many NLP tasks. Nevertheless, fine-tuned LMs are still prone to
robustness issues, such as weak adversarial robustness and poor model calibration.
Several perspectives on the robustness of LMs have been studied independently,
but a unified treatment across these perspectives has been lacking. In this paper, we
propose Robustifying LMs via Adversarial perturbation with Selective Training
(RoAST), a simple yet effective fine-tuning technique to enhance the
multi-perspective robustness of LMs in a unified way. RoAST effectively
incorporates two important sources of model robustness: robustness to
perturbed inputs and generalizable knowledge in pre-trained LMs. To be
specific, RoAST introduces adversarial perturbation during fine-tuning while
the model parameters are selectively updated based on their relative importance
to minimize unnecessary deviation. Under a unified evaluation of fine-tuned LMs
that incorporates four representative perspectives of model robustness, we
demonstrate the effectiveness of RoAST compared to state-of-the-art fine-tuning
methods on six different types of LMs, which indicates its usefulness in
practice.
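The abstract points to two concrete mechanisms: adversarial perturbation of the inputs during fine-tuning, and selective parameter updates weighted by relative importance. The following minimal PyTorch-style sketch of a single training step is one way to combine those two ideas; it assumes a Hugging Face-style classification model, uses gradient magnitude as a stand-in for the paper's importance score, and is an illustration rather than the authors' implementation.

import torch
import torch.nn.functional as F

def roast_style_step(model, batch, optimizer, epsilon=1e-2, keep_ratio=0.5):
    # Illustrative sketch of the two ideas named in the abstract
    # (adversarial input perturbation + importance-based selective updates);
    # not the authors' released implementation. `model` is assumed to be a
    # Hugging Face-style sequence classification model.
    embed_layer = model.get_input_embeddings()

    # (i) Compute an adversarial perturbation of the input embeddings.
    with torch.no_grad():
        clean_embeds = embed_layer(batch["input_ids"])
    delta = torch.zeros_like(clean_embeds, requires_grad=True)
    adv_loss = F.cross_entropy(
        model(inputs_embeds=clean_embeds + delta,
              attention_mask=batch["attention_mask"]).logits,
        batch["labels"])
    (grad,) = torch.autograd.grad(adv_loss, delta)
    delta = epsilon * grad / (grad.norm(dim=-1, keepdim=True) + 1e-12)

    # (ii) Fine-tune on the perturbed embeddings.
    embeds = embed_layer(batch["input_ids"]) + delta.detach()
    loss = F.cross_entropy(
        model(inputs_embeds=embeds,
              attention_mask=batch["attention_mask"]).logits,
        batch["labels"])
    optimizer.zero_grad()
    loss.backward()

    # (iii) Selective update: zero out gradients of the less important
    # parameters. Gradient magnitude serves here as a simple stand-in for
    # the paper's relative-importance score.
    for p in model.parameters():
        if p.grad is None:
            continue
        flat = p.grad.abs().flatten()
        k = max(1, int(keep_ratio * flat.numel()))
        cutoff = torch.topk(flat, k).values.min()
        p.grad.mul_((p.grad.abs() >= cutoff).to(p.grad.dtype))

    optimizer.step()
    return loss.item()

In practice the perturbation is often computed over several ascent steps (as in PGD-style adversarial training) and the importance score can be accumulated across steps; the single-step version above only illustrates the overall structure.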
Related papers
- Enhancing Healthcare LLM Trust with Atypical Presentations Recalibration [20.049443396032423]
Black-box large language models (LLMs) are increasingly deployed in various environments.
LLMs often exhibit overconfidence, leading to potential risks and misjudgments.
We propose a novel method, Atypical Presentations Recalibration, which leverages atypical presentations to adjust the model's confidence estimates.
arXiv Detail & Related papers (2024-09-05T03:45:35Z)
- Advancing the Robustness of Large Language Models through Self-Denoised Smoothing [50.54276872204319]
Large language models (LLMs) have achieved significant success, but their vulnerability to adversarial perturbations has raised considerable concerns.
We propose to leverage the multitasking nature of LLMs to first denoise the noisy inputs and then to make predictions based on these denoised versions.
Unlike previous denoised smoothing techniques in computer vision, which require training a separate denoising model, our method offers significantly better efficiency and flexibility.
arXiv Detail & Related papers (2024-04-18T15:47:00Z)
- STAR: Constraint LoRA with Dynamic Active Learning for Data-Efficient Fine-Tuning of Large Language Models [21.929902181609936]
We propose a novel approach to integrate uncertainty-based active learning and LoRA.
For the uncertainty gap, we introduce a dynamic uncertainty measurement that combines the uncertainty of the base model and the uncertainty of the full model.
To address poor model calibration, we incorporate a regularization method during LoRA training to keep the model from becoming over-confident.
arXiv Detail & Related papers (2024-03-02T10:38:10Z)
- When Parameter-efficient Tuning Meets General-purpose Vision-language Models [65.19127815275307]
PETAL streamlines the training process by updating only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z)
- Evaluating Concurrent Robustness of Language Models Across Diverse Challenge Sets [46.19529338280716]
Language models, characterized by their black-box nature, often hallucinate and display sensitivity to input perturbations.
We introduce a methodology designed to examine how input perturbations affect language models across various scales.
We present three distinct fine-tuning strategies to address robustness against multiple perturbations.
arXiv Detail & Related papers (2023-11-15T02:59:10Z)
- Analyzing Modality Robustness in Multimodal Sentiment Analysis [48.52878002917685]
Building robust multimodal models is crucial for achieving reliable deployment in the wild.
We propose simple diagnostic checks for modality robustness in a trained multimodal model.
We analyze well-known robust training strategies to alleviate the issues.
arXiv Detail & Related papers (2022-05-30T23:30:16Z)
- NoiER: An Approach for Training more Reliable Fine-Tuned Downstream Task Models [54.184609286094044]
We propose noise entropy regularisation (NoiER) as an efficient learning paradigm that solves the problem without auxiliary models or additional data.
The proposed approach improved traditional OOD detection evaluation metrics by 55% on average compared to the original fine-tuned models.
arXiv Detail & Related papers (2021-08-29T06:58:28Z)
- On Fast Adversarial Robustness Adaptation in Model-Agnostic Meta-Learning [100.14809391594109]
Model-agnostic meta-learning (MAML) has emerged as one of the most successful meta-learning techniques in few-shot learning.
Despite the generalization power of the meta-model, it remains unclear how adversarial robustness can be maintained by MAML in few-shot learning.
We propose a general but easily-optimized robustness-regularized meta-learning framework, which allows the use of unlabeled data augmentation, fast adversarial attack generation, and computationally-light fine-tuning.
arXiv Detail & Related papers (2021-02-20T22:03:04Z)
- A Closer Look at the Robustness of Vision-and-Language Pre-trained Models [42.13369297087191]
Large-scale pre-trained multimodal transformers, such as ViLBERT and UNITER, have propelled the state of the art in vision-and-language (V+L) research to a new level.
Although they achieve impressive performance on standard tasks, it remains unclear how robust these pre-trained models are.
We propose Mango, a generic and efficient approach that learns a Multimodal Adversarial Noise GeneratOr in the embedding space to fool pre-trained V+L models.
arXiv Detail & Related papers (2020-12-15T23:41:42Z)
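The last entry above (Mango) describes learning an adversarial noise generator in the embedding space of a frozen pre-trained model. A generic sketch of that idea follows; for simplicity it assumes a Hugging Face-style text model that exposes get_input_embeddings and accepts inputs_embeds and labels, rather than the vision-and-language setup of the paper, and it is not the MANGO implementation.

import torch
import torch.nn as nn

class NoiseGenerator(nn.Module):
    # Small MLP mapping each embedding vector to a bounded perturbation.
    def __init__(self, dim, epsilon=0.05):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
        self.epsilon = epsilon

    def forward(self, embeds):
        return self.epsilon * torch.tanh(self.net(embeds))

def train_noise_generator(victim, generator, loader, steps=500, lr=1e-4):
    # Train the generator to *maximize* the frozen victim model's task loss
    # on perturbed embeddings (gradient ascent on the victim's loss).
    victim.eval()
    for p in victim.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(generator.parameters(), lr=lr)
    embed = victim.get_input_embeddings()
    for step, batch in enumerate(loader):
        if step >= steps:
            break
        embeds = embed(batch["input_ids"])
        noisy = embeds + generator(embeds)
        out = victim(inputs_embeds=noisy,
                     attention_mask=batch["attention_mask"],
                     labels=batch["labels"])
        opt.zero_grad()
        (-out.loss).backward()  # maximize the victim's loss
        opt.step()
    return generator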