Robust Lottery Tickets for Pre-trained Language Models
- URL: http://arxiv.org/abs/2211.03013v1
- Date: Sun, 6 Nov 2022 02:59:27 GMT
- Title: Robust Lottery Tickets for Pre-trained Language Models
- Authors: Rui Zheng, Rong Bao, Yuhao Zhou, Di Liang, Sirui Wang, Wei Wu, Tao
Gui, Qi Zhang, Xuanjing Huang
- Abstract summary: We propose a novel method based on learning binary weight masks to identify robust tickets hidden in the original language models.
Experimental results show that the proposed method significantly improves over previous work on adversarial robustness evaluation.
- Score: 57.14316619360376
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent works on the Lottery Ticket Hypothesis have shown that pre-trained
language models (PLMs) contain smaller matching subnetworks (winning tickets)
which are capable of reaching accuracy comparable to the original models.
However, these tickets have proved to be not robust to adversarial examples, and
even worse than their PLM counterparts. To address this problem, we propose a
novel method based on learning binary weight masks to identify robust tickets
hidden in the original PLMs. Since the loss is not differentiable for the
binary mask, we assign the hard concrete distribution to the masks and
encourage their sparsity using a smoothing approximation of L0
regularization. Furthermore, we design an adversarial loss objective to guide
the search for robust tickets and ensure that the tickets perform well both in
accuracy and robustness. Experimental results show that the proposed method
significantly improves over previous work on adversarial robustness evaluation.
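The abstract names three ingredients: a hard concrete relaxation of the binary weight mask, a smoothed L0 penalty to encourage sparsity, and an adversarial term in the search objective. The PyTorch sketch below illustrates one way these pieces typically fit together; the names (HardConcreteMask, MaskedLinear, robust_ticket_loss), the single-step embedding perturbation, the assumption that the model maps embeddings directly to logits, and all hyperparameter values are illustrative assumptions rather than the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hard concrete distribution constants (following Louizos et al., 2018).
BETA, GAMMA, ZETA = 2.0 / 3.0, -0.1, 1.1
EPS = 1e-6


class HardConcreteMask(nn.Module):
    """Learnable, approximately binary mask over a weight tensor."""

    def __init__(self, shape, init_log_alpha=2.0):
        super().__init__()
        # log_alpha controls how likely each mask entry is to be "on".
        self.log_alpha = nn.Parameter(torch.full(shape, init_log_alpha))

    def forward(self):
        if self.training:
            # Reparameterized sample so gradients flow through the mask.
            u = torch.rand_like(self.log_alpha).clamp(EPS, 1 - EPS)
            s = torch.sigmoid((u.log() - (1 - u).log() + self.log_alpha) / BETA)
        else:
            s = torch.sigmoid(self.log_alpha)  # deterministic gate at eval time
        return (s * (ZETA - GAMMA) + GAMMA).clamp(0.0, 1.0)

    def l0_penalty(self):
        # Smooth approximation of the expected number of non-zero gates.
        shift = BETA * torch.log(torch.tensor(-GAMMA / ZETA))
        return torch.sigmoid(self.log_alpha - shift).sum()


class MaskedLinear(nn.Module):
    """Frozen pre-trained linear layer whose weights are gated by a mask."""

    def __init__(self, linear: nn.Linear):
        super().__init__()
        self.weight = nn.Parameter(linear.weight.detach(), requires_grad=False)
        self.bias = nn.Parameter(linear.bias.detach(), requires_grad=False)
        self.mask = HardConcreteMask(self.weight.shape)

    def forward(self, x):
        return F.linear(x, self.weight * self.mask(), self.bias)


def robust_ticket_loss(model, masks, inputs_embeds, labels,
                       lam=1e-4, adv_eps=1e-2):
    """Clean loss + adversarial loss + sparsity penalty (illustrative only)."""
    inputs_embeds = inputs_embeds.detach().requires_grad_(True)
    clean_loss = F.cross_entropy(model(inputs_embeds), labels)

    # One-step perturbation of the input embeddings as a simple stand-in
    # for the paper's adversarial objective.
    grad, = torch.autograd.grad(clean_loss, inputs_embeds, retain_graph=True)
    adv_embeds = (inputs_embeds + adv_eps * grad.sign()).detach()
    adv_loss = F.cross_entropy(model(adv_embeds), labels)

    sparsity = sum(m.l0_penalty() for m in masks)
    return clean_loss + adv_loss + lam * sparsity

In a setup like this, a mask would be attached to each weight matrix under consideration, the pre-trained weights stay frozen, and only the mask parameters are updated while searching for the robust ticket; after training, gates whose expected value falls below a threshold are pruned to obtain the sparse subnetwork.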
Related papers
- Adaptive Pre-training Data Detection for Large Language Models via Surprising Tokens [1.2549198550400134]
Large language models (LLMs) are extensively used, but there are concerns regarding privacy, security, and copyright due to their opaque training data.
Current solutions to this problem leverage techniques explored in machine learning privacy such as Membership Inference Attacks (MIAs).
We propose an adaptive pre-training data detection method which alleviates this reliance and effectively amplifies identification.
arXiv Detail & Related papers (2024-07-30T23:43:59Z)
- CR-UTP: Certified Robustness against Universal Text Perturbations on Large Language Models [12.386141652094999]
Existing certified robustness based on random smoothing has shown considerable promise in certifying input-specific text perturbations.
A naive method is to simply increase the masking ratio and the likelihood of masking attack tokens, but it leads to a significant reduction in both certified accuracy and the certified radius.
We introduce a novel approach, designed to identify a superior prompt that maintains higher certified accuracy under extensive masking.
arXiv Detail & Related papers (2024-06-04T01:02:22Z)
- Token-Level Adversarial Prompt Detection Based on Perplexity Measures and Contextual Information [67.78183175605761]
Large Language Models are susceptible to adversarial prompt attacks.
This vulnerability underscores a significant concern regarding the robustness and reliability of LLMs.
We introduce a novel approach to detecting adversarial prompts at a token level.
arXiv Detail & Related papers (2023-11-20T03:17:21Z)
- Masked Images Are Counterfactual Samples for Robust Fine-tuning [77.82348472169335]
Fine-tuning deep learning models can lead to a trade-off between in-distribution (ID) performance and out-of-distribution (OOD) robustness.
We propose a novel fine-tuning method, which uses masked images as counterfactual samples that help improve the robustness of the fine-tuned model.
arXiv Detail & Related papers (2023-03-06T11:51:28Z)
- Dual Lottery Ticket Hypothesis [71.95937879869334]
The Lottery Ticket Hypothesis (LTH) provides a novel view for investigating sparse network training while maintaining model capacity.
In this work, we regard the winning ticket from LTH as the subnetwork that is in a trainable condition, and take its performance as our benchmark.
We propose a simple sparse network training strategy, Random Sparse Network Transformation (RST), to substantiate our DLTH.
arXiv Detail & Related papers (2022-03-08T18:06:26Z)
- On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning [0.0]
We show that feed-forward networks trained via supervised policy distillation and reinforcement learning can be pruned to the same level of sparsity.
Using a set of carefully designed baseline conditions, we find that the majority of the lottery ticket effect in reinforcement learning can be attributed to the identified mask.
arXiv Detail & Related papers (2021-05-04T17:47:39Z)
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators [108.3381301768299]
Masked language modeling (MLM) pre-training methods such as BERT corrupt the input by replacing some tokens with [MASK] and then train a model to reconstruct the original tokens.
We propose a more sample-efficient pre-training task called replaced token detection.
arXiv Detail & Related papers (2020-03-23T21:17:42Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)