P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally
Across Scales and Tasks
- URL: http://arxiv.org/abs/2110.07602v2
- Date: Mon, 18 Oct 2021 17:57:15 GMT
- Title: P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally
Across Scales and Tasks
- Authors: Xiao Liu, Kaixuan Ji, Yicheng Fu, Zhengxiao Du, Zhilin Yang, Jie Tang
- Abstract summary: We present a novel empirical finding that properly optimized prompt tuning can be universally effective across a wide range of model scales and NLU tasks.
We believe P-Tuning v2 can serve as an alternative to fine-tuning and a strong baseline for future research.
- Score: 17.93703302601565
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prompt tuning, which only tunes continuous prompts with a frozen language
model, substantially reduces per-task storage and memory usage during training.
However, in the context of NLU, prior work reveals that prompt tuning does not
perform well for normal-sized pre-trained models. We also find that existing
methods of prompt tuning cannot handle hard sequence tagging tasks, indicating
a lack of universality. We present a novel empirical finding that properly
optimized prompt tuning can be universally effective across a wide range of
model scales and NLU tasks. It matches the performance of fine-tuning while
tuning only 0.1%-3% of the parameters. Our method, P-Tuning v2, is not a new
method but a version of prefix-tuning (Li & Liang, 2021) optimized and
adapted for NLU. Given the universality and simplicity of P-Tuning v2, we
believe it can serve as an alternative to fine-tuning and a strong baseline for
future research.
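To make the mechanism concrete, here is a minimal, self-contained sketch of the core idea behind P-Tuning v2 / prefix-tuning: trainable prefix key/value vectors are prepended to the attention keys and values of every transformer layer while the backbone stays frozen. The tiny single-head transformer below is an illustrative assumption of this note, not the authors' implementation.

```python
import torch
import torch.nn as nn

class PrefixAttentionLayer(nn.Module):
    """Single-head self-attention with trainable per-layer prefix K/V."""
    def __init__(self, d_model: int, prefix_len: int):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.out = nn.Linear(d_model, d_model)
        # The only new parameters: prefix keys/values for this layer.
        self.prefix_k = nn.Parameter(torch.randn(prefix_len, d_model) * 0.02)
        self.prefix_v = nn.Parameter(torch.randn(prefix_len, d_model) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, d)
        b = x.size(0)
        q, k, v = self.q(x), self.k(x), self.v(x)
        # Prepend the trainable prefix to the keys and values.
        k = torch.cat([self.prefix_k.unsqueeze(0).expand(b, -1, -1), k], dim=1)
        v = torch.cat([self.prefix_v.unsqueeze(0).expand(b, -1, -1), v], dim=1)
        attn = torch.softmax(q @ k.transpose(-2, -1) / x.size(-1) ** 0.5, dim=-1)
        return x + self.out(attn @ v)

layers = nn.ModuleList(PrefixAttentionLayer(d_model=64, prefix_len=8)
                       for _ in range(4))

# Freeze the backbone; train only the per-layer prefixes.
for layer in layers:
    for name, p in layer.named_parameters():
        p.requires_grad = name.startswith("prefix_")

trainable = [p for l in layers for p in l.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-3)
```

In a full-sized model these per-layer prefixes account for roughly the 0.1%-3% of parameters quoted in the abstract.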
Related papers
- PTP: Boosting Stability and Performance of Prompt Tuning with
Perturbation-Based Regularizer [94.23904400441957]
We introduce perturbation-based regularizers, which can smooth the loss landscape, into prompt tuning.
We design two kinds of perturbation-based regularizers: random-noise-based and adversarial-based.
Our new algorithms improve state-of-the-art prompt tuning methods by 1.94% and 2.34% on the SuperGLUE and FewGLUE benchmarks, respectively.
arXiv Detail & Related papers (2023-05-03T20:30:51Z)
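As a rough illustration of the random-noise-based variant described above, here is a hedged sketch of a perturbation regularizer for a soft prompt; `model_forward`, the noise scale, and the consistency weighting are assumptions of this note, not details from the paper.

```python
import torch
import torch.nn.functional as F

def ptp_style_loss(model_forward, prompt, batch, labels, sigma=0.01, alpha=1.0):
    """model_forward(prompt, batch) -> logits; `prompt` is the soft prompt."""
    logits = model_forward(prompt, batch)
    task_loss = F.cross_entropy(logits, labels)
    # Random-noise-based perturbation of the prompt embeddings.
    noisy_prompt = prompt + sigma * torch.randn_like(prompt)
    noisy_logits = model_forward(noisy_prompt, batch)
    # Consistency term that smooths the loss landscape around the prompt.
    consistency = F.kl_div(
        F.log_softmax(noisy_logits, dim=-1),
        F.softmax(logits.detach(), dim=-1),
        reduction="batchmean",
    )
    return task_loss + alpha * consistency
```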
- SPT: Semi-Parametric Prompt Tuning for Multitask Prompted Learning [28.29889045842277]
Multitask prompted learning can help a model generalize by training on a diverse set of tasks at once.
We propose SPT, a semi-parametric prompt tuning method for multitask prompted learning.
arXiv Detail & Related papers (2022-12-21T11:18:09Z)
- Two-stage LLM Fine-tuning with Less Specialization and More Generalization [93.12197594813378]
We propose Prompt Tuning with MOdel Tuning (ProMoT) to reduce format specialization and improve generalization.
ProMoT offloads task-specific format learning into additional and removable parameters by first performing prompt tuning and then fine-tuning the model itself with the learned soft prompt attached.
ProMoT can even enhance generalization on in-context learning tasks that are semantically related to the fine-tuned task.
arXiv Detail & Related papers (2022-11-01T17:56:57Z)
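A hedged sketch of the two-stage recipe described above: stage 1 trains only the soft prompt with the model frozen, and stage 2 freezes the learned prompt and fine-tunes the model weights with the prompt attached. `run_epoch` and the hyperparameters are illustrative assumptions.

```python
import torch

def train_promot_style(model, soft_prompt, data, run_epoch, epochs=3):
    """`run_epoch(model, soft_prompt, data, opt)` is a hypothetical helper."""
    # Stage 1: prompt tuning -- the task-specific format is learned by the
    # removable soft prompt while the model stays frozen.
    for p in model.parameters():
        p.requires_grad = False
    soft_prompt.requires_grad = True
    opt = torch.optim.AdamW([soft_prompt], lr=1e-3)
    for _ in range(epochs):
        run_epoch(model, soft_prompt, data, opt)

    # Stage 2: fine-tune the model itself with the (now frozen) soft prompt
    # attached, so the weights need not absorb the output format.
    for p in model.parameters():
        p.requires_grad = True
    soft_prompt.requires_grad = False
    opt = torch.optim.AdamW(model.parameters(), lr=1e-5)
    for _ in range(epochs):
        run_epoch(model, soft_prompt, data, opt)
```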
- Prompt Tuning for Generative Multimodal Pretrained Models [75.44457974275154]
We implement prompt tuning on a unified sequence-to-sequence pretrained model adapted to both understanding and generation tasks.
Experimental results demonstrate that lightweight prompt tuning can achieve performance comparable to fine-tuning.
Compared with fine-tuned models, the prompt-tuned models demonstrate improved robustness against adversarial attacks.
arXiv Detail & Related papers (2022-08-04T08:56:38Z)
- STT: Soft Template Tuning for Few-Shot Adaptation [72.46535261444151]
We propose a new prompt-tuning framework called Soft Template Tuning (STT).
STT combines manual and auto prompts, and treats downstream classification as a masked language modeling task.
It can even outperform the time- and resource-consuming fine-tuning method on sentiment classification tasks.
arXiv Detail & Related papers (2022-07-18T07:07:22Z)
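The following hedged sketch illustrates the STT-style formulation: a manual template with a [MASK] slot is combined with trainable soft-prompt embeddings, and classification reduces to comparing verbalizer tokens at the mask position. The model name, template, and verbalizer are assumptions for illustration.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
soft_prompt = torch.nn.Parameter(torch.randn(4, mlm.config.hidden_size) * 0.02)

def class_logits(text: str) -> torch.Tensor:
    # Manual template: "<text> It was [MASK]."
    enc = tok(f"{text} It was {tok.mask_token}.", return_tensors="pt")
    embeds = mlm.get_input_embeddings()(enc.input_ids)
    # Prepend the trainable soft prompt (the "auto" part of the template).
    embeds = torch.cat([soft_prompt.unsqueeze(0), embeds], dim=1)
    mask = torch.ones(embeds.shape[:2], dtype=torch.long)
    out = mlm(inputs_embeds=embeds, attention_mask=mask)
    # Locate [MASK] (shifted by the prompt length) and read the verbalizer
    # tokens: "great" -> positive, "terrible" -> negative.
    pos = soft_prompt.size(0) + (enc.input_ids[0] == tok.mask_token_id).nonzero()[0, 0]
    verbalizer = tok.convert_tokens_to_ids(["great", "terrible"])
    return out.logits[0, pos, verbalizer]
```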
- BBTv2: Pure Black-Box Optimization Can Be Comparable to Gradient Descent for Few-Shot Learning [83.26610968655815]
Black-Box Tuning is a derivative-free approach to optimizing continuous prompt tokens prepended to the input of language models.
We present BBTv2, a pure black-box optimization approach that can drive language models to achieve results comparable to gradient-based optimization.
arXiv Detail & Related papers (2022-05-23T11:10:19Z)
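As a rough illustration of derivative-free prompt optimization, the sketch below optimizes a low-dimensional vector, projected into the prompt-embedding space by a fixed random matrix, with a simple evolution strategy that uses only loss values. BBTv2 itself uses CMA-ES and per-layer prompts; `eval_loss` and all constants here are assumptions of this note.

```python
import torch

d_low, prompt_len, d_model = 32, 8, 768
A = torch.randn(d_low, prompt_len * d_model) / d_low ** 0.5  # fixed projection
z = torch.zeros(d_low)  # the low-dimensional variable actually optimized

def prompt_from(z: torch.Tensor) -> torch.Tensor:
    return (z @ A).view(prompt_len, d_model)

def es_step(z, eval_loss, pop=16, sigma=0.1, lr=0.5):
    """One evolution-strategy step; eval_loss(prompt) -> float (black box)."""
    eps = torch.randn(pop, z.size(0))
    losses = torch.tensor([eval_loss(prompt_from(z + sigma * e)) for e in eps])
    # Weight perturbations by how much they beat the average loss.
    weights = (losses.mean() - losses) / (losses.std() + 1e-8)
    return z + lr * sigma * (weights @ eps) / pop
```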
- GPT Understands, Too [42.701765107498346]
We propose a novel method, P-Tuning, that employs trainable continuous prompt embeddings concatenated with discrete prompts.
P-Tuning is generally effective for both frozen and tuned language models, under both the fully-supervised and few-shot settings.
arXiv Detail & Related papers (2021-03-18T17:13:50Z)
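A minimal sketch of the concatenation idea, assuming a HuggingFace GPT-2 backbone: trainable continuous embeddings are prepended to the embeddings of a discrete prompt before the forward pass. The original method additionally reparameterizes the continuous prompt with an LSTM/MLP encoder, which is omitted here.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
virtual = torch.nn.Parameter(torch.randn(6, lm.config.n_embd) * 0.02)

def forward_with_mixed_prompt(discrete_prompt: str) -> torch.Tensor:
    ids = tok(discrete_prompt, return_tensors="pt").input_ids
    word_embeds = lm.get_input_embeddings()(ids)
    # [continuous prompt ; discrete prompt] fed to the language model.
    inputs = torch.cat([virtual.unsqueeze(0), word_embeds], dim=1)
    return lm(inputs_embeds=inputs).logits
```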
- Prefix-Tuning: Optimizing Continuous Prompts for Generation [85.6357778621526]
Fine-tuning is the de facto way to leverage large pretrained language models to perform downstream tasks.
We propose prefix-tuning, a lightweight alternative to fine-tuning for natural language generation tasks.
We find that by learning only 0.1% of the parameters, prefix-tuning obtains comparable performance in the full data setting.
arXiv Detail & Related papers (2021-01-01T08:00:36Z)
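For experimentation, prefix-tuning along these lines is available in the HuggingFace PEFT library; the snippet below is a usage sketch under that assumption (PEFT postdates the paper and is not the authors' code).

```python
from peft import PrefixTuningConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
config = PrefixTuningConfig(task_type=TaskType.CAUSAL_LM, num_virtual_tokens=20)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only a small fraction is trainable
```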
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.