Late Prompt Tuning: A Late Prompt Could Be Better Than Many Prompts
- URL: http://arxiv.org/abs/2210.11292v2
- Date: Fri, 21 Oct 2022 07:46:31 GMT
- Title: Late Prompt Tuning: A Late Prompt Could Be Better Than Many Prompts
- Authors: Xiangyang Liu, Tianxiang Sun, Xuanjing Huang, Xipeng Qiu
- Abstract summary: Prompt tuning is a parameter-efficient tuning (PETuning) method for utilizing pre-trained models (PTMs)
We present Late Prompt Tuning (LPT), which inserts a late prompt into an intermediate layer of the PTM instead of the input layer or all layers.
We show that LPT can achieve performance competitive with full model tuning and other PETuning methods under both full-data and few-shot scenarios.
- Score: 97.20933523766182
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prompt tuning is a parameter-efficient tuning (PETuning) method for utilizing
pre-trained models (PTMs) that simply prepends a soft prompt to the input and
only optimizes the prompt to adapt PTMs to downstream tasks. Although it is
parameter- and deployment-efficient, its performance still lags behind other
state-of-the-art PETuning methods. Besides, the training cost of prompt tuning
is not significantly reduced due to the back-propagation through the entire
model. Through empirical analyses, we shed some light on the lagging
performance of prompt tuning and recognize a trade-off between the propagation
distance from label signals to the inserted prompt and the influence of the
prompt on model outputs. Further, we present Late Prompt Tuning (LPT) that
inserts a late prompt into an intermediate layer of the PTM instead of the
input layer or all layers. The late prompt is obtained by a neural prompt
generator conditioned on the hidden states before the prompt insertion layer
and therefore is instance-dependent. Through extensive experimental results
across various tasks and PTMs, we show that LPT can achieve competitive
performance to full model tuning and other PETuning methods under both
full-data and few-shot scenarios while possessing faster training speed and
lower memory cost.
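To make the mechanism concrete, below is a minimal PyTorch-style sketch of the idea: a frozen Transformer backbone, a small trainable prompt generator that reads the hidden states just before an intermediate layer and emits an instance-dependent prompt, and that prompt prepended to the sequence at the insertion layer. The layer index, prompt length, mean pooling, bottleneck size, and classification readout are illustrative assumptions, not the paper's exact configuration.

import torch
import torch.nn as nn

class LatePromptModel(nn.Module):
    def __init__(self, d_model=768, n_layers=12, insert_layer=6,
                 prompt_len=10, bottleneck=96, n_classes=2):
        super().__init__()
        # Frozen backbone: a stack of Transformer encoder layers (stand-in for a PTM).
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead=12, batch_first=True)
            for _ in range(n_layers)
        )
        for p in self.layers.parameters():
            p.requires_grad = False
        self.insert_layer = insert_layer
        self.prompt_len = prompt_len
        # Trainable prompt generator: pooled hidden states -> instance-dependent prompt
        # (a small bottleneck MLP here, as an assumption).
        self.generator = nn.Sequential(
            nn.Linear(d_model, bottleneck),
            nn.Tanh(),
            nn.Linear(bottleneck, prompt_len * d_model),
        )
        self.classifier = nn.Linear(d_model, n_classes)  # trainable task head

    def forward(self, embeddings):  # embeddings: (batch, seq_len, d_model)
        h = embeddings
        for i, layer in enumerate(self.layers):
            if i == self.insert_layer:
                # Condition the prompt on hidden states *before* the insertion layer.
                pooled = h.mean(dim=1)                          # (batch, d_model)
                prompt = self.generator(pooled).view(
                    h.size(0), self.prompt_len, h.size(-1))     # (batch, prompt_len, d_model)
                h = torch.cat([prompt, h], dim=1)               # prepend the late prompt
            h = layer(h)
        # Simplified readout from the first prompt position (the paper uses a
        # verbalizer-style head; this is just for illustration).
        return self.classifier(h[:, 0])

model = LatePromptModel()
logits = model(torch.randn(4, 32, 768))  # toy stand-in for the PTM's input embeddings
print(logits.shape)                      # torch.Size([4, 2])

In this sketch only the generator and the task head receive gradients, so back-propagation only has to run through the layers above the insertion point rather than the entire model, which mirrors the training-speed and memory savings the abstract mentions.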
Related papers
- Attention Prompt Tuning: Parameter-efficient Adaptation of Pre-trained Models for Spatiotemporal Modeling [32.603558214472265]
We introduce Attention Prompt Tuning (APT) for video-based applications such as action recognition.
APT involves injecting a set of learnable prompts along with data tokens during fine-tuning while keeping the backbone frozen.
The proposed approach greatly reduces the number of FLOPs and latency while achieving a significant performance boost.
arXiv Detail & Related papers (2024-03-11T17:59:41Z)
- Improving Prompt Tuning with Learned Prompting Layers [12.46460062708119]
We propose a novel framework, Selective Prompt Tuning (SPT).
It learns to select the proper prompt layers by inserting a prompt controlled by a learnable probabilistic gate at each intermediate layer.
We conduct extensive experiments with ten benchmark datasets under the full-data and few-shot scenarios.
arXiv Detail & Related papers (2023-10-31T02:07:51Z)
- Approximated Prompt Tuning for Vision-Language Pre-trained Models [54.326232586461614]
In vision-language pre-trained models, prompt tuning often requires a large number of learnable tokens to bridge the gap between the pre-training and downstream tasks.
We propose a novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer learning.
arXiv Detail & Related papers (2023-06-27T05:43:47Z)
- FPT: Improving Prompt Tuning Efficiency via Progressive Training [84.25195519945215]
We propose Fast Prompt Tuning (FPT) to improve prompt tuning's training efficiency.
We show that FPT could save over 30% training computations while achieving comparable performance.
arXiv Detail & Related papers (2022-11-13T08:00:29Z)
- XPrompt: Exploring the Extreme of Prompt Tuning [31.242680485717447]
We propose a novel Prompt tuning model with an eXtremely small scale (XPrompt) under the regime of the lottery ticket hypothesis.
XPrompt eliminates the negative prompt tokens at different levels through a hierarchical structured pruning, yielding a more parameter-efficient prompt yet with a competitive performance.
arXiv Detail & Related papers (2022-10-10T06:57:19Z)
- STT: Soft Template Tuning for Few-Shot Adaptation [72.46535261444151]
We propose a new prompt-tuning framework called Soft Template Tuning (STT).
STT combines manual and auto prompts, and treats downstream classification tasks as a masked language modeling task.
It can even outperform the time- and resource-consuming fine-tuning method on sentiment classification tasks.
arXiv Detail & Related papers (2022-07-18T07:07:22Z)
- Instance-wise Prompt Tuning for Pretrained Language Models [72.74916121511662]
Instance-wise Prompt Tuning (IPT) is the first prompt learning paradigm that injects knowledge from the input data instances to the prompts.
IPT significantly outperforms task-based prompt learning methods, and achieves comparable performance to conventional finetuning with only 0.5% - 1.5% of tuned parameters.
arXiv Detail & Related papers (2022-06-04T10:08:50Z)
- PPT: Pre-trained Prompt Tuning for Few-shot Learning [47.05554619258627]
Prompts for pre-trained language models (PLMs) have shown remarkable performance by bridging the gap between pre-training tasks and various downstream tasks.
Among these methods, prompt tuning, which freezes PLMs and only tunes soft prompts, provides an efficient and effective solution for adapting large-scale PLMs to downstream tasks.
In our work, we find that prompt tuning performs comparably with conventional full-model fine-tuning when downstream data are sufficient, whereas it performs much worse under few-shot learning settings.
arXiv Detail & Related papers (2021-09-09T15:11:04Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.