Improving Prompt Tuning with Learned Prompting Layers
- URL: http://arxiv.org/abs/2310.20127v1
- Date: Tue, 31 Oct 2023 02:07:51 GMT
- Title: Improving Prompt Tuning with Learned Prompting Layers
- Authors: Wei Zhu and Ming Tan
- Abstract summary: We propose a novel framework, Selective Prompt Tuning (SPT).
It learns to select the proper prompt layers by inserting a prompt controlled by a learnable probabilistic gate at each intermediate layer.
We conduct extensive experiments with ten benchmark datasets under the full-data and few-shot scenarios.
- Score: 12.46460062708119
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prompt tuning prepends a soft prompt to the input embeddings or hidden states
and only optimizes the prompt to adapt pretrained models (PTMs) to downstream
tasks. Previous work manually selects prompt layers, which is far from
optimal and fails to exploit the full potential of prompt tuning. In this work, we
propose a novel framework, \underline{S}elective \underline{P}rompt
\underline{T}uning (SPT), that learns to select the proper prompt layers by
inserting a prompt controlled by a learnable probabilistic gate at each
intermediate layer. We further propose a novel bi-level optimization framework,
SPT-DARTS, that can better optimize the learnable gates and improve the final
prompt tuning performances of the learned prompt layer settings. We conduct
extensive experiments with ten benchmark datasets under the full-data and
few-shot scenarios. The results demonstrate that our SPT framework can perform
better than the previous state-of-the-art PETuning baselines with comparable or
fewer tunable parameters.
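As a rough illustration of the gating idea (not the authors' code), a layer's hidden state can be blended with its prompted counterpart through a learnable scalar gate g = sigmoid(alpha): layers whose gate converges toward 1 keep their inserted prompt, while the rest effectively drop it, and a discretization step then picks the final prompt layers. A minimal sketch in plain Python, with all names hypothetical:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_prompt_mix(hidden, prompted, alpha):
    """Blend a layer's plain hidden state with its prompted version
    using a probabilistic gate g = sigmoid(alpha); alpha is the
    learnable gate parameter for this layer (hypothetical name)."""
    g = sigmoid(alpha)
    return [g * p + (1.0 - g) * h for h, p in zip(hidden, prompted)]

def select_prompt_layers(alphas, threshold=0.5):
    """After training, keep prompts only at layers whose gate
    probability exceeds the threshold (discretization step)."""
    return [i for i, a in enumerate(alphas) if sigmoid(a) > threshold]
```

With alpha = 0 the gate sits at 0.5 and the two states are averaged; strongly negative or positive alphas recover the plain or fully prompted state. The bi-level SPT-DARTS optimization of the gates and the prompts is omitted here.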
Related papers
- Revisiting the Power of Prompt for Visual Tuning [50.11465784194896]
This study explores the correlation evolvement between prompts and patch tokens during proficient training.
Inspired by the observation that the prompt tokens tend to share high mutual information with patch tokens, we propose initializing prompts with downstream token prototypes.
Our method significantly advances the adaptation for self-supervised pretraining, achieving impressive task performance gains of at least 10% to 30%.
arXiv Detail & Related papers (2024-02-04T07:49:02Z)
- Global Prompt Cell: A Portable Control Module for Effective Prompt Tuning [16.76984489127912]
We introduce the Global Prompt Cell (GPC), a portable control module for prompt tuning.
Our experimental results demonstrate a 5.8% improvement on SuperGLUE datasets compared to vanilla prompt tuning.
arXiv Detail & Related papers (2023-04-12T06:46:33Z)
- Late Prompt Tuning: A Late Prompt Could Be Better Than Many Prompts [97.20933523766182]
Prompt tuning is a parameter-efficient tuning (PETuning) method for utilizing pre-trained models (PTMs).
We present Late Prompt Tuning, which inserts a late prompt into an intermediate layer of the PTM instead of the input layer or all layers.
We show that it can achieve performance competitive with full model tuning and other PETuning methods under both full-data and few-shot scenarios.
arXiv Detail & Related papers (2022-10-20T14:23:52Z)
- STT: Soft Template Tuning for Few-Shot Adaptation [72.46535261444151]
We propose a new prompt-tuning framework, called Soft Template Tuning (STT).
STT combines manual and auto prompts, and treats downstream classification tasks as a masked language modeling task.
It can even outperform the time- and resource-consuming fine-tuning method on sentiment classification tasks.
arXiv Detail & Related papers (2022-07-18T07:07:22Z)
- Learning a Better Initialization for Soft Prompts via Meta-Learning [58.53984967461313]
We propose MetaPT (Meta-learned Prompt Tuning) to improve prompt tuning.
We introduce structure by first clustering pre-training data into different auxiliary tasks.
We use these tasks to pre-train prompts with a meta-learning algorithm.
arXiv Detail & Related papers (2022-05-25T03:50:23Z)
- Structured Prompt Tuning [83.71253868369999]
Instead of prepending a sequence of tunable embeddings to the input, we generate the soft prompt embeddings through a hypernetwork.
Our approach subsumes the standard prompt tuning, allows more flexibility in model design and can be applied to both single-task and multi-task training settings.
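The hypernetwork idea can be sketched in a few lines: instead of learning each prompt embedding directly, a small learned map produces every prompt token from a shared task representation. This is an illustrative linear version in plain Python; the function name, shapes, and the linear form are assumptions, not the paper's actual architecture:

```python
def hypernet_prompt(task_emb, token_weights):
    """Generate soft-prompt tokens from a task embedding via a linear
    hypernetwork: token_weights[i] is a d_model x d_task matrix for
    prompt token i, so each token is a learned projection of task_emb."""
    prompt = []
    for W in token_weights:
        token = [sum(w * t for w, t in zip(row, task_emb)) for row in W]
        prompt.append(token)
    return prompt
```

Because all tokens share the task embedding, updating that embedding moves the whole prompt coherently, which is one way such a factorization adds design flexibility over independently learned prompt tokens.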
arXiv Detail & Related papers (2022-05-24T18:36:34Z)
- PPT: Pre-trained Prompt Tuning for Few-shot Learning [47.05554619258627]
Prompts for pre-trained language models (PLMs) have shown remarkable performance by bridging the gap between pre-training tasks and various downstream tasks.
Among these methods, prompt tuning, which freezes PLMs and only tunes soft prompts, provides an efficient and effective solution for adapting large-scale PLMs to downstream tasks.
In our work, we find that prompt tuning performs comparably with conventional full-model fine-tuning when downstream data are sufficient, whereas it performs much worse under few-shot learning settings.
arXiv Detail & Related papers (2021-09-09T15:11:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.