FPT: Improving Prompt Tuning Efficiency via Progressive Training
- URL: http://arxiv.org/abs/2211.06840v1
- Date: Sun, 13 Nov 2022 08:00:29 GMT
- Title: FPT: Improving Prompt Tuning Efficiency via Progressive Training
- Authors: Yufei Huang, Yujia Qin, Huadong Wang, Yichun Yin, Maosong Sun, Zhiyuan
Liu and Qun Liu
- Abstract summary: We propose Fast Prompt Tuning to improve prompt tuning's training efficiency.
We show that FPT could save over 30% training computations while achieving comparable performance.
- Score: 84.25195519945215
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, prompt tuning (PT) has gained increasing attention as a
parameter-efficient way of tuning pre-trained language models (PLMs). Despite
extensively reducing the number of tunable parameters and achieving satisfying
performance, PT is training-inefficient due to its slow convergence. To improve
PT's training efficiency, we first make some novel observations about the
prompt transferability of "partial PLMs", which are defined by compressing a
PLM in depth or width. We observe that the soft prompts learned by different
partial PLMs of various sizes are similar in the parameter space, implying that
these soft prompts could potentially be transferred among partial PLMs.
Inspired by these observations, we propose Fast Prompt Tuning (FPT), which
starts by conducting PT using a small-scale partial PLM, and then progressively
expands its depth and width until the full-model size. After each expansion, we
recycle the previously learned soft prompts as initialization for the enlarged
partial PLM and then continue PT. We demonstrate the feasibility of FPT on 5
tasks and show that FPT could save over 30% training computations while
achieving comparable performance.
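The progressive schedule described in the abstract can be sketched as a loop over stages of growing size, with the soft prompt carried from one stage to the next. This is a minimal illustration and not the authors' implementation: `Stage`, `relative_cost`, `progressive_prompt_tuning`, and `toy_tune` are hypothetical names, the cost model (per-step cost proportional to depth fraction times width fraction) is an assumed simplification, and `toy_tune` replaces real prompt tuning on a partial PLM with gradient descent on a made-up quadratic loss.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Stage:
    depth_frac: float  # fraction of layers kept in the partial PLM (hypothetical knob)
    width_frac: float  # fraction of hidden width kept (hypothetical knob)
    steps: int         # prompt-tuning steps spent at this stage


def relative_cost(stages: Sequence[Stage]) -> float:
    """Cost of the schedule relative to running every step on the full model.

    Assumes per-step cost scales with depth_frac * width_frac. This is a
    simplification for illustration, not the paper's accounting.
    """
    total_steps = sum(s.steps for s in stages)
    used = sum(s.steps * s.depth_frac * s.width_frac for s in stages)
    return used / total_steps


def progressive_prompt_tuning(
    stages: Sequence[Stage],
    init_prompt: Sequence[float],
    tune: Callable[[List[float], Stage], List[float]],
) -> List[float]:
    """Tune a soft prompt stage by stage, ending on the full-size model."""
    if stages[-1].depth_frac != 1.0 or stages[-1].width_frac != 1.0:
        raise ValueError("the final stage must use the full model")
    prompt = list(init_prompt)
    for stage in stages:
        # Recycle: the prompt learned so far initializes the next, larger stage.
        prompt = tune(prompt, stage)
    return prompt


def toy_tune(prompt: List[float], stage: Stage) -> List[float]:
    """Stand-in for PT on a partial PLM: gradient descent on a quadratic loss."""
    target = [1.0, -2.0, 0.5]  # made-up optimum, shared by all stages
    lr = 0.05
    p = list(prompt)
    for _ in range(stage.steps):
        # Gradient of sum((x - t)^2) is 2 * (x - t).
        p = [x - lr * 2.0 * (x - t) for x, t in zip(p, target)]
    return p


# Hypothetical three-stage schedule: small, medium, then the full model.
schedule = [Stage(0.25, 0.5, 100), Stage(0.5, 0.75, 100), Stage(1.0, 1.0, 100)]
final_prompt = progressive_prompt_tuning(schedule, [0.0, 0.0, 0.0], toy_tune)
saving = 1.0 - relative_cost(schedule)
print(final_prompt, saving)
```

The toy loss shares one optimum across all stages, which mirrors the observation that soft prompts learned by different partial PLMs lie close together in parameter space. The saving printed here follows from the assumed cost model and the made-up schedule, so it should not be compared with the figure reported in the abstract.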
Related papers
- Preserving Pre-trained Representation Space: On Effectiveness of Prefix-tuning for Large Multi-modal Models [24.62337386603331]
Large Multi-modal Models (LMMs) are revolutionizing the way machines interact with the world.
To adapt LMMs for downstream tasks, parameter-efficient fine-tuning (PEFT) has gained popularity.
This paper focuses on the strengths and weaknesses of each tuning strategy, shifting the focus from the efficiency typically associated with these approaches.
arXiv Detail & Related papers (2024-10-29T07:55:50Z)
- APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference [63.52244442498831]
Fine-tuning and inference with large Language Models (LMs) are generally known to be expensive.
We introduce APT that adaptively prunes and tunes parameters for the LMs.
We show that APT speeds up LM fine-tuning by up to 8x and reduces the training memory footprint of large LMs by up to 70%.
arXiv Detail & Related papers (2024-01-22T18:39:40Z)
- DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning [14.975436239088312]
We propose DePT, which decomposes the soft prompt into a shorter soft prompt and a pair of low-rank matrices that are then optimised with two different learning rates.
We demonstrate that DePT outperforms state-of-the-art PEFT approaches, including the full fine-tuning baseline, in some scenarios.
arXiv Detail & Related papers (2023-09-11T00:02:05Z)
- Approximated Prompt Tuning for Vision-Language Pre-trained Models [54.326232586461614]
In vision-language pre-trained models, prompt tuning often requires a large number of learnable tokens to bridge the gap between the pre-training and downstream tasks.
We propose a novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer learning.
arXiv Detail & Related papers (2023-06-27T05:43:47Z)
- Late Prompt Tuning: A Late Prompt Could Be Better Than Many Prompts [97.20933523766182]
Prompt tuning is a parameter-efficient tuning (PETuning) method for utilizing pre-trained models (PTMs).
We present Late Prompt Tuning (LPT), which inserts a late prompt into an intermediate layer of the PTM instead of the input layer or all layers.
We show that LPT can achieve performance competitive with full model tuning and other PETuning methods under both full-data and few-shot scenarios.
arXiv Detail & Related papers (2022-10-20T14:23:52Z)
- On Transferability of Prompt Tuning for Natural Language Understanding [63.29235426932978]
We investigate the transferability of soft prompts across different tasks and models.
We find that trained soft prompts transfer well to similar tasks and can initialize PT for them to accelerate training and improve performance.
Our findings show that improving PT with knowledge transfer is possible and promising, while prompts' cross-task transferability is generally better than the cross-model transferability.
arXiv Detail & Related papers (2021-11-12T13:39:28Z)
- PPT: Pre-trained Prompt Tuning for Few-shot Learning [47.05554619258627]
Prompts for pre-trained language models (PLMs) have shown remarkable performance by bridging the gap between pre-training tasks and various downstream tasks.
Among these methods, prompt tuning, which freezes PLMs and only tunes soft prompts, provides an efficient and effective solution for adapting large-scale PLMs to downstream tasks.
In our work, we find that prompt tuning performs comparably with conventional full-model fine-tuning when downstream data are sufficient, whereas it performs much worse under few-shot learning settings.
arXiv Detail & Related papers (2021-09-09T15:11:04Z)