Related papers: FPT: Improving Prompt Tuning Efficiency via Progressive Training

URL: http://arxiv.org/abs/2211.06840v1
Date: Sun, 13 Nov 2022 08:00:29 GMT
Title: FPT: Improving Prompt Tuning Efficiency via Progressive Training
Authors: Yufei Huang, Yujia Qin, Huadong Wang, Yichun Yin, Maosong Sun, Zhiyuan Liu and Qun Liu
Abstract summary: We propose Fast Prompt Tuning to improve prompt tuning's training efficiency. We show that FPT could save over 30% training computations while achieving comparable performance.
Score: 84.25195519945215
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recently, prompt tuning (PT) has gained increasing attention as a parameter-efficient way of tuning pre-trained language models (PLMs). Despite greatly reducing the number of tunable parameters and achieving satisfying performance, PT is training-inefficient due to its slow convergence. To improve PT's training efficiency, we first make some novel observations about the prompt transferability of "partial PLMs", which are defined by compressing a PLM in depth or width. We observe that the soft prompts learned by different partial PLMs of various sizes are similar in the parameter space, implying that these soft prompts could potentially be transferred among partial PLMs. Inspired by these observations, we propose Fast Prompt Tuning (FPT), which starts by conducting PT using a small-scale partial PLM, and then progressively expands its depth and width until it reaches the full model size. After each expansion, we recycle the previously learned soft prompts as initialization for the enlarged partial PLM and then continue PT. We demonstrate the feasibility of FPT on 5 tasks and show that FPT could save over 30% training computations while achieving comparable performance.
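
The progressive schedule described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the frozen "model" below is a small stack of random residual layers, the partial model simply keeps the first `depth` layers and the first `width` inner units (an assumption made for illustration), gradients are estimated by finite differences, and all names (`make_frozen_model`, `tune_prompt`, `fast_prompt_tuning`) are hypothetical. The only point it tries to show is that the soft prompt keeps the same shape across stages, so it can be recycled as the initialization after each expansion.

```python
import math
import random

def make_frozen_model(num_layers, hidden, inner, seed=0):
    """Build a toy frozen model: residual layers plus a readout vector."""
    rng = random.Random(seed)
    layers = []
    for _ in range(num_layers):
        w_in = [[rng.uniform(-0.3, 0.3) for _ in range(hidden)] for _ in range(inner)]
        w_out = [[rng.uniform(-0.3, 0.3) for _ in range(inner)] for _ in range(hidden)]
        layers.append((w_in, w_out))
    readout = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
    return layers, readout

def forward(prompt, model, depth, width):
    """Run the partial model: first `depth` layers, first `width` inner units.

    The hidden size is untouched, so the prompt has the same shape at every stage.
    """
    layers, readout = model
    h = list(prompt)
    for w_in, w_out in layers[:depth]:
        acts = [math.tanh(sum(w * x for w, x in zip(row, h))) for row in w_in[:width]]
        h = [x + sum(w_out[i][k] * a for k, a in enumerate(acts)) for i, x in enumerate(h)]
    return sum(r * x for r, x in zip(readout, h))

def loss(prompt, model, depth, width, target):
    """Squared error between the partial model's output and a scalar target."""
    return (forward(prompt, model, depth, width) - target) ** 2

def tune_prompt(prompt, model, depth, width, target, steps, lr=0.01, eps=1e-5):
    """Gradient descent on the prompt only; the model weights stay frozen.

    Gradients use central finite differences to keep the sketch dependency-free.
    """
    prompt = list(prompt)
    for _ in range(steps):
        grad = []
        for i in range(len(prompt)):
            up, down = prompt[:], prompt[:]
            up[i] += eps
            down[i] -= eps
            diff = loss(up, model, depth, width, target) - loss(down, model, depth, width, target)
            grad.append(diff / (2 * eps))
        prompt = [p - lr * g for p, g in zip(prompt, grad)]
    return prompt

def fast_prompt_tuning(model, schedule, target, hidden):
    """Tune the prompt on progressively larger partial models.

    `schedule` is a list of (depth, width, steps); the last entry should be the
    full model. The prompt learned at one stage initializes the next stage.
    """
    prompt = [0.1] * hidden  # arbitrary starting point (assumption)
    history = []
    for depth, width, steps in schedule:
        before = loss(prompt, model, depth, width, target)
        prompt = tune_prompt(prompt, model, depth, width, target, steps)
        after = loss(prompt, model, depth, width, target)
        history.append((depth, width, before, after))
    return prompt, history

if __name__ == "__main__":
    toy_model = make_frozen_model(num_layers=4, hidden=4, inner=6)
    stages = [(1, 2, 30), (2, 4, 30), (4, 6, 30)]  # small -> full size
    _, log = fast_prompt_tuning(toy_model, stages, target=1.0, hidden=4)
    for depth, width, before, after in log:
        print(f"depth={depth} width={width} loss {before:.4f} -> {after:.4f}")
```

In this sketch the early stages are cheaper because each forward pass touches fewer layers and units, which is the intuition behind the computation savings the abstract reports; the toy setup makes no claim about the size of those savings.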

Related papers

Skip Tuning: Pre-trained Vision-Language Models are Effective and Efficient Adapters Themselves [123.07450481623124]
We propose Skip Tuning as a novel paradigm for adapting vision-language models to downstream tasks. Unlike existing PT or adapter-based methods, Skip Tuning applies Layer-wise Skipping (LSkip) and Class-wise Skipping (CSkip) upon the FT baseline without introducing extra context vectors or adapter modules.
arXiv Detail & Related papers (2024-12-16T07:33:23Z)
Preserving Pre-trained Representation Space: On Effectiveness of Prefix-tuning for Large Multi-modal Models [24.62337386603331]
Large Multi-modal Models (LMMs) are revolutionizing the way machines interact with the world. To adapt LMMs for downstream tasks, parameter-efficient fine-tuning (PEFT) has gained popularity. This paper focuses on the strengths and weaknesses of each tuning strategy, shifting the focus from the efficiency typically associated with these approaches.
arXiv Detail & Related papers (2024-10-29T07:55:50Z)
APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference [63.52244442498831]
Fine-tuning and inference with large Language Models (LMs) are generally known to be expensive. We introduce APT, which adaptively prunes and tunes the parameters of LMs. We show that APT speeds up LM fine-tuning by up to 8x and reduces the training memory footprint of large LMs by up to 70%.
arXiv Detail & Related papers (2024-01-22T18:39:40Z)
DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning [14.975436239088312]
We propose DePT, which decomposes the soft prompt into a shorter soft prompt and a pair of low-rank matrices that are then optimised with two different learning rates. We demonstrate that DePT outperforms state-of-the-art PEFT approaches, including the full fine-tuning baseline, in some scenarios.
arXiv Detail & Related papers (2023-09-11T00:02:05Z)
Approximated Prompt Tuning for Vision-Language Pre-trained Models [54.326232586461614]
In vision-language pre-trained models, prompt tuning often requires a large number of learnable tokens to bridge the gap between the pre-training and downstream tasks. We propose a novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer learning.
arXiv Detail & Related papers (2023-06-27T05:43:47Z)
Late Prompt Tuning: A Late Prompt Could Be Better Than Many Prompts [97.20933523766182]
Prompt tuning is a parameter-efficient tuning (PETuning) method for utilizing pre-trained models (PTMs). We present Late Prompt Tuning (LPT), which inserts a late prompt into an intermediate layer of the PTM instead of the input layer or all layers. We show that LPT can achieve competitive performance to full model tuning and other PETuning methods under both full-data and few-shot scenarios.
arXiv Detail & Related papers (2022-10-20T14:23:52Z)
On Transferability of Prompt Tuning for Natural Language Understanding [63.29235426932978]
We investigate the transferability of soft prompts across different tasks and models. We find that trained soft prompts transfer well to similar tasks and can initialize PT for them to accelerate training and improve performance. Our findings show that improving PT with knowledge transfer is possible and promising, while prompts' cross-task transferability is generally better than their cross-model transferability.
arXiv Detail & Related papers (2021-11-12T13:39:28Z)
PPT: Pre-trained Prompt Tuning for Few-shot Learning [47.05554619258627]
Prompts for pre-trained language models (PLMs) have shown remarkable performance by bridging the gap between pre-training tasks and various downstream tasks. Among these methods, prompt tuning, which freezes PLMs and only tunes soft prompts, provides an efficient and effective solution for adapting large-scale PLMs to downstream tasks. In our work, we find that prompt tuning performs comparably with conventional full-model fine-tuning when downstream data are sufficient, whereas it performs much worse under few-shot learning settings.
arXiv Detail & Related papers (2021-09-09T15:11:04Z)

This list is automatically generated from the titles and abstracts of the papers on this site.