Residual Prompt Tuning: Improving Prompt Tuning with Residual
Reparameterization
- URL: http://arxiv.org/abs/2305.03937v1
- Date: Sat, 6 May 2023 05:35:14 GMT
- Title: Residual Prompt Tuning: Improving Prompt Tuning with Residual
Reparameterization
- Authors: Anastasia Razdaibiedina, Yuning Mao, Rui Hou, Madian Khabsa, Mike
Lewis, Jimmy Ba, Amjad Almahairi
- Abstract summary: Residual Prompt Tuning is a simple and efficient method that significantly improves the performance and stability of prompt tuning.
We show that our method achieves a +7 point improvement over prompt tuning with T5-Base and allows the prompt length to be reduced by 10x without hurting performance.
- Score: 57.379285443780894
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prompt tuning is one of the successful approaches for parameter-efficient
tuning of pre-trained language models. Despite being arguably the most
parameter-efficient (tuned soft prompts constitute <0.1% of total parameters),
it typically performs worse than other efficient tuning methods and is quite
sensitive to hyper-parameters. In this work, we introduce Residual Prompt
Tuning - a simple and efficient method that significantly improves the
performance and stability of prompt tuning. We propose to reparameterize soft
prompt embeddings using a shallow network with a residual connection. Our
experiments show that Residual Prompt Tuning significantly outperforms prompt
tuning on the SuperGLUE benchmark. Notably, our method achieves a +7 point
improvement over prompt tuning with T5-Base and allows the prompt length to be
reduced by 10x without hurting performance. In addition, we show that our
approach is robust to the choice of learning rate and prompt initialization,
and is effective in few-shot settings.
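To make the reparameterization concrete, below is a minimal sketch (assuming a PyTorch-style setup) of a soft prompt passed through a shallow bottleneck MLP with a residual connection; the bottleneck size, ReLU activation, and LayerNorm placement are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ResidualReparamPrompt(nn.Module):
    """Soft prompt reparameterized by a shallow MLP with a skip connection."""

    def __init__(self, prompt_len: int, embed_dim: int, bottleneck_dim: int = 128):
        super().__init__()
        # The only task-specific parameters: the soft prompt and a small MLP.
        self.prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, bottleneck_dim),
            nn.ReLU(),
            nn.Linear(bottleneck_dim, embed_dim),
        )
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self) -> torch.Tensor:
        # Residual reparameterization: transformed prompt plus the original prompt.
        return self.norm(self.mlp(self.prompt)) + self.prompt

# The reparameterized prompt is prepended to the frozen model's input embeddings, e.g.
# inputs_embeds = torch.cat([prompt_module().unsqueeze(0).expand(B, -1, -1), token_embeds], dim=1)
```

Because the network only transforms the prompt, its output can be cached once training is done, so inference reduces to ordinary prompt tuning.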
Related papers
- LoPT: Low-Rank Prompt Tuning for Parameter Efficient Language Models [2.380819994407948]
Prompt tuning is significantly more parameter-efficient than model fine-tuning.
We propose Low-rank Prompt Tuning (LoPT), a low-rank model for prompts that achieves efficient prompt optimization.
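As a hedged illustration of the low-rank idea, the sketch below factorizes the prompt matrix into two small matrices; the rank and initialization are assumptions, not details taken from the LoPT paper.

```python
import torch
import torch.nn as nn

class LowRankPrompt(nn.Module):
    """Soft prompt of shape (prompt_len, embed_dim) stored as a rank-r factorization."""

    def __init__(self, prompt_len: int, embed_dim: int, rank: int = 8):
        super().__init__()
        self.u = nn.Parameter(torch.randn(prompt_len, rank) * 0.02)  # (L, r)
        self.v = nn.Parameter(torch.randn(rank, embed_dim) * 0.02)   # (r, d)

    def forward(self) -> torch.Tensor:
        # The full prompt is reconstructed on the fly; only L*r + r*d parameters
        # are trained instead of L*d.
        return self.u @ self.v
```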
arXiv Detail & Related papers (2024-06-27T19:02:41Z)
- SuperPos-Prompt: Enhancing Soft Prompt Tuning of Language Models with Superposition of Multi Token Embeddings [0.7349727826230863]
Soft prompt tuning techniques have gained traction as an effective strategy for the parameter-efficient tuning of pretrained language models.
We introduce SuperPos-Prompt, a new reparameterization technique employing the superposition of multiple pretrained vocabulary embeddings to improve the learning of soft prompts.
Our experiments consistently highlight SuperPos-Prompt's superiority over Residual Prompt Tuning, exhibiting an average score increase of $+6.4$ in T5-Small and $+5.0$ in T5-Base.
Remarkably, SuperPos-Prompt occasionally outperforms even full fine-tuning methods.
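A rough sketch of the superposition idea follows: each soft prompt token is learned as a weighted combination of frozen pretrained vocabulary embeddings. The random choice of basis rows and the plain linear mixing are assumptions for illustration.

```python
import torch
import torch.nn as nn

class SuperpositionPrompt(nn.Module):
    """Each prompt token = learned mixture of frozen pretrained token embeddings."""

    def __init__(self, vocab_embeds: torch.Tensor, prompt_len: int, num_basis: int = 512):
        super().__init__()
        # Frozen basis: a subset of the pretrained vocabulary embedding rows (assumed random here).
        idx = torch.randperm(vocab_embeds.size(0))[:num_basis]
        self.register_buffer("basis", vocab_embeds[idx].detach().clone())  # (K, d), frozen
        # Trainable mixing weights: one weight vector per prompt token.
        self.weights = nn.Parameter(torch.zeros(prompt_len, num_basis))

    def forward(self) -> torch.Tensor:
        # (L, K) @ (K, d) -> (L, d): each prompt token is a superposition of basis embeddings.
        return self.weights @ self.basis
```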
arXiv Detail & Related papers (2024-06-07T22:18:49Z)
- Attention Prompt Tuning: Parameter-efficient Adaptation of Pre-trained Models for Spatiotemporal Modeling [32.603558214472265]
We introduce Attention Prompt Tuning (APT) for video-based applications such as action recognition.
APT involves injecting a set of learnable prompts along with data tokens during fine-tuning while keeping the backbone frozen.
The proposed approach greatly reduces the number of FLOPs and latency while achieving a significant performance boost.
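The general mechanic of injecting learnable prompts alongside data tokens while freezing the backbone can be sketched as follows; the backbone interface and tensor shapes are assumptions.

```python
import torch
import torch.nn as nn

class PromptedFrozenBackbone(nn.Module):
    """Prepend learnable prompt tokens to the input tokens of a frozen backbone."""

    def __init__(self, backbone: nn.Module, num_prompts: int, embed_dim: int):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False                      # backbone stays frozen
        self.prompts = nn.Parameter(torch.randn(num_prompts, embed_dim) * 0.02)

    def forward(self, token_embeds: torch.Tensor) -> torch.Tensor:
        # token_embeds: (batch, seq_len, embed_dim) data token embeddings
        b = token_embeds.size(0)
        prompts = self.prompts.unsqueeze(0).expand(b, -1, -1)
        return self.backbone(torch.cat([prompts, token_embeds], dim=1))
```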
arXiv Detail & Related papers (2024-03-11T17:59:41Z)
- E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning [55.50908600818483]
Fine-tuning large-scale pretrained vision models for new tasks has become increasingly parameter-intensive.
We propose an Effective and Efficient Visual Prompt Tuning (E2VPT) approach for large-scale transformer-based model adaptation.
Our approach outperforms several state-of-the-art baselines on two benchmarks.
arXiv Detail & Related papers (2023-07-25T19:03:21Z)
- Parameter-Efficient Fine-Tuning without Introducing New Latency [7.631596468553607]
We introduce a novel adapter technique that directly applies the adapter to pre-trained parameters instead of the hidden representation.
Our proposed method attains a new state-of-the-art outcome in terms of both performance and storage efficiency, storing only 0.03% of the parameters of full fine-tuning.
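To illustrate why adapting the pre-trained parameters directly (rather than the hidden representation) adds no inference latency, here is a generic weight-space adapter sketch whose low-rank delta can be merged into the frozen weights after training; the parameterization is an assumption and not necessarily the one used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightSpaceAdapter(nn.Module):
    """Adapts a frozen linear layer by learning a small delta on its weights."""

    def __init__(self, frozen_linear: nn.Linear, rank: int = 4):
        super().__init__()
        self.base = frozen_linear
        for p in self.base.parameters():
            p.requires_grad = False
        out_f, in_f = self.base.weight.shape
        # Low-rank delta on the weight matrix itself (not on the hidden representation).
        self.a = nn.Parameter(torch.zeros(out_f, rank))
        self.b = nn.Parameter(torch.randn(rank, in_f) * 0.01)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.base.weight + self.a @ self.b           # adapted weights
        return F.linear(x, w, self.base.bias)

    @torch.no_grad()
    def merge(self) -> None:
        # Fold the delta into the frozen weights: no extra modules at inference time.
        self.base.weight += self.a @ self.b
```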
arXiv Detail & Related papers (2023-05-26T08:44:42Z)
- PTP: Boosting Stability and Performance of Prompt Tuning with Perturbation-Based Regularizer [94.23904400441957]
We introduce perturbation-based regularizers, which can smooth the loss landscape, into prompt tuning.
We design two kinds of perturbation-based regularizers: random-noise-based and adversarial-based.
Our new algorithms improve the state-of-the-art prompt tuning methods by 1.94% and 2.34% on SuperGLUE and FewGLUE benchmarks, respectively.
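A speculative sketch of the random-noise variant: perturb the soft prompt and penalize the resulting change in loss to smooth the landscape. The noise scale, the point of perturbation, and the penalty form are all assumptions.

```python
import torch

def perturbed_prompt_loss(task_loss_fn, prompt, batch, noise_std=1e-3):
    """Random-noise perturbation regularizer on the soft prompt.

    `task_loss_fn(prompt, batch)` is assumed to return a scalar task loss.
    """
    clean_loss = task_loss_fn(prompt, batch)
    noisy_prompt = prompt + noise_std * torch.randn_like(prompt)
    noisy_loss = task_loss_fn(noisy_prompt, batch)
    # Penalize sensitivity of the loss to small prompt perturbations,
    # encouraging a smoother loss landscape around the current prompt.
    return clean_loss + (noisy_loss - clean_loss).abs()
```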
arXiv Detail & Related papers (2023-05-03T20:30:51Z)
- Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning [91.5113227694443]
We propose a novel Sensitivity-aware visual Parameter-efficient fine-Tuning (SPT) scheme.
SPT allocates trainable parameters to task-specific important positions.
Experiments on a wide range of downstream recognition tasks show that our SPT is complementary to the existing PEFT methods.
arXiv Detail & Related papers (2023-03-15T12:34:24Z)
- Late Prompt Tuning: A Late Prompt Could Be Better Than Many Prompts [97.20933523766182]
Prompt tuning is a parameter-efficient tuning (PETuning) method for utilizing pre-trained models (PTMs).
We present Late Prompt Tuning, which inserts a late prompt into an intermediate layer of the PTM instead of the input layer or all layers.
We show that it can achieve performance competitive with full model tuning and other PETuning methods under both full-data and few-shot scenarios.
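A rough sketch of inserting a prompt at an intermediate layer rather than at the input; the layer interface and the way the prompt is concatenated to the hidden states are assumptions.

```python
import torch
import torch.nn as nn

class LatePromptModel(nn.Module):
    """Prepend a learnable prompt to the hidden states at an intermediate layer."""

    def __init__(self, layers: nn.ModuleList, insert_at: int, prompt_len: int, dim: int):
        super().__init__()
        self.layers = layers                 # frozen transformer blocks
        for p in self.layers.parameters():
            p.requires_grad = False
        self.insert_at = insert_at           # e.g. halfway through the depth
        self.prompt = nn.Parameter(torch.randn(prompt_len, dim) * 0.02)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, dim); each layer is assumed to preserve this shape.
        for i, layer in enumerate(self.layers):
            if i == self.insert_at:
                prompt = self.prompt.unsqueeze(0).expand(hidden.size(0), -1, -1)
                hidden = torch.cat([prompt, hidden], dim=1)
            hidden = layer(hidden)
        return hidden
```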
arXiv Detail & Related papers (2022-10-20T14:23:52Z)
- Structured Prompt Tuning [83.71253868369999]
Instead of prepending a sequence of tunable embeddings to the input, we generate the soft prompt embeddings through a hypernetwork.
Our approach subsumes standard prompt tuning, allows more flexibility in model design, and can be applied to both single-task and multi-task training settings.
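A minimal sketch of generating the soft prompt with a hypernetwork from a small task embedding; the task-embedding size and the two-layer hypernetwork are assumptions.

```python
import torch
import torch.nn as nn

class HyperPrompt(nn.Module):
    """Generate a (prompt_len, embed_dim) soft prompt from a small task embedding."""

    def __init__(self, prompt_len: int, embed_dim: int, task_dim: int = 64, num_tasks: int = 1):
        super().__init__()
        self.prompt_len = prompt_len
        self.embed_dim = embed_dim
        self.task_embed = nn.Embedding(num_tasks, task_dim)
        self.hypernet = nn.Sequential(
            nn.Linear(task_dim, 256),
            nn.ReLU(),
            nn.Linear(256, prompt_len * embed_dim),
        )

    def forward(self, task_id: torch.Tensor) -> torch.Tensor:
        z = self.task_embed(task_id)                       # (batch, task_dim)
        return self.hypernet(z).view(-1, self.prompt_len, self.embed_dim)
```

In a multi-task setting, the hypernetwork can be shared across tasks while each task keeps its own small task embedding.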
arXiv Detail & Related papers (2022-05-24T18:36:34Z)