SuperPos-Prompt: Enhancing Soft Prompt Tuning of Language Models with Superposition of Multi Token Embeddings
- URL: http://arxiv.org/abs/2406.05279v1
- Date: Fri, 7 Jun 2024 22:18:49 GMT
- Title: SuperPos-Prompt: Enhancing Soft Prompt Tuning of Language Models with Superposition of Multi Token Embeddings
- Authors: MohammadAli SadraeiJavaeri, Ehsaneddin Asgari, Alice Carolyn McHardy, Hamid Reza Rabiee
- Abstract summary: Soft prompt tuning techniques have gained traction as an effective strategy for the parameter-efficient tuning of pretrained language models.
We introduce SuperPos-Prompt, a new reparameterization technique employing the superposition of multiple pretrained vocabulary embeddings to improve the learning of soft prompts.
Our experiments consistently highlight SuperPos-Prompt's superiority over Residual Prompt tuning, exhibiting an average score increase of $+6.4$ in T5-Small and $+5.0$ in T5-Base.
Remarkably, SuperPos-Prompt occasionally outperforms even full fine-tuning methods.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Soft prompt tuning techniques have recently gained traction as an effective strategy for the parameter-efficient tuning of pretrained language models, particularly minimizing the required adjustment of model parameters. Despite their growing use, achieving optimal tuning with soft prompts, especially for smaller datasets, remains a substantial challenge. This study makes two contributions in this domain: (i) we introduce SuperPos-Prompt, a new reparameterization technique employing the superposition of multiple pretrained vocabulary embeddings to improve the learning of soft prompts. Our experiments across several GLUE and SuperGLUE benchmarks consistently highlight SuperPos-Prompt's superiority over Residual Prompt tuning, exhibiting an average score increase of $+6.4$ in T5-Small and $+5.0$ in T5-Base along with a faster convergence. Remarkably, SuperPos-Prompt occasionally outperforms even full fine-tuning methods. (ii) Additionally, we demonstrate enhanced performance and rapid convergence by omitting dropouts from the frozen network, yielding consistent improvements across various scenarios and tuning methods.
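To make the central idea concrete: the abstract describes reparameterizing each soft prompt token as a superposition, i.e. a weighted combination, of several frozen pretrained vocabulary embeddings, so that only the mixing weights are learned. The PyTorch sketch below is one plausible reading of that description; the class name, the random sampling of basis embeddings, the uniform weight initialization, and the T5 usage in the comments are illustrative assumptions rather than the authors' exact implementation.

```python
# Hedged sketch of a SuperPos-style soft prompt: each prompt token is a learned linear
# combination of k frozen vocabulary embeddings (sampling/initialization are assumptions).
import torch
import torch.nn as nn


class SuperPosPrompt(nn.Module):
    def __init__(self, vocab_embeddings: torch.Tensor, prompt_len: int = 10, k: int = 128):
        super().__init__()
        vocab_size, d_model = vocab_embeddings.shape
        # Sample k vocabulary embeddings per prompt token and keep them frozen.
        idx = torch.randint(0, vocab_size, (prompt_len, k))
        self.register_buffer("basis", vocab_embeddings[idx])             # (P, k, d)
        # Only the superposition weights are trainable.
        self.weights = nn.Parameter(torch.full((prompt_len, k), 1.0 / k))

    def forward(self, batch_size: int) -> torch.Tensor:
        # Superpose the frozen basis embeddings with the learned weights.
        prompt = torch.einsum("pk,pkd->pd", self.weights, self.basis)    # (P, d)
        return prompt.unsqueeze(0).expand(batch_size, -1, -1)            # (B, P, d)


# Usage sketch with a frozen T5 backbone; dropout_rate=0.0 mirrors the paper's second
# finding (omitting dropout from the frozen network) and is forwarded to T5Config:
#   model = T5ForConditionalGeneration.from_pretrained("t5-small", dropout_rate=0.0)
#   for p in model.parameters():
#       p.requires_grad_(False)
#   sp = SuperPosPrompt(model.get_input_embeddings().weight.detach().clone())
#   token_embeds = model.get_input_embeddings()(input_ids)
#   inputs_embeds = torch.cat([sp(token_embeds.size(0)), token_embeds], dim=1)
```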
Related papers
- Large Language Models Prompting With Episodic Memory [53.8690170372303]
We propose PrOmpting with Episodic Memory (POEM), a novel prompt optimization technique that is simple and efficient and that demonstrates strong generalization capabilities.
In the testing phase, we optimize the sequence of examples for each test query by selecting the sequence that yields the highest total rewards from the top-k most similar training examples in the episodic memory.
Our results show that POEM outperforms recent techniques like TEMPERA and RLPrompt by over 5.3% in various text classification tasks.
arXiv Detail & Related papers (2024-08-14T11:19:28Z) - MoPE: Parameter-Efficient and Scalable Multimodal Fusion via Mixture of Prompt Experts [29.46189153751869]
We introduce the mixture of prompt experts (MoPE) technique to enhance the expressiveness of prompt tuning.
Our method achieves state-of-the-art results for prompt fusion, matching or even surpassing the performance of fine-tuning.
arXiv Detail & Related papers (2024-03-14T17:47:10Z) - E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning [55.50908600818483]
Fine-tuning large-scale pretrained vision models for new tasks has become increasingly parameter-intensive.
We propose an Effective and Efficient Visual Prompt Tuning (E2VPT) approach for large-scale transformer-based model adaptation.
Our approach outperforms several state-of-the-art baselines on two benchmarks.
arXiv Detail & Related papers (2023-07-25T19:03:21Z) - Parameter-efficient Tuning of Large-scale Multimodal Foundation Model [68.24510810095802]
We propose Aurora, a graceful prompt framework for cross-modal transfer, to overcome these challenges.
Considering the redundancy in existing architectures, we first use mode approximation to generate 0.1M trainable parameters for multimodal prompt tuning.
A thorough evaluation on six cross-modal benchmarks shows that it not only outperforms the state-of-the-art but even outperforms the full fine-tuning approach.
arXiv Detail & Related papers (2023-05-15T06:40:56Z) - Residual Prompt Tuning: Improving Prompt Tuning with Residual Reparameterization [57.379285443780894]
Residual Prompt Tuning is a simple and efficient method that significantly improves the performance and stability of prompt tuning.
We show that our method achieves a +7-point improvement over prompt tuning with T5-Base and allows the prompt length to be reduced by 10x without hurting performance.
arXiv Detail & Related papers (2023-05-06T05:35:14Z) - Unified Vision and Language Prompt Learning [86.1530128487077]
We present a systematic study on two representative prompt tuning methods, namely text prompt tuning and visual prompt tuning.
A major finding is that text prompt tuning fails on data with high intra-class visual variances while visual prompt tuning cannot handle low inter-class variances.
To combine the best from both worlds, we propose a simple approach called Unified Prompt Tuning (UPT), which essentially learns a tiny neural network to jointly optimize prompts across different modalities.
arXiv Detail & Related papers (2022-10-13T17:50:24Z) - Simple and Effective Gradient-Based Tuning of Sequence-to-Sequence Models [8.370770440898454]
The huge cost of training large language models can make tuning them prohibitively expensive.
We apply gradient-based hyperparameter optimization to sequence-to-sequence tasks for the first time.
We show efficiency and performance gains over strong baselines for both Neural Machine Translation and Natural Language Understanding (NLU) tasks.
arXiv Detail & Related papers (2022-09-10T14:52:41Z) - Prompt Tuning for Generative Multimodal Pretrained Models [75.44457974275154]
We implement prompt tuning on a unified sequence-to-sequence pretrained model that adapts to both understanding and generation tasks.
Experimental results demonstrate that lightweight prompt tuning can achieve performance comparable to finetuning.
In comparison with finetuned models, the prompt-tuned models demonstrate improved robustness against adversarial attacks.
arXiv Detail & Related papers (2022-08-04T08:56:38Z) - The Power of Prompt Tuning for Low-Resource Semantic Parsing [10.37371743879877]
We investigate prompt tuning for semantic parsing.
For large T5 models, we find that prompt tuning significantly outperforms fine-tuning in the low-data regime.
This result is surprising, as it suggests that large T5 models can be modulated to generate sequences far from the pre-training distribution.
arXiv Detail & Related papers (2021-10-16T09:33:09Z) - The Power of Scale for Parameter-Efficient Prompt Tuning [4.481348281462904]
"prompt tuning" is a simple mechanism for learning "soft prompts" to condition frozen language models to perform specific downstream tasks.
Our end-to-end learned approach outperforms GPT-3's "few-shot" learning by a large margin.
arXiv Detail & Related papers (2021-04-18T03:19:26Z)
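The final entry above describes the baseline that the later methods, including SuperPos-Prompt, build on: a small matrix of prompt embeddings is trained and prepended to the input embeddings of an otherwise frozen model. The sketch below illustrates that mechanism under illustrative assumptions (model choice, prompt length, and initialization are not taken from any of the papers listed).

```python
# Minimal sketch of vanilla soft prompt tuning: train only a (prompt_len x d_model)
# matrix and prepend it to the frozen model's input embeddings.
import torch
import torch.nn as nn
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("t5-small")
for p in model.parameters():                      # freeze the backbone
    p.requires_grad_(False)

prompt_len, d_model = 20, model.config.d_model
soft_prompt = nn.Parameter(torch.randn(prompt_len, d_model) * 0.5)    # the only trained tensor


def forward_with_prompt(input_ids, attention_mask, labels):
    embeds = model.get_input_embeddings()(input_ids)                  # (B, T, d)
    batch = embeds.size(0)
    prompt = soft_prompt.unsqueeze(0).expand(batch, -1, -1)           # (B, P, d)
    inputs_embeds = torch.cat([prompt, embeds], dim=1)                # (B, P+T, d)
    prompt_mask = torch.ones(batch, prompt_len,
                             dtype=attention_mask.dtype, device=attention_mask.device)
    attention_mask = torch.cat([prompt_mask, attention_mask], dim=1)
    return model(inputs_embeds=inputs_embeds, attention_mask=attention_mask, labels=labels)


# Only `soft_prompt` is handed to the optimizer, e.g. torch.optim.AdamW([soft_prompt], lr=0.3).
```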