Soft Prompt Tuning for Cross-Lingual Transfer: When Less is More
- URL: http://arxiv.org/abs/2402.03782v1
- Date: Tue, 6 Feb 2024 07:52:30 GMT
- Title: Soft Prompt Tuning for Cross-Lingual Transfer: When Less is More
- Authors: Fred Philippy, Siwen Guo, Shohreh Haddadan, Cedric Lothritz, Jacques Klein, Tegawendé F. Bissyandé
- Abstract summary: Soft Prompt Tuning (SPT) is a parameter-efficient method for adapting pre-trained language models to specific tasks.
This paper investigates the potential of SPT for cross-lingual transfer.
- Score: 9.230338573494622
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Soft Prompt Tuning (SPT) is a parameter-efficient method for adapting
pre-trained language models (PLMs) to specific tasks by inserting learnable
embeddings, or soft prompts, at the input layer of the PLM, without modifying
its parameters. This paper investigates the potential of SPT for cross-lingual
transfer. Unlike previous studies on SPT for cross-lingual transfer that often
fine-tune both the soft prompt and the model parameters, we adhere to the
original intent of SPT by keeping the model parameters frozen and only training
the soft prompt. This not only reduces the computational cost and storage
overhead relative to full-model fine-tuning; we also demonstrate that the
parameter efficiency intrinsic to SPT can itself enhance cross-lingual transfer
performance to linguistically distant languages. Moreover, we explore how
different factors related to the prompt, such as the length or its
reparameterization, affect cross-lingual transfer performance.
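To make the mechanism concrete, below is a minimal sketch of input-layer soft prompt tuning in PyTorch with a Hugging Face backbone: a trainable prompt matrix is prepended to the token embeddings of a frozen multilingual PLM, and only the prompt is optimized. The backbone name, prompt length, learning rate, and classification setup are illustrative assumptions, not the paper's exact configuration.

```python
import torch
from torch import nn
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Illustrative choices only; the paper's exact backbone and hyperparameters may differ.
model = AutoModelForSequenceClassification.from_pretrained("xlm-roberta-base", num_labels=3)
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

# Freeze every pre-trained parameter: only the soft prompt will be trained.
for p in model.parameters():
    p.requires_grad = False

prompt_len = 16                      # assumed prompt length
hidden = model.config.hidden_size
soft_prompt = nn.Parameter(0.02 * torch.randn(prompt_len, hidden))

def forward_with_soft_prompt(input_ids, attention_mask, labels=None):
    # Look up frozen token embeddings and prepend the learnable prompt.
    tok_emb = model.get_input_embeddings()(input_ids)             # (B, T, H)
    bsz = tok_emb.size(0)
    prompt = soft_prompt.unsqueeze(0).expand(bsz, -1, -1)         # (B, P, H)
    inputs_embeds = torch.cat([prompt, tok_emb], dim=1)           # (B, P+T, H)
    prompt_mask = torch.ones(bsz, prompt_len, dtype=attention_mask.dtype)
    full_mask = torch.cat([prompt_mask, attention_mask], dim=1)
    return model(inputs_embeds=inputs_embeds, attention_mask=full_mask, labels=labels)

# Only the soft prompt receives gradient updates; how the task head is handled
# for cross-lingual classification is not specified in the abstract.
optimizer = torch.optim.AdamW([soft_prompt], lr=0.3)

batch = tokenizer(["a training sentence", "another one"], padding=True, return_tensors="pt")
out = forward_with_soft_prompt(batch["input_ids"], batch["attention_mask"],
                               labels=torch.tensor([0, 1]))
out.loss.backward()
optimizer.step()
```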
Related papers
- Effectively Prompting Small-sized Language Models for Cross-lingual Tasks via Winning Tickets [2.803947848713182]
Current soft prompt methods yield limited performance when applied to small-sized models.
Deep prompt tuning entails prepending learnable parameters at each layer for enhanced efficacy.
We introduce the Lottery Ticket Prompt-learning framework that integrates winning tickets with soft prompts.
arXiv Detail & Related papers (2024-04-01T17:03:16Z)
- DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning [14.975436239088312]
We propose DePT, which decomposes the soft prompt into a shorter soft prompt and a pair of low-rank matrices that are then optimised with two different learning rates (see the sketch after this list).
We demonstrate that DePT outperforms state-of-the-art PEFT approaches, including the full fine-tuning baseline, in some scenarios.
arXiv Detail & Related papers (2023-09-11T00:02:05Z)
- Approximated Prompt Tuning for Vision-Language Pre-trained Models [54.326232586461614]
In vision-language pre-trained models, prompt tuning often requires a large number of learnable tokens to bridge the gap between the pre-training and downstream tasks.
We propose a novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer learning.
arXiv Detail & Related papers (2023-06-27T05:43:47Z)
- Sensitivity-Aware Visual Parameter-Efficient Fine-Tuning [91.5113227694443]
We propose a novel Sensitivity-aware visual Parameter-efficient fine-Tuning (SPT) scheme.
SPT allocates trainable parameters to task-specific important positions.
Experiments on a wide range of downstream recognition tasks show that our SPT is complementary to the existing PEFT methods.
arXiv Detail & Related papers (2023-03-15T12:34:24Z)
- Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding [40.27182770995891]
Fine-tuning is widely used as the default algorithm for transfer learning from pre-trained models.
We introduce the Speech UndeRstanding Evaluation (SURE) benchmark for parameter-efficient learning for various speech-processing tasks.
arXiv Detail & Related papers (2023-03-02T08:57:33Z)
- How Does In-Context Learning Help Prompt Tuning? [55.78535874154915]
Fine-tuning large language models is becoming ever more impractical due to their rapidly-growing scale.
This motivates the use of parameter-efficient adaptation methods such as prompt tuning (PT), which adds a small number of tunable embeddings to an otherwise frozen model.
Recently, Singhal et al. (2022) proposed "instruction prompt tuning" (IPT), which combines PT with in-context learning (ICL) by concatenating a natural language demonstration with learned prompt embeddings.
arXiv Detail & Related papers (2023-02-22T17:45:12Z)
- FPT: Improving Prompt Tuning Efficiency via Progressive Training [84.25195519945215]
We propose Fast Prompt Tuning (FPT) to improve prompt tuning's training efficiency.
We show that FPT could save over 30% training computations while achieving comparable performance.
arXiv Detail & Related papers (2022-11-13T08:00:29Z)
- Parameter-Efficient Neural Reranking for Cross-Lingual and Multilingual Retrieval [66.69799641522133]
State-of-the-art neural (re)rankers are notoriously data hungry.
Current approaches typically transfer rankers trained on English data to other languages and cross-lingual setups by means of multilingual encoders.
We show that two parameter-efficient approaches to cross-lingual transfer, namely Sparse Fine-Tuning Masks (SFTMs) and Adapters, allow for a more lightweight and more effective zero-shot transfer.
arXiv Detail & Related papers (2022-04-05T15:44:27Z)
- On Transferability of Prompt Tuning for Natural Language Understanding [63.29235426932978]
We investigate the transferability of soft prompts across different tasks and models.
We find that trained soft prompts transfer well to similar tasks and can be used to initialize PT for them, accelerating training and improving performance.
Our findings show that improving PT with knowledge transfer is possible and promising, while prompts' cross-task transferability is generally better than their cross-model transferability.
arXiv Detail & Related papers (2021-11-12T13:39:28Z)
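As a companion to the DePT entry above, here is a rough sketch of how such a decomposition might look in PyTorch: a shorter trainable prompt plus a low-rank pair of matrices, placed in two optimizer parameter groups with different learning rates. How the low-rank pair is applied (here, as an additive update to the frozen token embeddings) and all names, dimensions, and learning rates are assumptions for illustration, not DePT's exact formulation.

```python
import torch
from torch import nn

# Illustrative dimensions (assumed, not taken from the paper).
hidden = 768          # embedding size of the frozen PLM
short_len = 8         # shorter soft prompt (vs. a longer vanilla prompt)
max_seq_len = 128     # maximum input length covered by the low-rank update
rank = 4              # rank of the decomposition

# A shorter soft prompt plus a pair of low-rank matrices.
short_prompt = nn.Parameter(0.02 * torch.randn(short_len, hidden))
lora_a = nn.Parameter(torch.zeros(max_seq_len, rank))
lora_b = nn.Parameter(0.02 * torch.randn(rank, hidden))

def adapt_inputs(tok_emb: torch.Tensor) -> torch.Tensor:
    """Prepend the short prompt and add a low-rank update to frozen token embeddings.

    tok_emb: (batch, seq_len, hidden) output of the frozen PLM's embedding layer.
    """
    bsz, seq_len, _ = tok_emb.shape
    delta = (lora_a[:seq_len] @ lora_b).unsqueeze(0)        # (1, seq_len, hidden)
    updated = tok_emb + delta                               # low-rank additive update
    prompt = short_prompt.unsqueeze(0).expand(bsz, -1, -1)  # (batch, short_len, hidden)
    return torch.cat([prompt, updated], dim=1)

# Two parameter groups with different learning rates, as the summary describes.
optimizer = torch.optim.AdamW([
    {"params": [short_prompt], "lr": 3e-1},
    {"params": [lora_a, lora_b], "lr": 5e-4},
])

dummy = torch.randn(2, 16, hidden)   # stand-in for frozen embedding output
out = adapt_inputs(dummy)            # (2, short_len + 16, hidden)
```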
This list is automatically generated from the titles and abstracts of the papers on this site.