Effectively Prompting Small-sized Language Models for Cross-lingual Tasks via Winning Tickets
- URL: http://arxiv.org/abs/2404.01242v1
- Date: Mon, 1 Apr 2024 17:03:16 GMT
- Title: Effectively Prompting Small-sized Language Models for Cross-lingual Tasks via Winning Tickets
- Authors: Mingqi Li, Feng Luo
- Abstract summary: Current soft prompt methods yield limited performance when applied to small-sized models.
Deep prompt-tuning entails prepending parameters in each layer for enhanced efficacy.
We introduce the Lottery Ticket Prompt-learning framework that integrates winning tickets with soft prompts.
- Score: 2.803947848713182
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current soft prompt methods yield limited performance when applied to small-sized models (fewer than a billion parameters). Deep prompt-tuning, which entails prepending parameters in each layer for enhanced efficacy, presents a solution for prompting small-sized models, albeit requiring carefully designed implementation. In this paper, we introduce the Lottery Ticket Prompt-learning (LTP) framework that integrates winning tickets with soft prompts. The LTP offers a simpler implementation and requires only a one-time execution. We demonstrate LTP on cross-lingual tasks, where prior works rely on external tools like human-designed multilingual templates and bilingual dictionaries, which may not be feasible in a low-resource regime. Specifically, we select a subset of parameters that have been changed the most during the fine-tuning with the Masked Language Modeling objective. Then, we prepend soft prompts to the original pre-trained language model and only update the selected parameters together with prompt-related parameters when adapting to the downstream tasks. We verify the effectiveness of our LTP framework on cross-lingual tasks, specifically targeting low-resource languages. Our approach outperforms the baselines by only updating 20% of the original parameters.
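To make the described procedure concrete, below is a minimal, hypothetical PyTorch sketch of the three steps in the abstract: select the parameters that changed most during MLM fine-tuning (the "winning ticket"), prepend trainable soft prompts to the original pre-trained model, and mask gradients so only the selected subset plus the prompt parameters are updated downstream. The helper names (select_winning_ticket, SoftPromptModel, mask_non_ticket_gradients), the prompt length, and the HuggingFace-style inputs_embeds interface are illustrative assumptions, not the paper's released implementation.

```python
import torch
import torch.nn as nn


def select_winning_ticket(pretrained, mlm_finetuned, keep_ratio=0.2):
    """Per parameter tensor, keep the ~20% of entries that moved most during MLM fine-tuning.
    Assumes both models share the same architecture, so named_parameters() align."""
    masks = {}
    for (name, p0), (_, p1) in zip(pretrained.named_parameters(),
                                   mlm_finetuned.named_parameters()):
        delta = (p1.detach() - p0.detach()).abs()
        k = max(1, int(keep_ratio * delta.numel()))
        threshold = torch.topk(delta.flatten(), k).values.min()
        masks[name] = delta >= threshold  # boolean "winning ticket" mask
    return masks


class SoftPromptModel(nn.Module):
    """The original pre-trained LM with trainable soft prompt vectors prepended
    to its input embeddings (assumes a HuggingFace-style model accepting inputs_embeds)."""

    def __init__(self, plm, prompt_length=20, hidden_size=768):
        super().__init__()
        self.plm = plm  # the original pre-trained model, not the MLM-fine-tuned copy
        self.soft_prompt = nn.Parameter(0.02 * torch.randn(prompt_length, hidden_size))

    def forward(self, input_ids, attention_mask):
        embeds = self.plm.get_input_embeddings()(input_ids)          # (B, T, H)
        prompt = self.soft_prompt.unsqueeze(0).expand(embeds.size(0), -1, -1)
        embeds = torch.cat([prompt, embeds], dim=1)                  # (B, P+T, H)
        prompt_mask = attention_mask.new_ones(embeds.size(0), self.soft_prompt.size(0))
        attention_mask = torch.cat([prompt_mask, attention_mask], dim=1)
        return self.plm(inputs_embeds=embeds, attention_mask=attention_mask)


def mask_non_ticket_gradients(model, masks):
    """Call after loss.backward(): zero gradients outside the winning ticket so the
    optimizer only updates the selected original parameters; the soft prompt (and
    any task head) keeps its full gradient."""
    for name, p in model.plm.named_parameters():
        if p.grad is not None and name in masks:
            p.grad.mul_(masks[name].to(p.grad.dtype))
```

In this sketch the mask is computed once (the "one-time execution" mentioned above) and reused for every downstream task, while the soft prompt is trained per task.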
Related papers
- Few-Shot Cross-Lingual Transfer for Prompting Large Language Models in
Low-Resource Languages [0.0]
"prompting" is where a user provides a description of a task and some completed examples of the task to a PLM as context before prompting the PLM to perform the task on a new example.
We consider three methods: few-shot prompting (prompt), language-adaptive fine-tuning (LAFT), and neural machine translation (translate)
We find that translate and prompt settings are a compute-efficient and cost-effective method of few-shot prompting for the selected low-resource languages.
arXiv Detail & Related papers (2024-03-09T21:36:13Z) - On the Analysis of Cross-Lingual Prompt Tuning for Decoder-based
Multilingual Model [49.81429697921861]
We study the interaction between parameter-efficient fine-tuning (PEFT) and cross-lingual tasks in multilingual autoregressive models.
We show that prompt tuning is more effective in enhancing the performance of low-resource languages than fine-tuning.
arXiv Detail & Related papers (2023-11-14T00:43:33Z) - One For All & All For One: Bypassing Hyperparameter Tuning with Model
Averaging For Cross-Lingual Transfer [61.455775535559276]
We propose an unsupervised evaluation protocol for zero-shot cross-lingual transfer (ZS-XLT).
We run broad ZS-XLT experiments on both higher-level semantic tasks (NLI, extractive QA) and a lower-level token classification task (NER).
We find that conventional model selection based on source-language validation quickly plateaus to suboptimal ZS-XLT performance.
arXiv Detail & Related papers (2023-10-16T15:50:34Z) - Parameter-Efficient Cross-lingual Transfer of Vision and Language Models
via Translation-based Alignment [31.885608173448368]
Pre-trained vision and language models such as CLIP have witnessed remarkable success in connecting images and texts with a primary focus on English texts.
However, disparities in performance among different languages have been observed due to uneven resource availability.
We propose a new parameter-efficient cross-lingual transfer learning framework that utilizes a translation-based alignment method to mitigate multilingual disparities.
arXiv Detail & Related papers (2023-05-02T14:09:02Z) - Efficiently Aligned Cross-Lingual Transfer Learning for Conversational
Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD, a parallel and large-scale multilingual conversation dataset, for cross-lingual alignment pretraining.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z) - Instance-wise Prompt Tuning for Pretrained Language Models [72.74916121511662]
Instance-wise Prompt Tuning (IPT) is the first prompt learning paradigm that injects knowledge from the input data instances into the prompts.
IPT significantly outperforms task-based prompt learning methods, and achieves comparable performance to conventional finetuning with only 0.5% - 1.5% of tuned parameters.
arXiv Detail & Related papers (2022-06-04T10:08:50Z) - HyperPELT: Unified Parameter-Efficient Language Model Tuning for Both
Language and Vision-and-Language Tasks [38.43269863509866]
Parameter-efficient fine-tuning has become increasingly important for quick transfer learning and deployment.
We design a novel unified parameter-efficient transfer learning framework that works effectively on both pure language and V&L tasks.
Our proposed framework adds fewer trainable parameters in multi-task learning while achieving superior performances and transfer ability compared to state-of-the-art methods.
arXiv Detail & Related papers (2022-03-08T06:51:33Z) - Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified
Multilingual Prompt [98.26682501616024]
We propose UniPrompt, a novel model that uses a unified prompt for all languages.
The unified prompt is computed by a multilingual PLM to produce language-independent representations.
Our proposed methods can significantly outperform the strong baselines across different languages.
arXiv Detail & Related papers (2022-02-23T11:57:52Z) - WARP: Word-level Adversarial ReProgramming [13.08689221166729]
In many applications it is preferable to tune much smaller sets of parameters, so that the majority of parameters can be shared across multiple tasks.
We present an alternative approach based on adversarial reprogramming, which extends earlier work on automatic prompt generation.
We show that this approach outperforms other methods with a similar number of trainable parameters on the SST-2 and MNLI datasets.
arXiv Detail & Related papers (2021-01-01T00:41:03Z) - UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
arXiv Detail & Related papers (2020-12-31T11:37:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.