Few-Shot Cross-Lingual Transfer for Prompting Large Language Models in
Low-Resource Languages
- URL: http://arxiv.org/abs/2403.06018v1
- Date: Sat, 9 Mar 2024 21:36:13 GMT
- Title: Few-Shot Cross-Lingual Transfer for Prompting Large Language Models in
Low-Resource Languages
- Authors: Christopher Toukmaji
- Abstract summary: In "prompting," a user provides a description of a task and some completed examples of the task to a PLM as context before asking the PLM to perform the task on a new example.
We consider three methods: few-shot prompting (prompt), language-adaptive fine-tuning (LAFT), and neural machine translation (translate).
We find that the translate and prompt settings are compute-efficient and cost-effective methods of few-shot prompting for the selected low-resource languages.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large pre-trained language models (PLMs) are at the forefront of advances in
Natural Language Processing. One widespread use case of PLMs is "prompting" -
or in-context learning - where a user provides a description of a task and some
completed examples of the task to a PLM as context before prompting the PLM to
perform the task on a new example. Only the largest, most capable PLMs are able
to perform in-context learning effectively, and these models are typically
trained with a predominantly English corpus, leaving all other languages
behind. The data limitations in most languages preclude the training of
language-specific PLMs capable of prompting. Despite the surge of work on
prompting, it remains unclear how PLMs should be adapted cross-lingually
specifically for prompting. We evaluate possible methods to adapt LLaMa, a
7B-parameter open-source PLM trained mainly on English, for
prompting in low-resource languages, namely for Kinyarwanda, Hausa, and
Luganda. We consider three methods: few-shot prompting (prompt),
language-adaptive fine-tuning (LAFT), and neural machine translation
(translate), and evaluate on abstractive summarization, multi-class topic
classification, and named-entity recognition. Although LAFT carries the
greatest compute cost and intuitively should lead to the best results, our
experiments show that LAFT is only occasionally the optimal choice for
adapting PLMs for prompting. Rather, the translate and prompt settings are
compute-efficient and cost-effective methods of few-shot prompting for the
selected low-resource languages. Results are task- and language-dependent, but
the prompt setting performs best on average across all tasks and languages.
When aggregated across all tasks and languages, the prompt setting outperforms
both translate and LAFT with statistical significance at every shot count.
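As a concrete illustration of the settings compared above, the sketch below shows how the prompt and translate settings might be assembled for a classification task. This is a minimal, hypothetical Python sketch: build_few_shot_prompt, generate, and translate_to_english are illustrative placeholders (standing in for a LLaMa inference call and an NMT system) and are not taken from the paper's code.

```python
# Illustrative sketch of the "prompt" and "translate" few-shot settings.
# `generate` and `translate_to_english` are hypothetical placeholders for a
# PLM inference call (e.g., a 7B LLaMa) and an NMT system, respectively.

from typing import Callable, List, Tuple


def build_few_shot_prompt(task_description: str,
                          examples: List[Tuple[str, str]],
                          query: str) -> str:
    """Concatenate a task description, completed examples, and a new query."""
    shots = "\n\n".join(f"Text: {x}\nLabel: {y}" for x, y in examples)
    return f"{task_description}\n\n{shots}\n\nText: {query}\nLabel:"


def prompt_setting(generate: Callable[[str], str],
                   task_description: str,
                   examples: List[Tuple[str, str]],
                   query: str) -> str:
    # "prompt": the few-shot prompt is built directly in the low-resource language.
    return generate(build_few_shot_prompt(task_description, examples, query))


def translate_setting(generate: Callable[[str], str],
                      translate_to_english: Callable[[str], str],
                      task_description: str,
                      examples: List[Tuple[str, str]],
                      query: str) -> str:
    # "translate": examples and query are first translated into English, then
    # the (mostly English-trained) PLM is prompted in English.
    en_examples = [(translate_to_english(x), y) for x, y in examples]
    en_query = translate_to_english(query)
    return generate(build_few_shot_prompt(task_description, en_examples, en_query))
```

The LAFT setting differs only in the model, not the prompt: the PLM is first further pre-trained on unlabeled text in the target language and then prompted as in the prompt setting, which is why it carries the largest compute cost.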
Related papers
- Think Carefully and Check Again! Meta-Generation Unlocking LLMs for Low-Resource Cross-Lingual Summarization [108.6908427615402]
Cross-lingual summarization (CLS) aims to generate a summary of the source text in a different target language.
Instruction-tuned large language models (LLMs) currently excel at various English tasks.
Recent studies have shown that LLMs' performance on CLS tasks remains unsatisfactory even in few-shot settings.
arXiv Detail & Related papers (2024-10-26T00:39:44Z) - Machine Translation with Large Language Models: Prompt Engineering for
Persian, English, and Russian Directions [0.0]
Generative large language models (LLMs) have demonstrated exceptional proficiency in various natural language processing (NLP) tasks.
We investigated two popular prompting methods and their combination, focusing on translation directions among Persian, English, and Russian.
arXiv Detail & Related papers (2024-01-16T15:16:34Z) - PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning [46.153828074152436]
We propose a pivot language guided generation approach to enhance instruction tuning in lower-resource languages.
It trains the model to first process instructions in the pivot language, and then produce responses in the target language.
Our approach significantly improves the instruction-following abilities of LLMs, by 29% on average.
arXiv Detail & Related papers (2023-11-15T05:28:07Z) - Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts [75.33019401706188]
Large language models (LLMs) are known to effectively perform tasks by simply observing a few exemplars.
We propose to assemble synthetic exemplars from a diverse set of high-resource languages to prompt the LLMs to translate from any language into English.
Our unsupervised prompting method performs on par with supervised few-shot learning in LLMs of different sizes for translations between English and 13 Indic and 21 African low-resource languages.
arXiv Detail & Related papers (2023-06-20T08:27:47Z) - Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for conditionally encoding instances.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z) - Efficiently Aligned Cross-Lingual Transfer Learning for Conversational
Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD for cross-lingual alignment pretraining, a parallel and large-scale multilingual conversation dataset.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z) - Multilingual Relation Classification via Efficient and Effective
Prompting [9.119073318043952]
We present the first work on prompt-based multilingual relation classification (RC).
We introduce an efficient and effective method that constructs prompts from relation triples and involves only minimal translation for the class labels.
We evaluate its performance in fully supervised, few-shot and zero-shot scenarios, and analyze its effectiveness across 14 languages.
arXiv Detail & Related papers (2022-10-25T08:40:23Z) - Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified
Multilingual Prompt [98.26682501616024]
We propose a novel model that uses a unified prompt for all languages, called UniPrompt.
The unified prompt is computed by a multilingual PLM to produce a language-independent representation.
Our proposed methods can significantly outperform the strong baselines across different languages.
arXiv Detail & Related papers (2022-02-23T11:57:52Z) - FILTER: An Enhanced Fusion Method for Cross-lingual Language
Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
We further propose an additional KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language.
arXiv Detail & Related papers (2020-09-10T22:42:15Z)