Multilingual LLMs are Better Cross-lingual In-context Learners with Alignment
- URL: http://arxiv.org/abs/2305.05940v3
- Date: Sat, 24 Jun 2023 07:43:35 GMT
- Title: Multilingual LLMs are Better Cross-lingual In-context Learners with Alignment
- Authors: Eshaan Tanwar, Subhabrata Dutta, Manish Borthakur, Tanmoy Chakraborty
- Abstract summary: In-context learning (ICL) emerges as large language models become capable of inferring test labels conditioned on a few labeled samples without any gradient update.
We provide the first in-depth analysis of ICL for cross-lingual text classification.
We propose a novel prompt construction strategy -- Cross-lingual In-context Source-Target Alignment (X-InSTA).
- Score: 24.742581572364124
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In-context learning (ICL) emerges as large language models become capable of inferring test labels conditioned on a few labeled samples without any gradient update. ICL-enabled large language models offer a promising route to bypassing recurring annotation costs in low-resource settings. Yet only a handful of past studies have explored ICL in a cross-lingual setting, where the need to transfer label knowledge from a high-resource language to a low-resource one is crucial. To bridge this gap, we provide the first in-depth analysis of ICL for cross-lingual text classification. We find that the prevalent practice of selecting random input-label pairs to construct the prompt context is severely limited in cross-lingual ICL, primarily due to a lack of alignment in both the input and output spaces. To mitigate this, we propose a novel prompt construction strategy -- Cross-lingual In-context Source-Target Alignment (X-InSTA). By injecting semantic coherence into the in-context examples and aligning the task across the source and target languages, X-InSTA outperforms random prompt selection by a large margin across three different tasks using 44 different cross-lingual pairs.
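The abstract names two ingredients, semantic coherence among the in-context examples and a task-based aligner across the source and target languages, but gives no implementation details here. The Python sketch below illustrates one plausible reading of that recipe; the multilingual encoder, the prompt template, and the task_aligner sentence are illustrative assumptions, not the authors' reference code.

```python
# Illustrative sketch of X-InSTA-style prompt construction (assumptions noted
# in the comments; this is not the authors' released implementation).
from sentence_transformers import SentenceTransformer
import numpy as np

# Assumption: a multilingual sentence encoder approximates "semantic coherence".
embedder = SentenceTransformer("sentence-transformers/LaBSE")

def build_xinsta_prompt(source_pool, target_text, task_aligner, k=4):
    """source_pool: list of (text, label) pairs in the high-resource source language.
    target_text: unlabeled test input in the low-resource target language.
    task_aligner: one sentence stating that the task is shared across languages,
        e.g. "The task is sentiment classification, in English and in Spanish alike."
    """
    # Semantic alignment: keep only the k source examples closest to the target input.
    texts = [t for t, _ in source_pool]
    emb = embedder.encode(texts + [target_text], normalize_embeddings=True)
    sims = emb[:-1] @ emb[-1]
    top_k = np.argsort(-sims)[:k]

    # Task alignment: prepend the aligner sentence, then the retrieved labeled
    # examples, then the unlabeled target-language input.
    parts = [task_aligner]
    for i in top_k:
        text, label = source_pool[i]
        parts.append(f"Text: {text}\nLabel: {label}")
    parts.append(f"Text: {target_text}\nLabel:")
    return "\n\n".join(parts)
```

The resulting string is fed to the multilingual LLM as-is; swapping the template and the aligner sentence would adapt the same skeleton to other classification tasks.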
Related papers
- Think Carefully and Check Again! Meta-Generation Unlocking LLMs for Low-Resource Cross-Lingual Summarization [108.6908427615402]
Cross-lingual summarization (CLS) aims to generate a summary of the source text in a different target language.
Currently, instruction-tuned large language models (LLMs) excel at various English tasks.
Recent studies have shown that LLMs' performance on CLS tasks remains unsatisfactory even in few-shot settings.
arXiv Detail & Related papers (2024-10-26T00:39:44Z)
- Cross-lingual Back-Parsing: Utterance Synthesis from Meaning Representation for Zero-Resource Semantic Parsing [6.074150063191985]
Cross-Lingual Back-Parsing is a novel data augmentation methodology designed to enhance cross-lingual transfer for semantic parsing.
Our methodology effectively performs cross-lingual data augmentation in challenging zero-resource settings.
arXiv Detail & Related papers (2024-10-01T08:53:38Z)
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for conditionally encoding instances.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- Cross-lingual QA: A Key to Unlocking In-context Cross-lingual Performance [2.371686365695081]
Cross-lingual QA is a cross-lingual prompting method that translates only the question and answer parts, thus reducing translation costs.
Experiments on four typologically diverse multilingual benchmarks show that Cross-lingual QA effectively stimulates models to elicit their cross-lingual knowledge.
We show that prompting open-source MLLMs with cross-lingual in-context examples enhances performance as the model scale increases.
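As a rough illustration of that idea (the template and the translate helper are hypothetical placeholders, not an API from the paper), translating only the question and answer scaffolding while leaving the context passage untouched might look like this:

```python
# Hypothetical sketch of a Cross-lingual QA style prompt: only the question and
# the answer cue are translated into the target language; the context passage
# stays in its original language, which keeps translation costs low.
def translate(text: str, target_lang: str) -> str:
    # Placeholder for any MT backend of your choice (not specified by the paper).
    raise NotImplementedError

def cross_lingual_qa_prompt(context: str, question: str, target_lang: str) -> str:
    q = translate(f"Question: {question}", target_lang)  # translated part
    a = translate("Answer:", target_lang)                # translated part
    return f"{context}\n\n{q}\n{a}"                      # context left untranslated
```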
arXiv Detail & Related papers (2023-05-24T15:14:49Z)
- Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD, a large-scale parallel multilingual conversation dataset, for cross-lingual alignment pretraining.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
- A Multi-level Supervised Contrastive Learning Framework for Low-Resource Natural Language Inference [54.678516076366506]
Natural Language Inference (NLI) is an increasingly important task in natural language understanding.
Here we propose a multi-level supervised contrastive learning framework named MultiSCL for low-resource natural language inference.
arXiv Detail & Related papers (2022-05-31T05:54:18Z)
- Bridging the Gap between Language Models and Cross-Lingual Sequence Labeling [101.74165219364264]
Large-scale cross-lingual pre-trained language models (xPLMs) have shown effectiveness in cross-lingual sequence labeling tasks.
Despite this success, we make the empirical observation that there is a training-objective gap between the pre-training and fine-tuning stages.
In this paper, we first design a pre-training task tailored for cross-lingual sequence labeling (xSL), named Cross-lingual Language Informative Span Masking (CLISM), to eliminate the objective gap.
Second, we present ContrAstive-Consistency Regularization (CACR), which utilizes contrastive learning to encourage consistency between the representations of parallel input sequences.
arXiv Detail & Related papers (2022-04-11T15:55:20Z)
- FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
We additionally propose a KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for the translated text in the target language.
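A minimal sketch of such a self-teaching term, assuming the soft pseudo-labels for the translated target-language text have already been generated and normalized (shape conventions and the temperature are illustrative, not taken from FILTER):

```python
import torch
import torch.nn.functional as F

def self_teaching_kl(student_logits: torch.Tensor,
                     soft_pseudo_labels: torch.Tensor,
                     temperature: float = 1.0) -> torch.Tensor:
    """KL(pseudo-labels || student): penalizes the model when its predictions on
    the target-language text drift from the auto-generated soft pseudo-labels
    produced for the translated text. Assumes soft_pseudo_labels already sums
    to 1 along the class dimension."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(log_p_student, soft_pseudo_labels, reduction="batchmean")
```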
arXiv Detail & Related papers (2020-09-10T22:42:15Z)
- LAReQA: Language-agnostic answer retrieval from a multilingual pool [29.553907688813347]
LAReQA tests for "strong" cross-lingual alignment.
We find that augmenting training data via machine translation is effective.
This finding underscores our claim that language-agnostic retrieval is a substantively new kind of cross-lingual evaluation.
arXiv Detail & Related papers (2020-04-11T20:51:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented here and is not responsible for any consequences of its use.