Nearest Neighbour Few-Shot Learning for Cross-lingual Classification
- URL: http://arxiv.org/abs/2109.02221v1
- Date: Mon, 6 Sep 2021 03:18:23 GMT
- Title: Nearest Neighbour Few-Shot Learning for Cross-lingual Classification
- Authors: M Saiful Bari, Batool Haider, Saab Mansour
- Abstract summary: Cross-lingual adaptation using a simple nearest neighbor few-shot (<15 samples) inference technique for classification tasks.
Our approach consistently improves traditional fine-tuning using only a handful of labeled samples in target locales.
- Score: 2.578242050187029
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Even though large pre-trained multilingual models (e.g. mBERT, XLM-R) have
led to significant performance gains on a wide range of cross-lingual NLP
tasks, success on many downstream tasks still relies on the availability of
sufficient annotated data. Traditional fine-tuning of pre-trained models using
only a few target samples can cause over-fitting. This can be quite limiting as
most languages in the world are under-resourced. In this work, we investigate
cross-lingual adaptation using a simple nearest neighbor few-shot (<15 samples)
inference technique for classification tasks. We experiment using a total of 16
distinct languages across two NLP tasks - XNLI and PAWS-X. Our approach
consistently improves traditional fine-tuning using only a handful of labeled
samples in target locales. We also demonstrate its generalization capability
across tasks.
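As a rough illustration of this inference setup (not the authors' exact implementation), the sketch below embeds a handful of labelled target-language samples with a mean-pooled XLM-R encoder and labels a query by cosine-similarity nearest-neighbour vote; the model choice, pooling, and k are assumptions.

```python
# Minimal sketch of nearest-neighbour few-shot inference with a multilingual
# encoder. Assumptions (not taken from the paper): XLM-R base, mean pooling,
# cosine similarity, majority vote over k neighbours.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
encoder = AutoModel.from_pretrained("xlm-roberta-base")
encoder.eval()

@torch.no_grad()
def embed(texts):
    """Mean-pool the final hidden states into one unit vector per sentence."""
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**enc).last_hidden_state               # (batch, seq, dim)
    mask = enc["attention_mask"].unsqueeze(-1).float()      # (batch, seq, 1)
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)   # masked mean
    return torch.nn.functional.normalize(pooled, dim=-1)

def knn_predict(query_texts, support_texts, support_labels, k=3):
    """Label each query by majority vote among its k most similar support samples."""
    sims = embed(query_texts) @ embed(support_texts).T      # cosine similarities
    topk = sims.topk(min(k, len(support_texts)), dim=-1).indices
    preds = []
    for row in topk:
        votes = [support_labels[i] for i in row.tolist()]
        preds.append(max(set(votes), key=votes.count))
    return preds

# Toy usage: a handful (<15) of labelled target-locale samples as the support set.
support = ["Das Essen war hervorragend.", "Der Service war furchtbar."]
labels = ["positive", "negative"]
print(knn_predict(["Ich bin sehr zufrieden."], support, labels, k=1))
```

In the cross-lingual setting described above, the encoder would typically be fine-tuned on source-language (e.g. English) task data before the target-locale support set is embedded; the sketch skips that step.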
Related papers
- DeMuX: Data-efficient Multilingual Learning [57.37123046817781]
DEMUX is a framework that prescribes exact data-points to label from vast amounts of unlabelled multilingual data.
Our end-to-end framework is language-agnostic, accounts for model representations, and supports multilingual target configurations.
arXiv Detail & Related papers (2023-11-10T20:09:08Z)
- X-SNS: Cross-Lingual Transfer Prediction through Sub-Network Similarity [19.15213046428148]
Cross-lingual transfer (XLT) is the ability of multilingual language models to largely preserve their task performance when evaluated on languages that were not included in the fine-tuning process.
We propose the utilization of sub-network similarity between two languages as a proxy for predicting the compatibility of the languages in the context of XLT.
arXiv Detail & Related papers (2023-10-26T05:39:49Z)
- Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD, a parallel and large-scale multilingual conversation dataset, for cross-lingual alignment pretraining.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
- Multilingual Relation Classification via Efficient and Effective Prompting [9.119073318043952]
We present the first work on prompt-based multilingual relation classification (RC).
We introduce an efficient and effective method that constructs prompts from relation triples and involves only minimal translation for the class labels.
We evaluate its performance in fully supervised, few-shot and zero-shot scenarios, and analyze its effectiveness across 14 languages.
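As a purely hypothetical sketch of what constructing prompts from relation triples with only the class labels translated could look like (the template, verbalizers, and translations below are illustrative, not taken from the paper):

```python
# Hypothetical sketch: build a cloze prompt from a relation triple and score
# translated label verbalizers with a multilingual masked LM. Template, labels,
# and translations here are illustrative placeholders.
VERBALIZERS = {                       # assumed: only class labels get translated
    "en": {"place_of_birth": "born in", "employer": "employed by"},
    "de": {"place_of_birth": "geboren in", "employer": "angestellt bei"},
}

def build_prompt(context: str, head: str, tail: str) -> str:
    """Cloze-style prompt: the masked span should verbalize the relation."""
    return f"{context} {head} <mask> {tail}."

prompt = build_prompt("Ada Lovelace wurde 1815 in London geboren.",
                      "Ada Lovelace", "London")
print(prompt)
# A multilingual masked LM would score each German verbalizer in the <mask>
# slot and predict the relation whose (translated) verbalizer fits best.
```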
arXiv Detail & Related papers (2022-10-25T08:40:23Z)
- Por Qué Não Utiliser Alla Språk? Mixed Training with Gradient Optimization in Few-Shot Cross-Lingual Transfer [2.7213511121305465]
We propose a one-step mixed training method that trains on both source and target data.
We use a single model to handle all target languages simultaneously, avoiding overly language-specific models.
Our proposed method achieves state-of-the-art performance on all tasks and outperforms target-adapting by a large margin.
arXiv Detail & Related papers (2022-04-29T04:05:02Z)
- Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation [80.16548523140025]
We extend the vanilla pretrain-finetune pipeline with an extra code-switching restore task to bridge the gap between the pretrain and finetune stages.
Our approach could narrow the cross-lingual sentence representation distance and improve low-frequency word translation with trivial computational cost.
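A loose sketch of a code-switching restore objective of this kind is shown below; the lexicon, swap rate, and language pair are placeholders rather than the paper's actual setup.

```python
# Illustrative code-switching "restore" objective: corrupt a sentence with
# bilingual-dictionary swaps, then train a seq2seq model to reproduce the
# original. The lexicon and swap rate are placeholders.
import random

TOY_LEXICON = {"house": "Haus", "water": "Wasser", "near": "nahe"}  # en -> de

def code_switch(tokens, rate=0.3, seed=0):
    """Randomly replace tokens that appear in the lexicon with their translations."""
    rng = random.Random(seed)
    return [TOY_LEXICON[t] if t in TOY_LEXICON and rng.random() < rate else t
            for t in tokens]

original = "the house is near the water".split()
corrupted = code_switch(original)
# Restore-task training pair: input = corrupted (code-switched), target = original.
print(corrupted, "->", original)
```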
arXiv Detail & Related papers (2022-04-16T16:08:38Z)
- Bridging the Gap between Language Models and Cross-Lingual Sequence Labeling [101.74165219364264]
Large-scale cross-lingual pre-trained language models (xPLMs) have shown effectiveness in cross-lingual sequence labeling (xSL) tasks.
Despite this success, we empirically observe a training-objective gap between the pre-training and fine-tuning stages.
In this paper, we first design a pre-training task tailored for xSL, named Cross-lingual Language Informative Span Masking (CLISM), to eliminate the objective gap.
Second, we present ContrAstive-Consistency Regularization (CACR), which uses contrastive learning to encourage consistency between the representations of parallel input sequences.
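One generic reading of such a consistency term is an InfoNCE loss over parallel pairs, sketched below; the paper's exact CACR formulation, temperature, and pooling are not specified in this summary.

```python
# Generic contrastive-consistency term over a batch of parallel sentence pairs:
# row i of the source batch should match row i of the target batch. A standard
# InfoNCE sketch, not necessarily the paper's exact CACR loss.
import torch
import torch.nn.functional as F

def contrastive_consistency(src_repr: torch.Tensor,
                            tgt_repr: torch.Tensor,
                            temperature: float = 0.07) -> torch.Tensor:
    src = F.normalize(src_repr, dim=-1)
    tgt = F.normalize(tgt_repr, dim=-1)
    logits = src @ tgt.T / temperature          # (batch, batch) similarity matrix
    targets = torch.arange(src.size(0))         # the aligned (parallel) pair is positive
    return F.cross_entropy(logits, targets)

# Toy usage with random "encoder outputs" for 4 parallel pairs of dimension 8.
loss = contrastive_consistency(torch.randn(4, 8), torch.randn(4, 8))
print(loss.item())
```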
arXiv Detail & Related papers (2022-04-11T15:55:20Z)
- From Good to Best: Two-Stage Training for Cross-lingual Machine Reading Comprehension [51.953428342923885]
We develop a two-stage approach to enhance the model performance.
The first stage targets recall: we design a hard-learning (HL) algorithm to maximize the likelihood that the top-k predictions contain the accurate answer.
The second stage focuses on precision: an answer-aware contrastive learning mechanism is developed to learn the fine difference between the accurate answer and other candidates.
arXiv Detail & Related papers (2021-12-09T07:31:15Z)
- AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages [75.08199398141744]
We present AmericasNLI, an extension of XNLI (Conneau et al.) to 10 Indigenous languages of the Americas.
We conduct experiments with XLM-R, testing multiple zero-shot and translation-based approaches.
We find that XLM-R's zero-shot performance is poor for all 10 languages, with an average performance of 38.62%.
arXiv Detail & Related papers (2021-04-18T05:32:28Z)
- Towards Zero-Shot Multilingual Synthetic Question and Answer Generation for Cross-Lingual Reading Comprehension [20.570539023748424]
We propose a simple method to generate multilingual question and answer pairs on a large scale.
These synthetic samples can be used to improve the zero-shot performance of multilingual QA models on target languages.
arXiv Detail & Related papers (2020-10-22T19:59:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.