AmericasNLI: Evaluating Zero-shot Natural Language Understanding of
Pretrained Multilingual Models in Truly Low-resource Languages
- URL: http://arxiv.org/abs/2104.08726v1
- Date: Sun, 18 Apr 2021 05:32:28 GMT
- Title: AmericasNLI: Evaluating Zero-shot Natural Language Understanding of
Pretrained Multilingual Models in Truly Low-resource Languages
- Authors: Abteen Ebrahimi, Manuel Mager, Arturo Oncevay, Vishrav Chaudhary, Luis
Chiruzzo, Angela Fan, John Ortega, Ricardo Ramos, Annette Rios, Ivan
Vladimir, Gustavo A. Giménez-Lugo, Elisabeth Mager, Graham Neubig, Alexis
Palmer, Rolando A. Coto Solano, Ngoc Thang Vu and Katharina Kann
- Abstract summary: We present AmericasNLI, an extension of XNLI (Conneau et al., 2018) to 10 indigenous languages of the Americas.
We conduct experiments with XLM-R, testing multiple zero-shot and translation-based approaches.
We find that XLM-R's zero-shot performance is poor for all 10 languages, with an average performance of 38.62%.
- Score: 75.08199398141744
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pretrained multilingual models are able to perform cross-lingual transfer in
a zero-shot setting, even for languages unseen during pretraining. However,
prior work evaluating performance on unseen languages has largely been limited
to low-level, syntactic tasks, and it remains unclear if zero-shot learning of
high-level, semantic tasks is possible for unseen languages. To explore this
question, we present AmericasNLI, an extension of XNLI (Conneau et al., 2018)
to 10 indigenous languages of the Americas. We conduct experiments with XLM-R,
testing multiple zero-shot and translation-based approaches. Additionally, we
explore model adaptation via continued pretraining and provide an analysis of
the dataset by considering hypothesis-only models. We find that XLM-R's
zero-shot performance is poor for all 10 languages, with an average performance
of 38.62%. Continued pretraining offers improvements, with an average accuracy
of 44.05%. Surprisingly, training on poorly translated data by far outperforms
all other methods with an accuracy of 48.72%.
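To make the zero-shot evaluation setup concrete, the sketch below shows how a premise-hypothesis pair in a target language could be scored with an XLM-R classifier through the Hugging Face transformers library. This is a minimal illustration, not the paper's actual pipeline: the checkpoint name, the label order, and the example sentences are assumptions.

```python
# Minimal sketch of zero-shot NLI evaluation with XLM-R.
# Assumptions: "your-org/xlmr-finetuned-english-nli" is a hypothetical
# checkpoint already fine-tuned on English NLI, and labels follow the order
# (0=entailment, 1=neutral, 2=contradiction); neither is taken from the paper.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

CHECKPOINT = "your-org/xlmr-finetuned-english-nli"  # hypothetical name
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT)
model.eval()

def predict(premise: str, hypothesis: str) -> int:
    """Predict the class index for one premise-hypothesis pair."""
    enc = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**enc).logits
    return int(logits.argmax(dim=-1))

def accuracy(examples) -> float:
    """examples: list of (premise, hypothesis, gold_label_index) triples."""
    hits = sum(predict(p, h) == gold for p, h, gold in examples)
    return hits / len(examples)

# Toy pair; in the paper the inputs would be AmericasNLI premise-hypothesis
# pairs in one of the 10 target languages.
demo = [("El perro duerme en el sofá.", "Un animal está descansando.", 0)]
print(f"accuracy: {accuracy(demo):.2%}")
```

A hypothesis-only check of the kind used in the dataset analysis would pass only the hypothesis to the tokenizer, and the continued-pretraining variant would first adapt the encoder with masked language modeling on monolingual target-language text before running this evaluation.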
Related papers
- Self-Augmentation Improves Zero-Shot Cross-Lingual Transfer [92.80671770992572]
Cross-lingual transfer is a central task in multilingual NLP.
Earlier efforts on this task use parallel corpora, bilingual dictionaries, or other annotated alignment data.
We propose a simple yet effective method, SALT, to improve zero-shot cross-lingual transfer.
arXiv Detail & Related papers (2023-09-19T19:30:56Z)
- Prompt-Tuning Can Be Much Better Than Fine-Tuning on Cross-lingual Understanding With Multilingual Language Models [95.32691891392903]
In this paper, we carry out cross-lingual evaluation on various NLU tasks using prompt-tuning and compare it with fine-tuning.
The results show that prompt-tuning achieves much better cross-lingual transfer than fine-tuning across datasets.
arXiv Detail & Related papers (2022-10-22T05:48:02Z)
- Robustification of Multilingual Language Models to Real-world Noise with Robust Contrastive Pretraining [14.087882550564169]
Prior work assessing the robustness of neural models to noisy data, and the improvements it suggests, has been limited to the English language.
To benchmark the performance of pretrained multilingual models, we construct noisy datasets covering five languages and four NLP tasks.
We propose Robust Contrastive Pretraining (RCP) to boost the zero-shot cross-lingual robustness of multilingual pretrained models.
arXiv Detail & Related papers (2022-10-10T15:40:43Z)
- Few-shot Learning with Multilingual Language Models [66.49496434282564]
We train multilingual autoregressive language models on a balanced corpus covering a diverse set of languages.
Our largest model sets new state of the art in few-shot learning in more than 20 representative languages.
We present a detailed analysis of where the model succeeds and fails, showing in particular that it enables cross-lingual in-context learning.
arXiv Detail & Related papers (2021-12-20T16:52:35Z)
- Language Models are Few-shot Multilingual Learners [66.11011385895195]
We evaluate the multilingual skills of the GPT and T5 models in conducting multi-class classification on non-English languages.
We show that, given a few English examples as context, pre-trained language models can predict not only English test samples but also non-English ones.
arXiv Detail & Related papers (2021-09-16T03:08:22Z)
- Boosting Cross-Lingual Transfer via Self-Learning with Uncertainty Estimation [34.97086123805344]
Recent multilingual pre-trained language models have achieved remarkable zero-shot performance.
We propose a self-learning framework that further utilizes unlabeled data of target languages.
We evaluate our framework with uncertainty estimation on two cross-lingual tasks, Named Entity Recognition (NER) and Natural Language Inference (NLI), covering 40 languages in total; a generic self-training sketch is given after this list.
arXiv Detail & Related papers (2021-09-01T05:26:46Z)
- Probing Multilingual Language Models for Discourse [0.0]
We find that the XLM-RoBERTa family of models consistently shows the best performance.
Our results also indicate that model distillation may hurt the ability of cross-lingual transfer of sentence representations.
arXiv Detail & Related papers (2021-06-09T06:34:21Z)
- Bilingual Alignment Pre-training for Zero-shot Cross-lingual Transfer [33.680292990007366]
In this paper, we aim to improve the zero-shot cross-lingual transfer performance by aligning the embeddings better.
We propose a pre-training task named Alignment Language Model (AlignLM), which uses statistical alignment information as prior knowledge to guide bilingual word prediction.
The results show that AlignLM can significantly improve zero-shot performance on the MLQA and XNLI datasets.
arXiv Detail & Related papers (2021-06-03T10:18:43Z)
- Self-Learning for Zero Shot Neural Machine Translation [13.551731309506874]
This work proposes a novel zero-shot NMT modeling approach that learns without the now-standard assumption of a pivot language sharing parallel data with the source and target languages.
Compared to unsupervised NMT, consistent improvements are observed even in a domain-mismatch setting.
arXiv Detail & Related papers (2021-03-10T09:15:19Z)
- Exploring Fine-tuning Techniques for Pre-trained Cross-lingual Models via Continual Learning [74.25168207651376]
Fine-tuning pre-trained language models on downstream cross-lingual tasks has shown promising results.
We leverage continual learning to preserve the cross-lingual ability of the pre-trained model when we fine-tune it on downstream tasks.
Our methods achieve better performance than other fine-tuning baselines on the zero-shot cross-lingual part-of-speech tagging and named entity recognition tasks.
arXiv Detail & Related papers (2020-04-29T14:07:18Z)
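As referenced in the self-learning entry above, the following is a generic self-training (pseudo-labeling) loop for cross-lingual transfer: label unlabeled target-language text with the current model, keep only confident predictions, and add them to the training set. This is a rough sketch under stated assumptions, not the exact method of that paper or of AmericasNLI; the checkpoint name, the single-sentence classification setup, and the 0.9 confidence threshold are all illustrative.

```python
# Generic self-training (pseudo-labeling) sketch for cross-lingual transfer.
# Assumptions: a plain confidence threshold stands in for the paper's
# uncertainty estimation, and "xlm-roberta-base" is used only as an example
# multilingual encoder with a fresh 3-way classification head.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=3)

def pseudo_label(texts, threshold=0.9):
    """Return (text, predicted_label) pairs the model is confident about."""
    model.eval()
    kept = []
    with torch.no_grad():
        for text in texts:
            enc = tokenizer(text, return_tensors="pt", truncation=True)
            probs = model(**enc).logits.softmax(dim=-1).squeeze(0)
            confidence, label = probs.max(dim=-1)
            if confidence.item() >= threshold:
                kept.append((text, int(label)))
    return kept

# Typical loop: (1) train on labeled source-language data (training code
# omitted), (2) pseudo-label unlabeled target-language text, (3) retrain on
# the union of gold and pseudo-labeled data, and repeat.
unlabeled_target = ["Texto sin etiquetar en la lengua objetivo."]
silver = pseudo_label(unlabeled_target)
print(f"kept {len(silver)} confident pseudo-labels")
```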