Zero-Shot Cross-Lingual Transfer with Meta Learning
- URL: http://arxiv.org/abs/2003.02739v4
- Date: Mon, 5 Oct 2020 13:18:00 GMT
- Title: Zero-Shot Cross-Lingual Transfer with Meta Learning
- Authors: Farhad Nooralahzadeh, Giannis Bekoulis, Johannes Bjerva, Isabelle
Augenstein
- Abstract summary: We consider the setting of training models on multiple languages at the same time, when little or no data is available for languages other than English.
We show that this challenging setup can be approached using meta-learning.
We experiment using standard supervised, zero-shot cross-lingual, as well as few-shot cross-lingual settings for different natural language understanding tasks.
- Score: 45.29398184889296
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning what to share between tasks has been a topic of great importance
recently, as strategic sharing of knowledge has been shown to improve
downstream task performance. This is particularly important for multilingual
applications, as most languages in the world are under-resourced. Here, we
consider the setting of training models on multiple different languages at the
same time, when little or no data is available for languages other than
English. We show that this challenging setup can be approached using
meta-learning, where, in addition to training a source language model, another
model learns to select which training instances are the most beneficial to the
first. We experiment using standard supervised, zero-shot cross-lingual, as
well as few-shot cross-lingual settings for different natural language
understanding tasks (natural language inference, question answering). Our
extensive experimental setup demonstrates the consistent effectiveness of
meta-learning for a total of 15 languages. We improve upon the state-of-the-art
for zero-shot and few-shot NLI (on MultiNLI and XNLI) and QA (on the MLQA
dataset). A comprehensive error analysis indicates that the correlation of
typological features between languages can partly explain when parameter
sharing learned via meta-learning is beneficial.
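As a rough illustration of the optimisation-based meta-learning setup described in the abstract, the sketch below runs a generic MAML-style inner/outer loop on toy tensors (assumes PyTorch >= 2.0). The tiny classifier, the sample_batch helper, and the support/query split standing in for auxiliary- and target-language data are illustrative assumptions, not the authors' released implementation; the auxiliary model that selects beneficial training instances is omitted for brevity.
```python
# Minimal MAML-style cross-lingual meta-learning sketch (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy stand-in for a multilingual encoder + task head (e.g. an NLI classifier).
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 3))
meta_optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
inner_lr = 0.1

def sample_batch(n=16, dim=32, n_classes=3):
    # Placeholder for drawing a batch from one language's task data.
    x = torch.randn(n, dim)
    y = torch.randint(0, n_classes, (n,))
    return x, y

for step in range(100):
    # Inner loop: adapt a functional copy of the parameters on a "support"
    # batch, standing in for data from an auxiliary (high-resource) language.
    x_sup, y_sup = sample_batch()
    params = dict(model.named_parameters())
    logits = torch.func.functional_call(model, params, (x_sup,))
    inner_loss = F.cross_entropy(logits, y_sup)
    grads = torch.autograd.grad(inner_loss, list(params.values()), create_graph=True)
    adapted = {name: p - inner_lr * g for (name, p), g in zip(params.items(), grads)}

    # Outer loop: evaluate the adapted parameters on a "query" batch, standing
    # in for the target (low-resource) language, and update the originals.
    x_qry, y_qry = sample_batch()
    qry_logits = torch.func.functional_call(model, adapted, (x_qry,))
    meta_loss = F.cross_entropy(qry_logits, y_qry)

    meta_optimizer.zero_grad()
    meta_loss.backward()
    meta_optimizer.step()
```
With create_graph=True the outer update differentiates through the inner step (the second-order variant); setting it to False yields the cheaper first-order approximation.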
Related papers
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD for cross-lingual alignment pretraining, a parallel and large-scale multilingual conversation dataset.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
- Languages You Know Influence Those You Learn: Impact of Language Characteristics on Multi-Lingual Text-to-Text Transfer [4.554080966463776]
Multi-lingual language models (LM) have been remarkably successful in enabling natural language tasks in low-resource languages.
We try to better understand how such models, specifically mT5, transfer *any* linguistic and semantic knowledge across languages.
A key finding of this work is that similarity of syntax, morphology and phonology are good predictors of cross-lingual transfer.
arXiv Detail & Related papers (2022-12-04T07:22:21Z)
- Cross-lingual Lifelong Learning [53.06904052325966]
We present a principled Cross-lingual Continual Learning (CCL) evaluation paradigm.
We provide insights into what makes multilingual sequential learning particularly challenging.
The implications of this analysis include a recipe for how to measure and balance different cross-lingual continual learning desiderata.
arXiv Detail & Related papers (2022-05-23T09:25:43Z)
- Meta-X$_{NLG}$: A Meta-Learning Approach Based on Language Clustering for Zero-Shot Cross-Lingual Transfer and Generation [11.155430893354769]
This paper proposes a novel meta-learning framework to learn shareable structures from typologically diverse languages.
We first cluster the languages based on language representations and identify the centroid language of each cluster.
A meta-learning algorithm is trained with all centroid languages and evaluated on the other languages in the zero-shot setting (a minimal sketch of the clustering step appears after this list).
arXiv Detail & Related papers (2022-03-19T05:22:07Z)
- Analysing The Impact Of Linguistic Features On Cross-Lingual Transfer [3.299672391663527]
We analyze a state-of-the-art multilingual model and try to determine what impacts good transfer between languages.
We show that looking at particular syntactic features is 2-4 times more helpful in predicting the performance than an aggregated syntactic similarity.
arXiv Detail & Related papers (2021-05-12T21:22:58Z)
- X-METRA-ADA: Cross-lingual Meta-Transfer Learning Adaptation to Natural Language Understanding and Question Answering [55.57776147848929]
We propose X-METRA-ADA, a cross-lingual MEta-TRAnsfer learning ADAptation approach for Natural Language Understanding (NLU).
Our approach adapts MAML, an optimization-based meta-learning approach, to learn to adapt to new languages.
We show that our approach outperforms naive fine-tuning, reaching competitive performance on both tasks for most languages.
arXiv Detail & Related papers (2021-04-20T00:13:35Z)
- Meta-learning for fast cross-lingual adaptation in dependency parsing [16.716440467483096]
We apply model-agnostic meta-learning to the task of cross-lingual dependency parsing.
We find that meta-learning with pre-training can significantly improve upon the performance of language transfer.
arXiv Detail & Related papers (2021-04-10T11:10:16Z)
- Meta-Learning for Effective Multi-task and Multilingual Modelling [23.53779501937046]
We propose a meta-learning approach to learn the interactions between both tasks and languages.
We present experiments on five different tasks and six different languages from the XTREME multilingual benchmark dataset.
arXiv Detail & Related papers (2021-01-25T19:30:26Z)
- Learning to Scale Multilingual Representations for Vision-Language Tasks [51.27839182889422]
The effectiveness of SMALR is demonstrated with ten diverse languages, over twice the number supported in vision-language tasks to date.
We evaluate on multilingual image-sentence retrieval and outperform prior work by 3-4% with less than 1/5th the training parameters compared to other word embedding methods.
arXiv Detail & Related papers (2020-04-09T01:03:44Z)
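The Meta-X$_{NLG}$ entry above clusters languages by their representations and meta-trains on each cluster's centroid language. The snippet below is a minimal, hypothetical sketch of that clustering step: k-means over placeholder language vectors, keeping the language nearest each cluster centre. The language list and random vectors are stand-ins, not the paper's actual data or code.
```python
# Hedged sketch of selecting "centroid languages" via k-means (illustrative).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
languages = ["en", "de", "ru", "hi", "ar", "zh", "sw", "tr", "es", "fi"]
# Placeholder language representations (e.g. typological or learned vectors).
reps = rng.normal(size=(len(languages), 16))

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(reps)

centroid_languages = []
for c in range(kmeans.n_clusters):
    members = np.where(kmeans.labels_ == c)[0]
    dists = np.linalg.norm(reps[members] - kmeans.cluster_centers_[c], axis=1)
    # The cluster member closest to the centre serves as its centroid language.
    centroid_languages.append(languages[members[np.argmin(dists)]])

print("centroid languages for meta-training:", centroid_languages)
```
In practice the representations could come from typological feature vectors or a multilingual encoder, and the remaining non-centroid languages would form the zero-shot evaluation set.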
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.