Meta-X$_{NLG}$: A Meta-Learning Approach Based on Language Clustering
for Zero-Shot Cross-Lingual Transfer and Generation
- URL: http://arxiv.org/abs/2203.10250v1
- Date: Sat, 19 Mar 2022 05:22:07 GMT
- Title: Meta-X$_{NLG}$: A Meta-Learning Approach Based on Language Clustering
for Zero-Shot Cross-Lingual Transfer and Generation
- Authors: Kaushal Kumar Maurya and Maunendra Sankar Desarkar
- Abstract summary: This paper proposes a novel meta-learning framework to learn shareable structures from typologically diverse languages.
We first cluster the languages based on language representations and identify the centroid language of each cluster.
A meta-learning algorithm is trained with all centroid languages and evaluated on the other languages in the zero-shot setting.
- Score: 11.155430893354769
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recently, the NLP community has witnessed a rapid advancement in multilingual
and cross-lingual transfer research where the supervision is transferred from
high-resource languages (HRLs) to low-resource languages (LRLs). However, the
cross-lingual transfer is not uniform across languages, particularly in the
zero-shot setting. To address this, one promising research direction is to
learn shareable structures across multiple tasks with limited annotated data.
Downstream multilingual applications may benefit from such a learning setup
as most of the languages across the globe are low-resource and share some
structures with other languages. In this paper, we propose a novel
meta-learning framework (called Meta-X$_{NLG}$) that learns shareable structures
from typologically diverse languages through language
clustering. This is a step towards uniform cross-lingual transfer for unseen
languages. We first cluster the languages based on language representations and
identify the centroid language of each cluster. Then, a meta-learning algorithm
is trained with all centroid languages and evaluated on the other languages in
the zero-shot setting. We demonstrate the effectiveness of this modeling on two
NLG tasks (Abstractive Text Summarization and Question Generation), 5 popular
datasets and 30 typologically diverse languages. Consistent improvements over
strong baselines demonstrate the efficacy of the proposed framework. The
careful design of the model makes this end-to-end NLG setup less vulnerable to
the accidental translation problem, which is a prominent concern in zero-shot
cross-lingual NLG tasks.
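A rough, self-contained Python sketch of the two stages described in the abstract is given below; it is not the authors' released implementation. It clusters hypothetical language-representation vectors with k-means, picks the language closest to each cluster centre as that cluster's centroid language, and then runs a first-order MAML-style meta-training loop in which only the centroid languages contribute tasks. The language vectors, the sample_task_batch loader, and the toy regression model are illustrative assumptions standing in for the multilingual sequence-to-sequence NLG setup used in the paper.

# Stage 1: cluster languages and select the centroid language of each cluster.
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
languages = [f"lang_{i}" for i in range(30)]   # placeholder language IDs
lang_vecs = rng.normal(size=(30, 16))          # hypothetical language representations

k = 3
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(lang_vecs)
centroid_langs = []
for c in range(k):
    members = np.where(km.labels_ == c)[0]
    # The "centroid language" is the member closest to the cluster centre.
    dists = np.linalg.norm(lang_vecs[members] - km.cluster_centers_[c], axis=1)
    centroid_langs.append(languages[members[np.argmin(dists)]])
print("meta-training (centroid) languages:", centroid_langs)

# Stage 2: first-order MAML-style meta-training over the centroid languages.
# A toy linear regressor and synthetic data stand in for the NLG model/tasks.
model = nn.Linear(16, 1)
meta_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def sample_task_batch(lang, n=8):
    """Hypothetical loader returning (support, query) batches for a language."""
    x = torch.randn(2 * n, 16)
    y = x.sum(dim=1, keepdim=True) + 0.1 * torch.randn(2 * n, 1)
    return (x[:n], y[:n]), (x[n:], y[n:])

inner_lr, inner_steps = 1e-2, 3
for step in range(100):
    meta_opt.zero_grad()
    for lang in centroid_langs:
        (xs, ys), (xq, yq) = sample_task_batch(lang)
        # Inner loop: adapt a fast copy of the weights on the support set.
        fast = [p.clone() for p in model.parameters()]
        for _ in range(inner_steps):
            inner_loss = loss_fn(F.linear(xs, fast[0], fast[1]), ys)
            grads = torch.autograd.grad(inner_loss, fast)
            fast = [w - inner_lr * g for w, g in zip(fast, grads)]
        # Outer step: the query-set gradient at the adapted weights is applied
        # to the shared initialisation (first-order approximation).
        query_loss = loss_fn(F.linear(xq, fast[0], fast[1]), yq)
        grads = torch.autograd.grad(query_loss, fast)
        for p, g in zip(model.parameters(), grads):
            p.grad = g if p.grad is None else p.grad + g
    meta_opt.step()
# Zero-shot evaluation: run the meta-trained model on the non-centroid
# languages without any further fine-tuning.

First-order MAML is used here only to keep the sketch short; the key point it illustrates is that the outer update aggregates query-set gradients from the centroid languages only, while the remaining languages are held out for zero-shot evaluation.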
Related papers
- xCoT: Cross-lingual Instruction Tuning for Cross-lingual
Chain-of-Thought Reasoning [36.34986831526529]
Chain-of-thought (CoT) has emerged as a powerful technique to elicit reasoning in large language models.
We propose a cross-lingual instruction fine-tuning framework (xCoT) to transfer knowledge from high-resource languages to low-resource languages.
arXiv Detail & Related papers (2024-01-13T10:53:53Z)
- Soft Language Clustering for Multilingual Model Pre-training [57.18058739931463]
We propose XLM-P, which contextually retrieves prompts as flexible guidance for encoding instances conditionally.
Our XLM-P enables (1) lightweight modeling of language-invariant and language-specific knowledge across languages, and (2) easy integration with other multilingual pre-training methods.
arXiv Detail & Related papers (2023-06-13T08:08:08Z)
- Efficiently Aligned Cross-Lingual Transfer Learning for Conversational
Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD for cross-lingual alignment pretraining, a parallel and large-scale multilingual conversation dataset.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
- Multi-level Contrastive Learning for Cross-lingual Spoken Language
Understanding [90.87454350016121]
We develop novel code-switching schemes to generate hard negative examples for contrastive learning at all levels.
We develop a label-aware joint model to leverage label semantics for cross-lingual knowledge transfer.
arXiv Detail & Related papers (2022-05-07T13:44:28Z)
- Cross-lingual Machine Reading Comprehension with Language Branch
Knowledge Distillation [105.41167108465085]
Cross-lingual Machine Reading Comprehension (CLMRC) remains a challenging problem due to the lack of large-scale datasets in low-resource languages.
We propose a novel augmentation approach named Language Branch Machine Reading Comprehension (LBMRC).
LBMRC trains multiple machine reading comprehension (MRC) models, each proficient in an individual language.
We devise a multilingual distillation approach to amalgamate knowledge from multiple language branch models to a single model for all target languages.
arXiv Detail & Related papers (2020-10-27T13:12:17Z)
- FILTER: An Enhanced Fusion Method for Cross-lingual Language
Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
In addition, we propose a KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language (a generic sketch of such a loss follows this entry).
arXiv Detail & Related papers (2020-09-10T22:42:15Z)
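The KL-divergence self-teaching loss mentioned above can be sketched generically as follows; this is an assumed formulation for illustration, not FILTER's released code. Soft pseudo-labels from the source-language forward pass act as a fixed teacher for the model's predictions on the machine-translated target-language text.

import torch
import torch.nn.functional as F

def self_teaching_kl_loss(source_logits, target_logits, temperature=1.0):
    """KL(teacher || student): detached soft pseudo-labels from the
    source-language pass teach the predictions on translated target text."""
    teacher = F.softmax(source_logits.detach() / temperature, dim=-1)
    student_log = F.log_softmax(target_logits / temperature, dim=-1)
    # 'batchmean' is the correct reduction for KL divergence in PyTorch.
    return F.kl_div(student_log, teacher, reduction="batchmean") * temperature ** 2

# Toy usage with random classification logits over 3 labels.
src_logits = torch.randn(4, 3)
tgt_logits = torch.randn(4, 3, requires_grad=True)
loss = self_teaching_kl_loss(src_logits, tgt_logits)
loss.backward()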
- XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning [68.57658225995966]
Cross-lingual Choice of Plausible Alternatives (XCOPA) is a typologically diverse multilingual dataset for causal commonsense reasoning in 11 languages.
We evaluate a range of state-of-the-art models on this novel dataset, revealing that the performance of current methods falls short of translation-based transfer.
arXiv Detail & Related papers (2020-05-01T12:22:33Z)
- Zero-Shot Cross-Lingual Transfer with Meta Learning [45.29398184889296]
We consider the setting of training models on multiple languages at the same time, when little or no data is available for languages other than English.
We show that this challenging setup can be approached using meta-learning.
We experiment using standard supervised, zero-shot cross-lingual, as well as few-shot cross-lingual settings for different natural language understanding tasks.
arXiv Detail & Related papers (2020-03-05T16:07:32Z)