ZmBART: An Unsupervised Cross-lingual Transfer Framework for Language
Generation
- URL: http://arxiv.org/abs/2106.01597v1
- Date: Thu, 3 Jun 2021 05:08:01 GMT
- Title: ZmBART: An Unsupervised Cross-lingual Transfer Framework for Language
Generation
- Authors: Kaushal Kumar Maurya, Maunendra Sankar Desarkar, Yoshinobu Kano and
Kumari Deepshikha
- Abstract summary: Cross-lingual transfer for natural language generation is relatively understudied.
We consider four NLG tasks (text summarization, question generation, news headline generation, and distractor generation) and three syntactically diverse languages.
We propose an unsupervised cross-lingual language generation framework (called ZmBART) that does not use any parallel or pseudo-parallel/back-translated data.
- Score: 4.874780144224057
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the recent advancement in NLP research, cross-lingual transfer for
natural language generation is relatively understudied. In this work, we
transfer supervision from a high-resource language (HRL) to multiple low-resource
languages (LRLs) for natural language generation (NLG). We consider four NLG
tasks (text summarization, question generation, news headline generation, and
distractor generation) and three syntactically diverse languages, i.e.,
English, Hindi, and Japanese. We propose an unsupervised cross-lingual language
generation framework (called ZmBART) that does not use any parallel or
pseudo-parallel/back-translated data. In this framework, we further pre-train
the mBART sequence-to-sequence denoising auto-encoder with an auxiliary task
using monolingual data from the three languages. The objective function of the
auxiliary task is close to that of the target tasks, which enriches the
multilingual latent representation of mBART and provides a good initialization
for the target tasks. This model is then fine-tuned with task-specific supervised
English data and directly evaluated on the low-resource languages in the
zero-shot setting. To overcome catastrophic forgetting and spurious-correlation
issues, we applied component freezing and data augmentation, respectively. This
simple modeling approach gave promising results. We also experimented with
few-shot training (with 1,000 supervised data points), which boosted model
performance further. We performed several ablations and
cross-lingual transferability analyses to demonstrate the robustness of ZmBART.
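The overall recipe (fine-tune a multilingual seq2seq model on English task data, then generate directly in a low-resource language) can be approximated with off-the-shelf checkpoints. The sketch below is a minimal illustration only: it assumes the Hugging Face Transformers library and the facebook/mbart-large-50 checkpoint, omits the auxiliary-task pre-training stage, uses a single toy training pair, and freezes just the shared embeddings as one possible example of component freezing; none of these choices are the authors' exact setup. Forcing the Hindi language token at decoding time is a standard mBART-50 practice for keeping generation in the target language, used here as an assumption rather than the paper's stated method.

```python
# Minimal sketch: fine-tune a multilingual seq2seq model on English task data,
# then generate zero-shot in a low-resource language (Hindi). Illustrative only.
import torch
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50")
tokenizer = MBart50TokenizerFast.from_pretrained(
    "facebook/mbart-large-50", src_lang="en_XX", tgt_lang="en_XX"
)

# Component freezing (the paper freezes model components to limit catastrophic
# forgetting; freezing the shared embeddings is just one illustrative choice).
for p in model.model.shared.parameters():
    p.requires_grad = False

# Toy supervised English pair for, e.g., question generation (placeholder data).
src = "ZmBART transfers supervision from English to Hindi and Japanese."
tgt = "Which languages does ZmBART transfer supervision to?"
batch = tokenizer(src, text_target=tgt, return_tensors="pt")

optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=3e-5
)
model.train()
for _ in range(3):  # a few illustrative steps; a real setup trains much longer
    loss = model(**batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Zero-shot evaluation: feed Hindi input and force the decoder to start with the
# Hindi language token so the output stays in the target language.
model.eval()
tokenizer.src_lang = "hi_IN"
hindi_input = "ZmBART अंग्रेज़ी से हिंदी और जापानी में पर्यवेक्षण स्थानांतरित करता है।"
inputs = tokenizer(hindi_input, return_tensors="pt")
out = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["hi_IN"],
    num_beams=4,
    max_new_tokens=48,
)
print(tokenizer.batch_decode(out, skip_special_tokens=True)[0])
```

For the few-shot variant described above, the same loop would additionally see on the order of 1,000 supervised target-language examples before evaluation.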
Related papers
- Key ingredients for effective zero-shot cross-lingual knowledge transfer in generative tasks [22.93790760274486]
Zero-shot cross-lingual knowledge transfer enables a multilingual pretrained language model, finetuned on a task in one language, to make predictions for this task in other languages.
Previous works note a frequent problem of generation in the wrong language and propose approaches to address it, usually using mT5 as a backbone model.
In this work we compare various approaches proposed from the literature in unified settings, also including alternative backbone models, namely mBART and NLLB-200.
arXiv Detail & Related papers (2024-02-19T16:43:57Z)
- Efficiently Aligned Cross-Lingual Transfer Learning for Conversational Tasks using Prompt-Tuning [98.60739735409243]
Cross-lingual transfer of language models trained on high-resource languages like English has been widely studied for many NLP tasks.
We introduce XSGD, a parallel and large-scale multilingual conversation dataset, for cross-lingual alignment pretraining.
To facilitate aligned cross-lingual representations, we develop an efficient prompt-tuning-based method for learning alignment prompts.
arXiv Detail & Related papers (2023-04-03T18:46:01Z)
- CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation [113.99145386490639]
Cross-lingual NER can transfer knowledge between languages via aligned cross-lingual representations or machine translation results.
We propose a Cross-lingual Entity Projection framework (CROP) to enable zero-shot cross-lingual NER.
We adopt a multilingual labeled sequence translation model to project the tagged sequence back to the target language and label the target raw sentence.
arXiv Detail & Related papers (2022-10-13T13:32:36Z)
- Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation [80.16548523140025]
We extend the vanilla pretrain-finetune pipeline with an extra code-switching restore task to bridge the gap between the pretraining and finetuning stages.
Our approach could narrow the cross-lingual sentence representation distance and improve low-frequency word translation with trivial computational cost.
arXiv Detail & Related papers (2022-04-16T16:08:38Z)
- IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages [87.5457337866383]
We introduce the Image-Grounded Language Understanding Evaluation benchmark.
IGLUE brings together visual question answering, cross-modal retrieval, grounded reasoning, and grounded entailment tasks across 20 diverse languages.
We find that translate-test transfer is superior to zero-shot transfer and that few-shot learning is hard to harness for many tasks.
arXiv Detail & Related papers (2022-01-27T18:53:22Z)
- From Masked Language Modeling to Translation: Non-English Auxiliary Tasks Improve Zero-shot Spoken Language Understanding [24.149299722716155]
We introduce xSID, a new benchmark for cross-lingual Slot and Intent Detection in 13 languages from 6 language families, including a very low-resource dialect.
We propose a joint learning approach, with English SLU training data and non-English auxiliary tasks from raw text, syntax and translation for transfer.
Our results show that jointly learning the main tasks with masked language modeling is effective for slots, while machine translation transfer works best for intent classification.
arXiv Detail & Related papers (2021-05-15T23:51:11Z)
- XeroAlign: Zero-Shot Cross-lingual Transformer Alignment [9.340611077939828]
We introduce a method for task-specific alignment of cross-lingual pretrained transformers such as XLM-R.
XeroAlign uses translated task data to encourage the model to generate similar sentence embeddings for different languages.
XLM-RA's text classification accuracy exceeds that of XLM-R trained with labelled data, and it performs on par with state-of-the-art models on a cross-lingual adversarial paraphrasing task.
arXiv Detail & Related papers (2021-05-06T07:10:00Z)
- Cross-lingual Machine Reading Comprehension with Language Branch Knowledge Distillation [105.41167108465085]
Cross-lingual Machine Reading Comprehension (CLMRC) remains a challenging problem due to the lack of large-scale datasets in low-resource languages.
We propose a novel augmentation approach named Language Branch Machine Reading Comprehension (LBMRC).
LBMRC trains multiple machine reading comprehension (MRC) models, each proficient in an individual language.
We devise a multilingual distillation approach to amalgamate knowledge from multiple language branch models to a single model for all target languages.
arXiv Detail & Related papers (2020-10-27T13:12:17Z)
- CoSDA-ML: Multi-Lingual Code-Switching Data Augmentation for Zero-Shot Cross-Lingual NLP [68.2650714613869]
We propose a data augmentation framework to generate multi-lingual code-switching data to fine-tune mBERT.
Compared with the existing work, our method does not rely on bilingual sentences for training, and requires only one training process for multiple target languages.
arXiv Detail & Related papers (2020-06-11T13:15:59Z)
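The word-level code-switching augmentation described in the CoSDA-ML entry (replacing words in source-language training data with dictionary translations from several target languages) can be illustrated with a short sketch. The lexicon entries, replacement ratio, and function name below are invented for illustration and are not the authors' implementation; CoSDA-ML relies on real bilingual dictionaries covering many languages.

```python
import random

# Tiny illustrative bilingual lexicons (made-up entries, for illustration only).
LEXICONS = {
    "hi": {"weather": "मौसम", "today": "आज", "good": "अच्छा"},
    "ja": {"weather": "天気", "today": "今日", "good": "良い"},
}

def code_switch(sentence, ratio=0.3, seed=0):
    """Replace a random fraction of tokens with dictionary translations drawn
    from randomly chosen target languages (word-level code-switching)."""
    rng = random.Random(seed)
    switched = []
    for token in sentence.split():
        key = token.lower()
        candidates = [lang for lang, lex in LEXICONS.items() if key in lex]
        if candidates and rng.random() < ratio:
            lang = rng.choice(candidates)
            switched.append(LEXICONS[lang][key])
        else:
            switched.append(token)
    return " ".join(switched)

if __name__ == "__main__":
    # The mixed-language output would be used to fine-tune mBERT alongside the
    # original source-language training data.
    print(code_switch("The weather is good today", ratio=0.6))
```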