XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training,
Understanding and Generation
- URL: http://arxiv.org/abs/2004.01401v3
- Date: Fri, 22 May 2020 05:58:10 GMT
- Title: XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training,
Understanding and Generation
- Authors: Yaobo Liang, Nan Duan, Yeyun Gong, Ning Wu, Fenfei Guo, Weizhen Qi,
Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Xiaodong Fan, Ruofei Zhang,
Rahul Agrawal, Edward Cui, Sining Wei, Taroon Bharti, Ying Qiao, Jiun-Hung
Chen, Winnie Wu, Shuguang Liu, Fan Yang, Daniel Campos, Rangan Majumder, Ming
Zhou
- Abstract summary: XGLUE is a new benchmark dataset that can be used to train large-scale cross-lingual pre-trained models.
XGLUE provides 11 diversified tasks that cover both natural language understanding and generation scenarios.
- Score: 100.09099800591822
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce XGLUE, a new benchmark dataset that can be used
to train large-scale cross-lingual pre-trained models using multilingual and
bilingual corpora and evaluate their performance across a diverse set of
cross-lingual tasks. Compared with GLUE (Wang et al., 2019), which is labeled in
English for natural language understanding tasks only, XGLUE has two main
advantages: (1) it provides 11 diversified tasks that cover both natural
language understanding and generation scenarios; (2) for each task, it provides
labeled data in multiple languages. We extend a recent cross-lingual
pre-trained model, Unicoder (Huang et al., 2019), to cover both understanding
and generation tasks, and evaluate it on XGLUE as a strong baseline. We also
evaluate the base (12-layer) versions of Multilingual BERT, XLM and XLM-R for
comparison.
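As a concrete illustration of the zero-shot evaluation protocol behind benchmarks like XGLUE, the minimal sketch below fine-tunes a multilingual encoder on English labels only and then evaluates it directly on other languages. The model name (xlm-roberta-base), the toy sentences, and the binary label set are illustrative assumptions, not part of the benchmark itself.

```python
# Minimal sketch of XGLUE-style zero-shot cross-lingual evaluation:
# fine-tune a multilingual encoder on English labels only, then test on
# other languages without any target-language supervision.
# The toy data below is illustrative; a real run would load an XGLUE task.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "xlm-roberta-base"  # one of the 12-layer baselines compared in the paper
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)

# English-only training data (hypothetical news-classification style examples).
train_texts = ["The team won the championship.", "Parliament passed the budget."]
train_labels = [0, 1]  # 0 = sports, 1 = politics

# Target-language test data: labels are used only for scoring, never for training.
test_sets = {
    "de": (["Das Parlament verabschiedete den Haushalt."], [1]),
    "fr": (["L'équipe a remporté le championnat."], [0]),
}

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):
    batch = tokenizer(train_texts, padding=True, truncation=True, return_tensors="pt")
    loss = model(**batch, labels=torch.tensor(train_labels)).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Zero-shot evaluation: apply the English-tuned model to each language.
model.eval()
for lang, (texts, labels) in test_sets.items():
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        preds = model(**batch).logits.argmax(dim=-1)
    acc = (preds == torch.tensor(labels)).float().mean().item()
    print(f"{lang}: accuracy {acc:.2f}")
```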
Related papers
- X-SNS: Cross-Lingual Transfer Prediction through Sub-Network Similarity [19.15213046428148]
Cross-lingual transfer (XLT) is the ability of multilingual language models to retain much of their task performance when evaluated in languages that were not included in the fine-tuning process.
We propose using the sub-network similarity between two languages as a proxy for predicting how compatible the languages are in the context of XLT; a rough illustrative sketch of this idea follows this entry.
arXiv Detail & Related papers (2023-10-26T05:39:49Z)
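The paper defines its own sub-network extraction procedure; the following is only a hedged sketch of the general idea, assuming sub-networks are taken as the top fraction of parameters by accumulated gradient magnitude on per-language data and compared with Jaccard overlap. The function names and the top-k criterion are illustrative choices, not the authors' exact method.

```python
# Rough sketch (not the paper's exact method): estimate a per-language
# sub-network as the top fraction of parameters by accumulated |gradient|,
# then score language compatibility as the Jaccard overlap of the two masks.
import torch

def subnetwork_mask(model, batches, loss_fn, top_ratio=0.1):
    """Binary mask over the parameters with the largest accumulated |gradient|."""
    scores = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for inputs, targets in batches:
        model.zero_grad()
        loss_fn(model(inputs), targets).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                scores[n] += p.grad.abs()
    flat = torch.cat([s.flatten() for s in scores.values()])
    k = max(1, int(top_ratio * flat.numel()))
    threshold = flat.topk(k).values.min()
    return {n: (s >= threshold) for n, s in scores.items()}

def jaccard(mask_a, mask_b):
    """Jaccard similarity between two parameter masks (higher = more shared)."""
    inter = sum((mask_a[n] & mask_b[n]).sum().item() for n in mask_a)
    union = sum((mask_a[n] | mask_b[n]).sum().item() for n in mask_a)
    return inter / max(union, 1)

# Usage idea: jaccard(mask_en, mask_de) > jaccard(mask_en, mask_ja) would
# predict better English->German than English->Japanese transfer.
```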
- XNLI 2.0: Improving XNLI dataset and performance on Cross Lingual Understanding (XLU) [0.0]
We focus on improving the original XNLI dataset by re-translating the MNLI dataset into the 14 non-English languages covered by XNLI.
We also perform experiments by training models in all 15 languages and analyzing their performance on the task of natural language inference.
arXiv Detail & Related papers (2023-01-16T17:24:57Z)
- Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation [80.16548523140025]
We extend the vanilla pretrain-finetune pipeline with an extra code-switching restoration task to bridge the gap between the pretraining and fine-tuning stages; a hedged sketch of such an objective follows this entry.
Our approach can narrow the cross-lingual sentence representation distance and improve low-frequency word translation with trivial computational cost.
arXiv Detail & Related papers (2022-04-16T16:08:38Z)
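As a hedged sketch of what a code-switching restoration objective can look like, the snippet below corrupts a sentence by swapping some tokens for bilingual-dictionary translations and keeps the original sentence as the reconstruction target. The toy lexicon, swap probability, and helper name are assumptions for illustration only, not the paper's recipe.

```python
# Illustrative corruption step for a code-switching "restore" objective:
# swap some source tokens for dictionary translations, keep the original
# sentence as the seq2seq reconstruction target.
import random

TOY_LEXICON = {"team": "équipe", "won": "a gagné", "vote": "vote"}  # hypothetical en->fr entries

def code_switch(tokens, lexicon, swap_prob=0.3, seed=None):
    """Replace in-lexicon tokens with their translations at a fixed rate."""
    rng = random.Random(seed)
    return [
        lexicon[t] if t in lexicon and rng.random() < swap_prob else t
        for t in tokens
    ]

source = "the team won the budget vote".split()
corrupted = code_switch(source, TOY_LEXICON, seed=0)
# Training pair for the restoration task:
#   encoder input : corrupted (code-switched) sentence
#   decoder target: original sentence
print(corrupted, "->", source)
```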
- Multilingual Transfer Learning for QA Using Translation as Data Augmentation [13.434957024596898]
We explore strategies that improve cross-lingual transfer by bringing the multilingual embeddings closer in the semantic space.
We propose two novel strategies, language adversarial training and a language arbitration framework, which significantly improve zero-resource cross-lingual transfer performance; a generic sketch of the adversarial component follows this entry.
Empirically, we show that the proposed models outperform the previous zero-shot baseline on the recently introduced multilingual MLQA and TyDiQA datasets.
arXiv Detail & Related papers (2020-12-10T20:29:34Z)
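Language adversarial training is commonly implemented with a gradient reversal layer: a discriminator learns to identify the input language, while reversed gradients push the shared encoder toward language-invariant representations. The sketch below shows that generic construction; it is not necessarily the architecture used in the paper.

```python
# Generic gradient-reversal setup for language-adversarial training.
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates (and scales) gradients in backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class LanguageDiscriminator(nn.Module):
    """Predicts the input language from pooled encoder features through a reversal layer."""
    def __init__(self, hidden_size, n_languages, lambd=1.0):
        super().__init__()
        self.lambd = lambd
        self.classifier = nn.Sequential(
            nn.Linear(hidden_size, hidden_size), nn.ReLU(),
            nn.Linear(hidden_size, n_languages),
        )

    def forward(self, pooled_embedding):
        reversed_feat = GradReverse.apply(pooled_embedding, self.lambd)
        return self.classifier(reversed_feat)

# Usage idea:
#   total_loss = task_loss + adv_weight * nn.functional.cross_entropy(
#       discriminator(encoder_pooled_output), language_ids)
```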
- VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation [77.82373082024934]
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages.
This effectively avoids degenerating into predicting masked words conditioned only on the context of their own language.
The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark.
arXiv Detail & Related papers (2020-10-30T03:41:38Z)
- Mixed-Lingual Pre-training for Cross-lingual Summarization [54.4823498438831]
Cross-lingual Summarization aims at producing a summary in the target language for an article in the source language.
We propose a solution based on mixed-lingual pre-training that leverages both cross-lingual tasks like translation and monolingual tasks like masked language models.
Our model improves ROUGE-1 by 2.82 points (English to Chinese) and 1.15 points (Chinese to English) over state-of-the-art results.
arXiv Detail & Related papers (2020-10-18T00:21:53Z)
- FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding [85.29270319872597]
We propose an enhanced fusion method that takes cross-lingual data as input for XLM finetuning.
During inference, the model makes predictions based on the text input in the target language and its translation in the source language.
We further propose an additional KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for the translated text in the target language, as sketched below.
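A minimal sketch of how such a KL-divergence self-teaching term is commonly written, assuming the translated view provides detached soft pseudo-labels for the target-language view; the temperature and the stop-gradient on the teacher distribution are assumptions, not details taken from the paper.

```python
# One way to write a KL-divergence self-teaching loss: soft pseudo-labels
# from the translated view teach the target-language view.
import torch
import torch.nn.functional as F

def self_teaching_kl(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher || student) over class distributions; teacher is detached."""
    teacher_probs = F.softmax(teacher_logits.detach() / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")

# Usage idea:
#   logits_target = model(target_language_batch).logits          # student view
#   logits_translation = model(source_translation_batch).logits  # teacher view
#   loss = task_loss + self_teaching_kl(logits_target, logits_translation)
```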
arXiv Detail & Related papers (2020-09-10T22:42:15Z)
- XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning [68.57658225995966]
Cross-lingual Choice of Plausible Alternatives (XCOPA) is a typologically diverse multilingual dataset for causal commonsense reasoning in 11 languages.
We evaluate a range of state-of-the-art models on this novel dataset, revealing that the performance of current methods falls short compared to translation-based transfer.
arXiv Detail & Related papers (2020-05-01T12:22:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.