Cross-lingual Retrieval for Iterative Self-Supervised Training
- URL: http://arxiv.org/abs/2006.09526v2
- Date: Mon, 26 Oct 2020 23:25:31 GMT
- Title: Cross-lingual Retrieval for Iterative Self-Supervised Training
- Authors: Chau Tran, Yuqing Tang, Xian Li, Jiatao Gu
- Abstract summary: Cross-lingual alignment can be further improved by training seq2seq models on sentence pairs mined using their own encoder outputs.
We develop a new approach -- cross-lingual retrieval for iterative self-supervised training.
- Score: 66.3329263451598
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies have demonstrated the cross-lingual alignment ability of
multilingual pretrained language models. In this work, we found that the
cross-lingual alignment can be further improved by training seq2seq models on
sentence pairs mined using their own encoder outputs. We utilized these
findings to develop a new approach -- cross-lingual retrieval for iterative
self-supervised training (CRISS), where mining and training processes are
applied iteratively, improving cross-lingual alignment and translation ability
at the same time. Using this method, we achieved state-of-the-art unsupervised
machine translation results on 9 language directions with an average
improvement of 2.4 BLEU, and on the Tatoeba sentence retrieval task in the
XTREME benchmark on 16 languages with an average improvement of 21.5% in
absolute accuracy. Furthermore, CRISS also brings an additional 1.8 BLEU
improvement on average compared to mBART, when finetuned on supervised machine
translation downstream tasks.
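To make the mining-and-training loop concrete, here is a minimal Python sketch, under stated assumptions, of the retrieval step that scores candidate pairs with a ratio-margin criterion over the model's own encoder embeddings, plus the outer loop that alternates mining and fine-tuning. The encode_fn and train_fn callables, the neighbourhood size k, and the 1.06 threshold are illustrative placeholders rather than the exact CRISS settings.

```python
import numpy as np

def mine_pairs(src_emb: np.ndarray, tgt_emb: np.ndarray, k: int = 4, threshold: float = 1.06):
    """Margin-based mining of candidate sentence pairs from two monolingual corpora.
    src_emb / tgt_emb are L2-normalised sentence embeddings produced by the model's
    own encoder (shapes [n_src, d] and [n_tgt, d]).  Returns (i, j, score) triples
    whose ratio-margin score exceeds the (illustrative) threshold."""
    sim = src_emb @ tgt_emb.T                                # cosine similarities
    knn_src = np.sort(sim, axis=1)[:, -k:].mean(axis=1)      # avg sim to k nearest targets
    knn_tgt = np.sort(sim, axis=0)[-k:, :].mean(axis=0)      # avg sim to k nearest sources
    margin = sim / (0.5 * (knn_src[:, None] + knn_tgt[None, :]))
    best_j = margin.argmax(axis=1)
    return [(i, int(j), float(margin[i, j]))
            for i, j in enumerate(best_j) if margin[i, j] > threshold]

def criss_iterations(model, src_corpus, tgt_corpus, encode_fn, train_fn, n_iters: int = 3):
    """Alternate mining and training: encode both corpora with the current model,
    mine pseudo-parallel pairs, fine-tune the seq2seq model on them, and repeat.
    encode_fn(model, texts) -> normalised embeddings and train_fn(model, pairs) -> model
    are left abstract here."""
    for _ in range(n_iters):
        src_emb = encode_fn(model, src_corpus)
        tgt_emb = encode_fn(model, tgt_corpus)
        pairs = [(src_corpus[i], tgt_corpus[j]) for i, j, _ in mine_pairs(src_emb, tgt_emb)]
        model = train_fn(model, pairs)
    return model
```

As the abstract notes, mining and training reuse the same multilingual seq2seq model, so each iteration's improved encoder yields better-mined pairs for the next round.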
Related papers
- Improved Cross-Lingual Transfer Learning For Automatic Speech Translation [18.97234151624098]
We show that by initializing the encoder of the encoder-decoder sequence-to-sequence translation model with SAMU-XLS-R, we achieve significantly better cross-lingual task knowledge transfer.
We demonstrate the effectiveness of our approach on two popular datasets, namely, CoVoST-2 and Europarl.
arXiv Detail & Related papers (2023-06-01T15:19:06Z)
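As a rough illustration of the encoder-initialization idea in the summary above, the sketch below copies parameters from a pretrained speech-encoder checkpoint (a SAMU-XLS-R-style state dict) into the encoder of an encoder-decoder translation model; the seq2seq_model.encoder attribute and the checkpoint layout are assumptions made for this example, not the paper's actual code.

```python
import torch

def init_encoder_from_pretrained(seq2seq_model: torch.nn.Module, encoder_ckpt_path: str) -> None:
    """Overwrite the translation model's encoder parameters with those of a
    pretrained speech encoder (the checkpoint is assumed to be a plain state dict).
    Only tensors whose names and shapes match are copied; the decoder keeps its
    original initialisation."""
    pretrained = torch.load(encoder_ckpt_path, map_location="cpu")
    encoder_state = seq2seq_model.encoder.state_dict()
    matched = {k: v for k, v in pretrained.items()
               if k in encoder_state and v.shape == encoder_state[k].shape}
    encoder_state.update(matched)
    seq2seq_model.encoder.load_state_dict(encoder_state)
```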
- Improving Neural Machine Translation by Denoising Training [95.96569884410137]
We present a simple and effective pretraining strategy, Denoising Training (DoT), for neural machine translation.
We update the model parameters with source- and target-side denoising tasks at the early stage and then tune the model normally.
Experiments show DoT consistently improves the neural machine translation performance across 12 bilingual and 16 multilingual directions.
arXiv Detail & Related papers (2022-01-19T00:11:38Z)
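A minimal sketch of the two-stage Denoising Training idea summarized above: corrupt sentences and have the model reconstruct them on both the source and target sides during the early updates, then switch to ordinary translation pairs. The particular noise function, the alternation between sides, and the step counts are illustrative assumptions, not the paper's exact recipe.

```python
import random

def add_noise(tokens, drop_prob: float = 0.1, shuffle_window: int = 3):
    """Denoising-autoencoder-style corruption: randomly drop tokens and locally
    shuffle the remainder (the exact noise used by DoT may differ)."""
    kept = [t for t in tokens if random.random() > drop_prob] or list(tokens)
    jittered = [(i + random.uniform(0, shuffle_window), t) for i, t in enumerate(kept)]
    return [t for _, t in sorted(jittered)]

def training_pairs(parallel_data, total_steps: int, denoise_steps: int):
    """Yield (input, output) token sequences: denoising reconstruction on source
    and target sides for the first denoise_steps updates, then normal
    source-to-target translation pairs."""
    for step in range(total_steps):
        src, tgt = random.choice(parallel_data)
        if step < denoise_steps:
            side = src if step % 2 == 0 else tgt   # alternate source/target denoising
            yield add_noise(side), side
        else:
            yield src, tgt
```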
- Consistency Regularization for Cross-Lingual Fine-Tuning [61.08704789561351]
We propose to improve cross-lingual fine-tuning with consistency regularization.
Specifically, we use example consistency regularization to penalize the prediction sensitivity to four types of data augmentations.
Experimental results on the XTREME benchmark show that our method significantly improves cross-lingual fine-tuning across various tasks.
arXiv Detail & Related papers (2021-06-15T15:35:44Z)
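One common way to realize the example consistency regularization described above is a symmetric KL term that penalizes disagreement between the model's predictions on an example and on an augmented copy of it; the PyTorch sketch below shows that term added to the task loss. The divergence choice and the weight alpha are assumptions, not necessarily the paper's formulation.

```python
import torch
import torch.nn.functional as F

def consistency_loss(logits_orig: torch.Tensor, logits_aug: torch.Tensor) -> torch.Tensor:
    """Symmetric KL divergence between the model's predictive distributions on an
    original example and on an augmented copy of it."""
    p = F.log_softmax(logits_orig, dim=-1)
    q = F.log_softmax(logits_aug, dim=-1)
    return 0.5 * (F.kl_div(q, p, reduction="batchmean", log_target=True)
                  + F.kl_div(p, q, reduction="batchmean", log_target=True))

def total_loss(task_loss: torch.Tensor, logits_orig, logits_aug, alpha: float = 1.0):
    # supervised task loss plus a weighted penalty on prediction sensitivity
    return task_loss + alpha * consistency_loss(logits_orig, logits_aug)
```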
- Improving Cross-Lingual Reading Comprehension with Self-Training [62.73937175625953]
Current state-of-the-art models even surpass human performance on several benchmarks.
Previous works have revealed the abilities of pre-trained multilingual models for zero-shot cross-lingual reading comprehension.
This paper further utilizes unlabeled data, via self-training, to improve performance.
arXiv Detail & Related papers (2021-05-08T08:04:30Z)
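The self-training recipe referenced above can be sketched as a pseudo-labeling loop: label unlabeled target-language examples with the current model, keep only confident predictions, and retrain on gold plus pseudo-labeled data. The predict_fn and train_fn callables and the 0.9 confidence threshold are placeholders; the paper's filtering strategy may differ.

```python
def self_training_round(model, labeled_data, unlabeled_data, predict_fn, train_fn,
                        confidence_threshold: float = 0.9):
    """One round of self-training: pseudo-label unlabeled target-language examples
    with the current model, keep only confident predictions, and retrain on the
    union of gold and pseudo-labeled data.  predict_fn(model, x) -> (label, confidence)
    and train_fn(model, data) -> model are left abstract here."""
    pseudo_labeled = []
    for x in unlabeled_data:
        label, confidence = predict_fn(model, x)
        if confidence >= confidence_threshold:      # keep only confident pseudo-labels
            pseudo_labeled.append((x, label))
    return train_fn(model, list(labeled_data) + pseudo_labeled)
```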
- Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation [127.81351683335143]
Cross-lingual pretraining requires models to align the lexical- and high-level representations of the two languages.
Previous research has shown that poor unsupervised translation quality arises when these representations are not sufficiently aligned.
In this paper, we enhance the bilingual masked language model pretraining with lexical-level information by using type-level cross-lingual subword embeddings.
arXiv Detail & Related papers (2021-03-18T21:17:58Z)
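To illustrate the lexical-level enhancement mentioned above, the sketch below initializes a masked language model's subword embedding table from pretrained type-level cross-lingual subword embeddings, so that translation-equivalent subwords start out nearby in embedding space. The vocab and vectors dictionaries and the plain copy-in are simplifying assumptions rather than the paper's exact procedure.

```python
import numpy as np
import torch

def init_subword_embeddings(embedding: torch.nn.Embedding, vocab: dict, vectors: dict) -> None:
    """Initialise a masked language model's subword embedding table from
    pretrained type-level cross-lingual subword embeddings.
    vocab maps subword -> row index; vectors maps subword -> numpy vector
    (assumed to match the embedding dimensionality)."""
    with torch.no_grad():
        for subword, idx in vocab.items():
            if subword in vectors:
                embedding.weight[idx] = torch.from_numpy(
                    np.asarray(vectors[subword], dtype=np.float32))
```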
- English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too [42.95481834479966]
Intermediate-task training improves model performance substantially on language understanding tasks in monolingual English settings.
We evaluate intermediate-task transfer in a zero-shot cross-lingual setting on the XTREME benchmark.
Using our best intermediate-task models for each target task, we obtain a 5.4 point improvement over XLM-R Large.
arXiv Detail & Related papers (2020-05-26T20:17:29Z)
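The intermediate-task recipe above amounts to two sequential fine-tuning stages followed by zero-shot evaluation in the other languages; the schematic sketch below shows that flow. The train_fn and eval_fn callables and the data arguments stand in for whatever trainer and XTREME task loaders are actually used.

```python
def intermediate_task_transfer(model, intermediate_data, target_task_data,
                               eval_sets, train_fn, eval_fn):
    """Two sequential fine-tuning stages followed by zero-shot evaluation:
    train on an English intermediate task, then on the English target-task data,
    and finally score each target-language test set without further training.
    train_fn(model, data) -> model and eval_fn(model, data) -> score are abstract."""
    model = train_fn(model, intermediate_data)   # intermediate-task training (English)
    model = train_fn(model, target_task_data)    # target-task fine-tuning (English)
    return {lang: eval_fn(model, data) for lang, data in eval_sets.items()}
```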
- Exploring Fine-tuning Techniques for Pre-trained Cross-lingual Models via Continual Learning [74.25168207651376]
Fine-tuning pre-trained language models to downstream cross-lingual tasks has shown promising results.
We leverage continual learning to preserve the cross-lingual ability of the pre-trained model when we fine-tune it to downstream tasks.
Our methods achieve better performance than other fine-tuning baselines on the zero-shot cross-lingual part-of-speech tagging and named entity recognition tasks.
arXiv Detail & Related papers (2020-04-29T14:07:18Z)
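As a hedged sketch of one continual-learning-style way to preserve the pretrained model's cross-lingual ability during fine-tuning, the snippet below adds an L2 penalty that anchors the fine-tuned weights to the pretrained ones. This is a generic weight-anchoring regularizer shown for illustration; the paper's actual continual-learning method may differ.

```python
import torch

def anchored_loss(task_loss: torch.Tensor, model: torch.nn.Module,
                  pretrained_params: dict, lam: float = 0.01) -> torch.Tensor:
    """Task loss plus an L2 penalty that keeps the fine-tuned weights close to the
    pretrained cross-lingual model (a generic weight-anchoring regularizer)."""
    penalty = torch.zeros((), device=task_loss.device)
    for name, param in model.named_parameters():
        if name in pretrained_params:
            anchor = pretrained_params[name].to(param.device)
            penalty = penalty + ((param - anchor) ** 2).sum()
    return task_loss + lam * penalty

# Snapshot taken once, before fine-tuning starts:
# pretrained_params = {n: p.detach().clone() for n, p in model.named_parameters()}
```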