Modelling Latent Translations for Cross-Lingual Transfer
- URL: http://arxiv.org/abs/2107.11353v1
- Date: Fri, 23 Jul 2021 17:11:27 GMT
- Title: Modelling Latent Translations for Cross-Lingual Transfer
- Authors: Edoardo Maria Ponti, Julia Kreutzer, Ivan Vulić, and Siva Reddy
- Abstract summary: We propose a new technique that integrates both steps of the traditional pipeline (translation and classification) into a single model.
We evaluate our novel latent translation-based model on a series of multilingual NLU tasks.
We report gains for both zero-shot and few-shot learning setups, up to 2.7 accuracy points on average.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While achieving state-of-the-art results in multiple tasks and languages,
translation-based cross-lingual transfer is often overlooked in favour of
massively multilingual pre-trained encoders. Arguably, this is due to its main
limitations: 1) translation errors percolating to the classification phase and
2) the insufficient expressiveness of the maximum-likelihood translation. To
remedy this, we propose a new technique that integrates both steps of the
traditional pipeline (translation and classification) into a single model, by
treating the intermediate translations as a latent random variable. As a
result, 1) the neural machine translation system can be fine-tuned with a
variant of Minimum Risk Training where the reward is the accuracy of the
downstream task classifier. Moreover, 2) multiple samples can be drawn to
approximate the expected loss across all possible translations during
inference. We evaluate our novel latent translation-based model on a series of
multilingual NLU tasks, including commonsense reasoning, paraphrase
identification, and natural language inference. We report gains for both
zero-shot and few-shot learning setups, up to 2.7 accuracy points on average,
which are even more prominent for low-resource languages (e.g., Haitian
Creole). Finally, we carry out in-depth analyses comparing different underlying
NMT models and assessing the impact of alternative translations on the
downstream performance.
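A minimal sketch of the two ingredients described above, assuming a hypothetical NMT interface `nmt.sample(src)` that returns a (translation, log-probability) pair and a `classifier` that maps a translation to a class distribution; neither is the authors' actual code:

```python
import torch

K = 8  # number of translation samples drawn per input

def sample_translations(nmt, src, k=K):
    # Draw k candidate translations with their log-probabilities.
    return [nmt.sample(src) for _ in range(k)]  # [(tokens, log_prob), ...]

def expected_prediction(nmt, classifier, src):
    # Inference: approximate E_{t ~ p(t|src)}[p(y|t)] by averaging the
    # classifier's distributions over the sampled translations.
    samples = sample_translations(nmt, src)
    probs = torch.stack([classifier(t) for t, _ in samples])  # (K, n_classes)
    return probs.mean(dim=0)

def mrt_loss(nmt, classifier, src, gold_label):
    # Training: a Minimum Risk Training variant where the reward is the
    # downstream classifier's accuracy; the score-function (REINFORCE)
    # estimator passes a gradient through the discrete sampling step.
    samples = sample_translations(nmt, src)
    log_probs = torch.stack([lp for _, lp in samples])             # (K,)
    rewards = torch.tensor([float(classifier(t).argmax() == gold_label)
                            for t, _ in samples])                  # (K,)
    baseline = rewards.mean()  # simple variance-reduction baseline
    return -((rewards - baseline) * log_probs).mean()
```

At test time, `expected_prediction` realises the second point above: averaging over several sampled translations rather than trusting a single maximum-likelihood output.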
Related papers
- Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing [68.47787275021567]
Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data.
We propose a new approach to cross-lingual semantic parsing by explicitly minimizing cross-lingual divergence between latent variables using Optimal Transport.
arXiv Detail & Related papers (2023-07-09T04:52:31Z)
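The alignment idea in the entry above can be illustrated with a generic entropy-regularized OT (Sinkhorn) routine; this is a textbook construction, not the paper's implementation, and `z_src`/`z_tgt` are assumed batches of latent vectors from the two languages:

```python
import numpy as np

def sinkhorn_ot_cost(z_src, z_tgt, eps=0.1, iters=100):
    # Entropy-regularized OT cost between two point clouds of latents.
    C = np.linalg.norm(z_src[:, None, :] - z_tgt[None, :, :], axis=-1) ** 2
    C = C / C.max()                                   # normalise for stability
    n, m = C.shape
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)   # uniform marginals
    K = np.exp(-C / eps)
    u = np.ones(n)
    for _ in range(iters):                            # Sinkhorn iterations
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]                   # transport plan
    return float((P * C).sum())                       # cost to be minimised

# e.g. add lambda_ot * sinkhorn_ot_cost(z_en, z_xx) to the parsing loss
rng = np.random.default_rng(0)
print(sinkhorn_ot_cost(rng.normal(size=(5, 16)), rng.normal(size=(6, 16))))
```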
- Revisiting Machine Translation for Cross-lingual Classification [91.43729067874503]
Most research in the area focuses on the multilingual models rather than the Machine Translation component.
We show that, by using a stronger MT system and mitigating the mismatch between training on original text and running inference on machine translated text, translate-test can do substantially better than previously assumed.
arXiv Detail & Related papers (2023-05-23T16:56:10Z)
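A schematic of the translate-test recipe and one plausible way to mitigate the train/test mismatch noted above, round-tripping the English training data through a pivot so the classifier also trains on MT output; `mt` is a hypothetical translation function, not the paper's system:

```python
def translate_test(mt, classifier_en, x_foreign, src_lang):
    # Translate-test: translate the input to English, classify in English.
    return classifier_en(mt(x_foreign, src=src_lang, tgt="en"))

def roundtrip_train_set(mt, train_en, pivot="fr"):
    # Round-trip the English training text through a pivot language so
    # that training inputs look like MT output, as test inputs will.
    return [(mt(mt(x, src="en", tgt=pivot), src=pivot, tgt="en"), y)
            for x, y in train_en]
```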
- Improving Multilingual Translation by Representation and Gradient Regularization [82.42760103045083]
We propose a joint approach to regularize NMT models at both representation-level and gradient-level.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
arXiv Detail & Related papers (2021-09-10T10:52:21Z)
- Rethinking Zero-shot Neural Machine Translation: From a Perspective of Latent Variables [28.101782382170306]
We introduce a denoising autoencoder objective based on the pivot language into the traditional training objective to improve translation accuracy in zero-shot directions.
We demonstrate that the proposed method effectively eliminates spurious correlations and significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-09-10T07:18:53Z)
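One way to read the combined objective from the entry above: standard NMT cross-entropy plus a denoising-autoencoder term on the pivot (e.g., English) side. The weight `lam`, the `dae_noise` corruption function, and the `nmt.loss` interface are illustrative assumptions:

```python
def combined_loss(nmt, dae_noise, batch, lam=1.0):
    # L = L_NMT(src -> tgt) + lam * L_DAE(noise(pivot) -> pivot).
    # The DAE term pushes the decoder to rely on sentence semantics rather
    # than spurious source-language cues, which helps zero-shot directions.
    l_nmt = nmt.loss(batch["src"], batch["tgt"])
    l_dae = nmt.loss(dae_noise(batch["pivot"]), batch["pivot"])
    return l_nmt + lam * l_dae
```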
- Distributionally Robust Multilingual Machine Translation [94.51866646879337]
We propose a new learning objective for multilingual neural machine translation (MNMT) based on distributionally robust optimization.
We show how to practically optimize this objective for large translation corpora using an iterated best response scheme.
Our method consistently outperforms strong baseline methods in terms of average and per-language performance under both many-to-one and one-to-many translation settings.
arXiv Detail & Related papers (2021-09-09T03:48:35Z)
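A generic sketch of distributionally robust weighting over language pairs, with the adversary's best response written as an exponential tilting of per-language losses; this is one standard instantiation, not necessarily the paper's exact scheme:

```python
import numpy as np

def best_response_weights(per_lang_losses, temperature=1.0):
    # Adversary's step: shift weight toward the worst-performing languages
    # (for a KL-constrained uncertainty set this is a softmax over losses).
    w = np.exp(np.asarray(per_lang_losses) / temperature)
    return w / w.sum()

def dro_objective(per_lang_losses, weights):
    # Model's step: minimise the adversarially weighted training loss.
    return float(np.dot(weights, per_lang_losses))

losses = [0.9, 2.1, 1.4]              # e.g. per-language dev losses
w = best_response_weights(losses)
print(w, dro_objective(losses, w))    # iterate the two steps during training
```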
- Verdi: Quality Estimation and Error Detection for Bilingual Corpora [23.485380293716272]
Verdi is a novel framework for word-level and sentence-level post-editing effort estimation for bilingual corpora.
We exploit the symmetric nature of bilingual corpora and apply model-level dual learning in the NMT predictor.
Our method beats the winner of the competition and outperforms other baseline methods by a large margin.
arXiv Detail & Related papers (2021-05-31T11:04:13Z)
- Detecting Fine-Grained Cross-Lingual Semantic Divergences without Supervision by Learning to Rank [28.910206570036593]
This work improves the prediction and annotation of fine-grained semantic divergences.
We introduce a training strategy for multilingual BERT models by learning to rank synthetic divergent examples of varying granularity.
Learning to rank helps detect fine-grained sentence-level divergences more accurately than a strong sentence-level similarity model.
arXiv Detail & Related papers (2020-10-07T21:26:20Z)
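The learning-to-rank idea above can be sketched with a standard margin ranking loss: synthetic pairs with coarser (stronger) divergence should score higher than finer-grained ones. The scores are assumed to come from a multilingual BERT regression head; the margin and the example values are placeholders:

```python
import torch

def rank_loss(score_fine, score_coarse, margin=0.2):
    # Margin ranking loss: penalise cases where a mildly divergent pair
    # scores within `margin` of (or above) a strongly divergent pair.
    return torch.clamp(margin + score_fine - score_coarse, min=0).mean()

fine = torch.tensor([0.3, 0.5])    # scores for mildly divergent examples
coarse = torch.tensor([0.9, 0.4])  # scores for strongly divergent examples
print(rank_loss(fine, coarse))     # tensor(0.1500)
```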
- Translation Artifacts in Cross-lingual Transfer Learning [51.66536640084888]
We show that machine translation can introduce subtle artifacts that have a notable impact on existing cross-lingual models.
In natural language inference, translating the premise and the hypothesis independently can reduce the lexical overlap between them.
We also improve the state-of-the-art in XNLI for the translate-test and zero-shot approaches by 4.3 and 2.8 points, respectively.
arXiv Detail & Related papers (2020-04-09T17:54:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.