Improving Zero-shot Multilingual Neural Machine Translation by Leveraging Cross-lingual Consistency Regularization
- URL: http://arxiv.org/abs/2305.07310v1
- Date: Fri, 12 May 2023 08:32:18 GMT
- Title: Improving Zero-shot Multilingual Neural Machine Translation by Leveraging Cross-lingual Consistency Regularization
- Authors: Pengzhi Gao, Liwen Zhang, Zhongjun He, Hua Wu, Haifeng Wang
- Abstract summary: The multilingual neural machine translation (NMT) model has a promising capability of zero-shot translation.
This paper introduces a cross-lingual consistency regularization, CrossConST, to bridge the representation gap among different languages.
- Score: 46.09132547431629
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The multilingual neural machine translation (NMT) model has a promising capability of zero-shot translation: it can directly translate between language pairs unseen during training. For good transfer from supervised directions to zero-shot directions, the multilingual NMT model is expected to learn universal representations across different languages. This paper introduces a cross-lingual consistency regularization, CrossConST, to bridge the representation gap among different languages and boost zero-shot translation performance. The theoretical analysis shows that CrossConST implicitly maximizes the probability of zero-shot translation, and experimental results on both low-resource and high-resource benchmarks show that CrossConST consistently improves translation performance. The experimental analysis also shows that CrossConST closes the sentence representation gap and better aligns the representation space. Given the universality and simplicity of CrossConST, we believe it can serve as a strong baseline for future multilingual NMT research.
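As a concrete illustration of the idea, here is a minimal PyTorch sketch of a CrossConST-style objective, assuming a seq2seq `model(encoder_input, decoder_input)` that returns per-token logits; the exact masking, stop-gradient choices, and weighting in the paper may differ.

```python
import torch.nn.functional as F

def crossconst_loss(model, src, tgt_as_src, tgt_in, tgt_out,
                    alpha=1.0, pad_id=0):
    # Supervised cross-entropy on the real pair (src -> tgt).
    logits_xy = model(src, tgt_in)                    # (batch, tgt_len, vocab)
    ce = F.cross_entropy(logits_xy.transpose(1, 2), tgt_out,
                         ignore_index=pad_id)

    # Output distribution for the copy pair (tgt -> tgt): same decoder
    # prefix, but the encoder reads the target sentence itself.
    logits_yy = model(tgt_as_src, tgt_in)

    # Token-level KL consistency pulling the source-conditioned
    # distribution toward the copy-pair one (padding positions should be
    # masked in practice; omitted here for brevity).
    kl = F.kl_div(F.log_softmax(logits_xy, dim=-1),
                  F.softmax(logits_yy, dim=-1),
                  reduction="batchmean")
    return ce + alpha * kl
```

Minimizing the KL term encourages the encoder to map a sentence and its translation to representations that induce the same output distribution, which is the bridging effect the abstract describes.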
Related papers
- Towards Boosting Many-to-Many Multilingual Machine Translation with Large Language Models [47.39529535727593]
This paper focuses on boosting many-to-many multilingual translation of large language models (LLMs) with an emphasis on zero-shot translation directions.
We introduce a cross-lingual consistency regularization, XConST, to bridge the representation gap among different languages.
Experimental results on ALMA, Tower, and LLaMA-2 show that our approach consistently improves translation performance.
arXiv Detail & Related papers (2024-01-11T12:11:30Z)
- Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing [68.47787275021567]
Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data.
We propose a new approach to cross-lingual semantic parsing by explicitly minimizing cross-lingual divergence between latent variables using Optimal Transport.
arXiv Detail & Related papers (2023-07-09T04:52:31Z)
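The summary above does not spell out the optimal-transport formulation, so the following is only a generic sketch of how a cross-lingual divergence between two sets of latent vectors could be computed with log-domain Sinkhorn iterations; the squared-L2 cost, `eps`, and `iters` are assumptions.

```python
import math
import torch

def sinkhorn_ot_cost(x, y, eps=0.1, iters=100):
    """Entropy-regularized OT cost between latent vectors x: (n, d) and
    y: (m, d) with uniform marginals, computed in the log domain."""
    C = torch.cdist(x, y) ** 2                       # pairwise squared-L2 cost
    n, m = C.shape
    log_a = torch.full((n,), -math.log(n))           # uniform source marginal
    log_b = torch.full((m,), -math.log(m))           # uniform target marginal
    log_K = -C / eps                                 # Gibbs kernel (log domain)
    log_u, log_v = torch.zeros(n), torch.zeros(m)
    for _ in range(iters):                           # Sinkhorn fixed point
        log_u = log_a - torch.logsumexp(log_K + log_v[None, :], dim=1)
        log_v = log_b - torch.logsumexp(log_K + log_u[:, None], dim=0)
    log_P = log_u[:, None] + log_K + log_v[None, :]  # transport plan
    return (log_P.exp() * C).sum()                   # differentiable OT cost
```

A loss of this form, applied to latent variables from two languages, would directly penalize the cross-lingual divergence in representation space.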
- Improving Zero-Shot Multilingual Translation with Universal Representations and Cross-Mappings [23.910477693942905]
Improved zero-shot translation requires the model to learn universal representations and cross-mapping relationships.
We propose the state mover's distance, based on optimal transport theory, to measure the difference between the representations output by the encoder.
We propose an agreement-based training scheme, which can help the model make consistent predictions.
arXiv Detail & Related papers (2022-10-28T02:47:05Z)
- Improving Multilingual Translation by Representation and Gradient Regularization [82.42760103045083]
We propose a joint approach that regularizes NMT models at both the representation level and the gradient level.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
arXiv Detail & Related papers (2021-09-10T10:52:21Z)
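The summary above does not describe the gradient-level regularizer itself, so as a plainly swapped-in stand-in for the general idea, here is a PCGrad-style projection (Yu et al., 2020) that removes conflicting gradient components between two language pairs; this illustrates gradient-level regularization but is not the paper's method.

```python
import torch

def project_conflicting(grad_a, grad_b, eps=1e-12):
    """If the flattened gradients of two language pairs conflict (negative
    inner product), remove from grad_a its component along grad_b."""
    dot = torch.dot(grad_a, grad_b)
    if dot < 0:
        grad_a = grad_a - (dot / (grad_b.norm() ** 2 + eps)) * grad_b
    return grad_a
```

In a multilingual setting, per-language-pair gradients would be flattened, projected pairwise, and averaged before the optimizer step.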
- Rethinking Zero-shot Neural Machine Translation: From a Perspective of Latent Variables [28.101782382170306]
We add a denoising autoencoder objective on the pivot language to the traditional training objective to improve translation accuracy on zero-shot directions.
We demonstrate that the proposed method effectively eliminates spurious correlations and significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-09-10T07:18:53Z)
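A minimal sketch of the denoising objective described in the entry above, assuming token dropout as the corruption and a seq2seq `model(encoder_input, decoder_input)` that returns per-token logits; the paper's actual noise function and mixing weight are not given in the summary.

```python
import torch
import torch.nn.functional as F

def dae_pivot_loss(model, pivot, pivot_dec_in, pivot_dec_out,
                   drop_prob=0.1, pad_id=0):
    """Denoising autoencoder term on pivot-language sentences: corrupt the
    encoder input and train the model to reconstruct the sentence.
    pivot_dec_in/pivot_dec_out are the shifted decoder input and targets."""
    noisy = pivot.clone()
    drop = torch.rand(noisy.shape, device=noisy.device) < drop_prob
    noisy[drop & (noisy != pad_id)] = pad_id           # crude token dropout
    logits = model(noisy, pivot_dec_in)                # (batch, len, vocab)
    return F.cross_entropy(logits.transpose(1, 2), pivot_dec_out,
                           ignore_index=pad_id)
```

The full objective would add this term to the usual translation loss, e.g. L = L_mt + lambda * L_dae.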
- Modelling Latent Translations for Cross-Lingual Transfer [47.61502999819699]
We propose a new technique that integrates both steps of the traditional pipeline (translation and classification) into a single model.
We evaluate our novel latent translation-based model on a series of multilingual NLU tasks.
We report gains for both zero-shot and few-shot learning setups, up to 2.7 accuracy points on average.
arXiv Detail & Related papers (2021-07-23T17:11:27Z)
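A minimal sketch of treating the translation as a latent variable, as the entry above describes: sample a few translations into the high-resource language and average the classifier's predictions, a Monte Carlo approximation of p(label | x) = E_{t ~ p(t|x)} p(label | t). `translator.sample` and `classifier.predict_proba` are assumed interfaces, not a real API.

```python
import torch

def classify_via_latent_translation(translator, classifier, sentence, k=5):
    # Each sampled translation t ~ p(t | x) is scored by the downstream
    # classifier; averaging the probabilities marginalizes over t.
    probs = [classifier.predict_proba(translator.sample(sentence))
             for _ in range(k)]
    return torch.stack(probs).mean(dim=0)
```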
- Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation [81.7786241489002]
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations.
We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics.
We propose random online backtranslation to enforce the translation of unseen training language pairs.
arXiv Detail & Related papers (2020-04-24T17:21:32Z)
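A sketch of the random online backtranslation idea from the entry above: for each training pair, the current model translates the target side into a randomly chosen language, creating synthetic examples for directions never seen in the parallel data. `model.translate` is an assumed interface.

```python
import random

def robt_augment(model, batch, languages):
    """batch: iterable of (src, tgt, tgt_lang) examples."""
    synthetic = []
    for src, tgt, tgt_lang in batch:
        # Pick a random intermediate language and back-translate on the fly.
        mid_lang = random.choice([l for l in languages if l != tgt_lang])
        pseudo_src = model.translate(tgt, to_lang=mid_lang)
        synthetic.append((pseudo_src, tgt, tgt_lang))
    return list(batch) + synthetic    # train on real + synthetic pairs
```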
- Translation Artifacts in Cross-lingual Transfer Learning [51.66536640084888]
We show that machine translation can introduce subtle artifacts that have a notable impact on existing cross-lingual models.
In natural language inference, translating the premise and the hypothesis independently can reduce the lexical overlap between them.
We also improve the state-of-the-art in XNLI for the translate-test and zero-shot approaches by 4.3 and 2.8 points, respectively.
arXiv Detail & Related papers (2020-04-09T17:54:30Z)
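The lexical-overlap effect in the entry above is straightforward to quantify; a minimal sketch (whitespace tokenization is a simplification):

```python
def lexical_overlap(premise, hypothesis):
    """Fraction of hypothesis tokens that also appear in the premise.
    Translating the two sides independently tends to lower this value,
    which NLI models turn out to be sensitive to."""
    prem = set(premise.lower().split())
    hyp = hypothesis.lower().split()
    return sum(tok in prem for tok in hyp) / max(len(hyp), 1)
```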