Improving Zero-Shot Multilingual Translation with Universal Representations and Cross-Mappings
- URL: http://arxiv.org/abs/2210.15851v1
- Date: Fri, 28 Oct 2022 02:47:05 GMT
- Authors: Shuhao Gu, Yang Feng
- Abstract summary: Improved zero-shot translation requires the model to learn universal representations and cross-mapping relationships.
We propose the state mover's distance, based on optimal transport theory, to model the differences between the representations output by the encoder.
We also propose an agreement-based training scheme that helps the model make consistent predictions.
- Score: 23.910477693942905
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A many-to-many multilingual neural machine translation model can translate between language pairs unseen during training, i.e., perform zero-shot translation. Improving zero-shot translation requires the model to learn universal representations and cross-mapping relationships, so that the knowledge learned on the supervised directions transfers to the zero-shot directions. In this work, we propose the state mover's distance, based on optimal transport theory, to model the differences between the representations output by the encoder. We then bridge the gap between the semantically equivalent representations of different languages at the token level by minimizing the proposed distance, so that the model learns universal representations. In addition, we propose an agreement-based training scheme that helps the model make consistent predictions on semantically equivalent sentences, learning universal cross-mapping relationships for all translation directions. Experimental results on diverse multilingual datasets show that our method yields consistent improvements over the baseline system and other competing methods. The analysis shows that our method better aligns the semantic space and improves prediction consistency.
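The method thus adds two auxiliary losses to standard multilingual NMT training. The sketch below is a minimal, hypothetical rendering of that recipe in PyTorch, not the authors' exact formulation: `state_movers_distance` computes an entropic optimal-transport (Sinkhorn) distance between the encoder states of two semantically equivalent sentences, and `agreement_loss` is a symmetric-KL penalty on the decoder's predictions for the same target given either source. The function names, the uniform token mass, and the Euclidean token cost are illustrative assumptions.

```python
import math

import torch
import torch.nn.functional as F


def state_movers_distance(h_x: torch.Tensor, h_y: torch.Tensor,
                          eps: float = 0.1, iters: int = 50) -> torch.Tensor:
    """Entropic OT distance between encoder states h_x (n x d) and h_y (m x d),
    with uniform mass on tokens; a generic Sinkhorn sketch."""
    cost = torch.cdist(h_x, h_y, p=2)          # pairwise token transport cost (n x m)
    n, m = cost.shape
    log_K = -cost / eps                        # Gibbs kernel in log space
    log_a = torch.full((n, 1), -math.log(n))   # uniform source marginal
    log_b = torch.full((1, m), -math.log(m))   # uniform target marginal
    log_u = torch.zeros(n, 1)
    log_v = torch.zeros(1, m)
    for _ in range(iters):                     # Sinkhorn fixed-point updates
        log_u = log_a - torch.logsumexp(log_K + log_v, dim=1, keepdim=True)
        log_v = log_b - torch.logsumexp(log_K + log_u, dim=0, keepdim=True)
    log_P = log_u + log_K + log_v              # log of the transport plan
    return (log_P.exp() * cost).sum()          # <P, C>: total transport cost


def agreement_loss(logits_x: torch.Tensor, logits_x_prime: torch.Tensor) -> torch.Tensor:
    """Symmetric KL between the next-token distributions obtained by decoding
    the same target from two semantically equivalent sources (len x vocab)."""
    log_p = F.log_softmax(logits_x, dim=-1)
    log_q = F.log_softmax(logits_x_prime, dim=-1)
    kl_pq = F.kl_div(log_q, log_p, log_target=True, reduction="batchmean")
    kl_qp = F.kl_div(log_p, log_q, log_target=True, reduction="batchmean")
    return 0.5 * (kl_pq + kl_qp)
```

In training, both terms would be weighted and added to the usual translation cross-entropy, e.g. `loss = nll + lambda_ot * smd + lambda_agree * agree`; the weights here are hypothetical hyperparameters, not values from the paper.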
Related papers
- Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing [68.47787275021567]
Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data.
We propose a new approach to cross-lingual semantic parsing by explicitly minimizing cross-lingual divergence between latent variables using Optimal Transport.
arXiv Detail & Related papers (2023-07-09T04:52:31Z)
- Improving Zero-shot Multilingual Neural Machine Translation by Leveraging Cross-lingual Consistency Regularization [46.09132547431629]
The multilingual neural machine translation (NMT) model has a promising capability of zero-shot translation.
This paper introduces a cross-lingual consistency regularization, CrossConST, to bridge the representation gap among different languages.
arXiv Detail & Related papers (2023-05-12T08:32:18Z)
- VECO 2.0: Cross-lingual Language Model Pre-training with Multi-granularity Contrastive Learning [56.47303426167584]
We propose a cross-lingual pre-trained model VECO2.0 based on contrastive learning with multi-granularity alignments.
Specifically, a sequence-to-sequence alignment is induced to maximize the similarity of parallel pairs and minimize that of non-parallel pairs (a minimal sketch of this kind of contrastive alignment loss follows the list below).
Token-to-token alignment is integrated to pull synonymous tokens, excavated via a thesaurus dictionary, toward each other and away from the other unpaired tokens in a bilingual instance.
arXiv Detail & Related papers (2023-04-17T12:23:41Z)
- Robust Unsupervised Cross-Lingual Word Embedding using Domain Flow Interpolation [48.32604585839687]
Previous adversarial approaches have shown promising results in inducing cross-lingual word embeddings without parallel data.
We propose to make use of a sequence of intermediate spaces for smooth bridging.
arXiv Detail & Related papers (2022-10-07T04:37:47Z)
- Improving Multilingual Translation by Representation and Gradient Regularization [82.42760103045083]
We propose a joint approach that regularizes NMT models at both the representation level and the gradient level.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
arXiv Detail & Related papers (2021-09-10T10:52:21Z)
- Rethinking Zero-shot Neural Machine Translation: From a Perspective of Latent Variables [28.101782382170306]
We introduce a denoising autoencoder objective based on a pivot language into the traditional training objective to improve translation accuracy on zero-shot directions.
We demonstrate that the proposed method effectively eliminates the spurious correlations and significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2021-09-10T07:18:53Z)
- InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training [135.12061144759517]
We present an information-theoretic framework that formulates cross-lingual language model pre-training.
We propose a new pre-training task based on contrastive learning.
By leveraging both monolingual and parallel corpora, we jointly train the pretext tasks to improve the cross-lingual transferability of pre-trained models.
arXiv Detail & Related papers (2020-05-07T13:05:03Z)
- Fine-Grained Analysis of Cross-Linguistic Syntactic Divergences [18.19093600136057]
We propose a framework for extracting divergence patterns for any language pair from a parallel corpus.
We show that our framework provides a detailed picture of cross-language divergences, generalizes previous approaches, and lends itself to full automation.
arXiv Detail & Related papers (2020-04-09T17:54:30Z)
- Translation Artifacts in Cross-lingual Transfer Learning [51.66536640084888]
We show that machine translation can introduce subtle artifacts that have a notable impact on existing cross-lingual models.
In natural language inference, translating the premise and the hypothesis independently can reduce the lexical overlap between them.
We also improve the state-of-the-art in XNLI for the translate-test and zero-shot approaches by 4.3 and 2.8 points, respectively.
arXiv Detail & Related papers (2020-04-09T17:54:30Z)
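Several of the papers above share one underlying pattern: score parallel sentence pairs against in-batch negatives, as in VECO 2.0's sequence-to-sequence alignment and InfoXLM's contrastive pre-training task. Below is a minimal, generic InfoNCE-style sketch of that pattern in PyTorch; the function name, the pooled sentence representations, and the temperature value are illustrative assumptions, not either paper's exact objective.

```python
import torch
import torch.nn.functional as F


def contrastive_alignment_loss(src_repr: torch.Tensor, tgt_repr: torch.Tensor,
                               temperature: float = 0.05) -> torch.Tensor:
    """InfoNCE over a batch of B parallel pairs: src_repr and tgt_repr are
    (B x d) sentence representations whose i-th rows are translations of each
    other; every other pairing in the batch serves as a negative."""
    src = F.normalize(src_repr, dim=-1)     # unit-length representations
    tgt = F.normalize(tgt_repr, dim=-1)
    logits = src @ tgt.t() / temperature    # (B x B) cosine similarities
    labels = torch.arange(src.size(0))      # positives sit on the diagonal
    return F.cross_entropy(logits, labels)  # pull positives, push negatives
```

Maximizing the diagonal entries of the similarity matrix while minimizing the off-diagonal ones is exactly the "maximize parallel pairs, minimize non-parallel pairs" behavior the blurbs describe.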