Flow-Adapter Architecture for Unsupervised Machine Translation
- URL: http://arxiv.org/abs/2204.12225v1
- Date: Tue, 26 Apr 2022 11:00:32 GMT
- Title: Flow-Adapter Architecture for Unsupervised Machine Translation
- Authors: Yihong Liu, Haris Jabbar, Hinrich Schütze
- Abstract summary: We propose a flow-adapter architecture for unsupervised NMT.
We leverage normalizing flows to explicitly model the distributions of sentence-level latent representations.
This architecture allows for unsupervised training of each language independently.
- Score: 0.3093890460224435
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this work, we propose a flow-adapter architecture for unsupervised NMT. It leverages normalizing flows to explicitly model the distributions of sentence-level latent representations, which are subsequently used in conjunction with the attention mechanism for the translation task. The primary novelties of our model are: (a) capturing language-specific sentence representations separately for each language using normalizing flows and (b) using a simple transformation of these latent representations for translating from one language to another. This architecture allows for unsupervised training of each language independently. While there is prior work on latent variables for supervised MT, to the best of our knowledge, this is the first work that uses latent variables and normalizing flows for unsupervised MT. We obtain competitive results on several unsupervised MT benchmarks.
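The abstract describes the architecture only in words. Below is a minimal, illustrative PyTorch sketch of the core idea as summarized here: each language gets its own normalizing flow over a sentence-level latent, and translation applies a simple transformation that maps the source latent through its flow into a shared base space and back out through the target language's flow. The module names, coupling-layer design, dimensions, and the mean-pooled-encoder latent are assumptions made for illustration, not the authors' released implementation.

```python
# Minimal, illustrative sketch of the flow-adapter idea (assumptions noted above;
# not the authors' released code). Each language gets its own normalizing flow
# over a sentence-level latent; translation maps the source latent into a shared
# base space and back out through the target language's flow.
import torch
import torch.nn as nn


class AffineCoupling(nn.Module):
    """One RealNVP-style affine coupling layer acting on a latent vector."""

    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden), nn.ReLU(),
            nn.Linear(hidden, (dim - self.half) * 2),
        )

    def forward(self, z):
        # Predict scale/shift for the second half of z from the first half.
        z1, z2 = z[:, : self.half], z[:, self.half :]
        s, t = self.net(z1).chunk(2, dim=-1)
        s = torch.tanh(s)                       # keep scales well-behaved
        y2 = z2 * torch.exp(s) + t
        log_det = s.sum(dim=-1)                 # log|det Jacobian| of the coupling
        return torch.cat([z1, y2], dim=-1), log_det

    def inverse(self, y):
        y1, y2 = y[:, : self.half], y[:, self.half :]
        s, t = self.net(y1).chunk(2, dim=-1)
        s = torch.tanh(s)
        z2 = (y2 - t) * torch.exp(-s)
        return torch.cat([y1, z2], dim=-1)


class LanguageFlow(nn.Module):
    """A small stack of coupling layers modelling one language's latent distribution."""

    def __init__(self, dim: int, n_layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList([AffineCoupling(dim) for _ in range(n_layers)])

    def forward(self, z):
        # Map a language-specific latent to the shared base space (e.g. N(0, I)),
        # accumulating the log-determinant needed for the flow's training loss.
        log_det = torch.zeros(z.size(0), device=z.device)
        for layer in self.layers:
            z, ld = layer(z)
            log_det = log_det + ld
        return z, log_det

    def inverse(self, u):
        # Map a base-space vector back to a language-specific latent.
        for layer in reversed(self.layers):
            u = layer.inverse(u)
        return u


def translate_latent(z_src, flow_src: LanguageFlow, flow_tgt: LanguageFlow):
    """The 'simple transformation' between languages: source latent -> base space
    via the source flow, then base space -> target latent via the target flow."""
    u, _ = flow_src(z_src)
    return flow_tgt.inverse(u)


if __name__ == "__main__":
    dim = 512
    flow_en, flow_de = LanguageFlow(dim), LanguageFlow(dim)
    z_en = torch.randn(8, dim)   # stand-in for a mean-pooled encoder sentence latent
    z_de = translate_latent(z_en, flow_en, flow_de)
    print(z_de.shape)            # torch.Size([8, 512])
```

Because each LanguageFlow in this sketch only needs monolingual sentences to fit its latent distribution, the two flows can be trained independently, which is consistent with the abstract's claim of unsupervised training of each language independently.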
Related papers
- Towards Zero-Shot Multimodal Machine Translation [64.9141931372384]
We propose a method to bypass the need for fully supervised data to train multimodal machine translation systems.
Our method, called ZeroMMT, consists in adapting a strong text-only machine translation (MT) model by training it on a mixture of two objectives.
To prove that our method generalizes to languages with no fully supervised training data available, we extend the CoMMuTE evaluation dataset to three new languages: Arabic, Russian and Chinese.
arXiv Detail & Related papers (2024-07-18T15:20:31Z)
- CLIPTrans: Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation [31.911593690549633]
Multimodal machine translation (MMT) systems enhance neural machine translation (NMT) with visual knowledge.
Previous works face a challenge in training powerful MMT models from scratch due to the scarcity of annotated multilingual vision-language data.
We propose CLIPTrans, which simply adapts the independently pre-trained multimodal M-CLIP and the multilingual mBART.
arXiv Detail & Related papers (2023-08-29T11:29:43Z)
- Bilingual Synchronization: Restoring Translational Relationships with Editing Operations [2.0411082897313984]
We consider a more general setting which assumes an initial target sequence that must be transformed into a valid translation of the source.
Our results suggest that one single generic edit-based system, once fine-tuned, can compare with, or even outperform, dedicated systems specifically trained for these tasks.
arXiv Detail & Related papers (2022-10-24T12:25:44Z)
- Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation [50.54059385277964]
We present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT).
CsaNMT augments each training instance with an adjacency region that could cover adequate variants of literal expression under the same meaning.
arXiv Detail & Related papers (2022-04-14T08:16:28Z)
- Improving Multilingual Translation by Representation and Gradient Regularization [82.42760103045083]
We propose a joint approach to regularize NMT models at both representation-level and gradient-level.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
arXiv Detail & Related papers (2021-09-10T10:52:21Z)
- Language Modeling, Lexical Translation, Reordering: The Training Process of NMT through the Lens of Classical SMT [64.1841519527504]
Neural machine translation uses a single neural network to model the entire translation process.
Although neural machine translation is the de facto standard, it is still not clear how NMT models acquire different competences over the course of training.
arXiv Detail & Related papers (2021-09-03T09:38:50Z)
- Exploring Unsupervised Pretraining Objectives for Machine Translation [99.5441395624651]
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT).
Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder.
We compare masking with alternative objectives that produce inputs resembling real (full) sentences by reordering and replacing words based on their context; a toy sketch of the masking and reordering input styles appears after this list.
arXiv Detail & Related papers (2021-06-10T10:18:23Z)
- Variational Neural Machine Translation with Normalizing Flows [13.537869825364718]
Variational Neural Machine Translation (VNMT) is an attractive framework for modeling the generation of target translations.
We propose to apply the VNMT framework to the state-of-the-art Transformer and introduce a more flexible approximate posterior based on normalizing flows.
arXiv Detail & Related papers (2020-05-28T13:30:53Z)
- Cross-lingual Supervision Improves Unsupervised Neural Machine Translation [97.84871088440102]
We introduce a multilingual unsupervised NMT framework to leverage weakly supervised signals from high-resource language pairs to zero-resource translation directions.
The method significantly improves translation quality by more than 3 BLEU points on six benchmark unsupervised translation directions.
arXiv Detail & Related papers (2020-04-07T05:46:49Z)
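As a small illustration of the "Exploring Unsupervised Pretraining Objectives" entry above, the toy sketch below contrasts the two input-corruption styles it mentions: masking parts of the input for denoising reconstruction versus producing inputs that still resemble full sentences by locally reordering words (context-based word replacement is omitted for brevity). The function names, mask ratio, and window size are hypothetical and not taken from that paper.

```python
# Toy illustration (not from the cited paper) of two pretraining input styles:
# span/token masking vs. local reordering that keeps the input looking like a
# full sentence.
import random

random.seed(0)


def mask_tokens(tokens, mask_ratio=0.3, mask_token="<mask>"):
    """Replace a random subset of tokens with a mask symbol; the decoder would be
    trained to reconstruct the original sequence."""
    n_mask = max(1, int(len(tokens) * mask_ratio))
    positions = set(random.sample(range(len(tokens)), n_mask))
    return [mask_token if i in positions else tok for i, tok in enumerate(tokens)]


def local_reorder(tokens, window=3):
    """Shuffle tokens only within small local windows, so the corrupted input
    still resembles a real (full) sentence rather than a masked one."""
    out = list(tokens)
    for start in range(0, len(out), window):
        chunk = out[start:start + window]
        random.shuffle(chunk)
        out[start:start + window] = chunk
    return out


sentence = "the flow adapter maps sentence latents between languages".split()
print(mask_tokens(sentence))
print(local_reorder(sentence))
```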