Language Modeling, Lexical Translation, Reordering: The Training Process
of NMT through the Lens of Classical SMT
- URL: http://arxiv.org/abs/2109.01396v1
- Date: Fri, 3 Sep 2021 09:38:50 GMT
- Title: Language Modeling, Lexical Translation, Reordering: The Training Process
of NMT through the Lens of Classical SMT
- Authors: Elena Voita, Rico Sennrich, Ivan Titov
- Abstract summary: Neural machine translation (NMT) uses a single neural network to model the entire translation process.
Although NMT is the de facto standard, it is still not clear how NMT models acquire different competences over the course of training.
- Score: 64.1841519527504
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differently from the traditional statistical MT that decomposes the
translation task into distinct separately learned components, neural machine
translation uses a single neural network to model the entire translation
process. Despite neural machine translation being de-facto standard, it is
still not clear how NMT models acquire different competences over the course of
training, and how this mirrors the different models in traditional SMT. In this
work, we look at the competences related to three core SMT components and find
that during training, NMT first focuses on learning target-side language
modeling, then improves translation quality approaching word-by-word
translation, and finally learns more complicated reordering patterns. We show
that this behavior holds for several models and language pairs. Additionally,
we explain how such an understanding of the training process can be useful in
practice and, as an example, show how it can be used to improve vanilla
non-autoregressive neural machine translation by guiding teacher model
selection.
Related papers
- On the Shortcut Learning in Multilingual Neural Machine Translation [95.30470845501141]
This study revisits the commonly cited off-target issue in multilingual neural machine translation (MNMT).
We attribute the off-target issue to the overfitting of the shortcuts of (non-centric, centric) language mappings.
Analyses on learning dynamics show that the shortcut learning generally occurs in the later stage of model training.
arXiv Detail & Related papers (2024-11-15T21:09:36Z)
- Exploring Unsupervised Pretraining Objectives for Machine Translation [99.5441395624651]
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT).
Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder.
We compare masking with alternative objectives that produce inputs resembling real (full) sentences, by reordering and replacing words based on their context.
arXiv Detail & Related papers (2021-06-10T10:18:23Z)
- Self-supervised and Supervised Joint Training for Resource-rich Machine Translation [30.502625878505732]
Self-supervised pre-training of text representations has been successfully applied to low-resource Neural Machine Translation (NMT).
We propose a joint training approach, $F$-XEnDec, to combine self-supervised and supervised learning to optimize NMT models.
arXiv Detail & Related papers (2021-06-08T02:35:40Z)
- An empirical analysis of phrase-based and neural machine translation [0.0]
Two popular types of machine translation (MT) are phrase-based and neural machine translation systems.
We study the behavior of important models in both phrase-based and neural MT systems.
arXiv Detail & Related papers (2021-03-04T15:28:28Z)
- Assessing the Bilingual Knowledge Learned by Neural Machine Translation Models [72.56058378313963]
We bridge the gap by assessing the bilingual knowledge learned by NMT models with a phrase table.
We find that NMT models learn patterns from simple to complex and distill essential bilingual knowledge from the training examples.
arXiv Detail & Related papers (2020-04-28T03:44:34Z)
- Neural Machine Translation: Challenges, Progress and Future [62.75523637241876]
Machine translation (MT) is a technique that leverages computers to translate human languages automatically.
Neural machine translation (NMT) models a direct mapping between source and target languages with deep neural networks.
This article reviews the NMT framework, discusses the challenges in NMT, and introduces some exciting recent progress.
arXiv Detail & Related papers (2020-04-13T07:53:57Z)
- Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation [59.191079800436114]
Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence.
We propose a new framework to model cross-sentence dependencies by training neural machine translation (NMT) to predict both the target translation and surrounding sentences of a source sentence.
arXiv Detail & Related papers (2020-03-30T03:38:01Z)