Language Modeling, Lexical Translation, Reordering: The Training Process
of NMT through the Lens of Classical SMT
- URL: http://arxiv.org/abs/2109.01396v1
- Date: Fri, 3 Sep 2021 09:38:50 GMT
- Title: Language Modeling, Lexical Translation, Reordering: The Training Process
of NMT through the Lens of Classical SMT
- Authors: Elena Voita, Rico Sennrich, Ivan Titov
- Abstract summary: Neural machine translation uses a single neural network to model the entire translation process.
Despite neural machine translation being the de facto standard, it is still not clear how NMT models acquire different competences over the course of training.
- Score: 64.1841519527504
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unlike traditional statistical MT, which decomposes the
translation task into distinct, separately learned components, neural machine
translation uses a single neural network to model the entire translation
process. Despite neural machine translation being the de facto standard, it is
still not clear how NMT models acquire different competences over the course of
training, and how this mirrors the different models in traditional SMT. In this
work, we look at the competences related to three core SMT components and find
that during training, NMT first focuses on learning target-side language
modeling, then improves translation quality approaching word-by-word
translation, and finally learns more complicated reordering patterns. We show
that this behavior holds for several models and language pairs. Additionally,
we explain how such an understanding of the training process can be useful in
practice and, as an example, show how it can be used to improve vanilla
non-autoregressive neural machine translation by guiding teacher model
selection.
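The abstract describes tracking how closely NMT predictions match target-side language modeling versus word-by-word translation over the course of training. One simple way to quantify such proximity (a sketch only, not the authors' exact procedure; the function names and toy distributions below are illustrative) is an average per-token KL divergence between the NMT model's output distributions and a reference model's:

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions over the same vocabulary."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

def mean_kl_to_reference(nmt_dists, ref_dists):
    """Average per-token KL from NMT predictions to a reference model's predictions."""
    return float(np.mean([kl_divergence(p, q) for p, q in zip(nmt_dists, ref_dists)]))

# Toy example: predictions at 3 target positions over a 4-word vocabulary.
nmt = [[0.7, 0.1, 0.1, 0.1], [0.25, 0.25, 0.25, 0.25], [0.1, 0.8, 0.05, 0.05]]
lm  = [[0.6, 0.2, 0.1, 0.1], [0.3, 0.3, 0.2, 0.2], [0.15, 0.7, 0.1, 0.05]]
print(mean_kl_to_reference(nmt, lm))  # small value -> NMT behaves similarly to the LM
```

Computed across checkpoints, a curve of this score against a target-side LM and against a word-by-word baseline would let one see which competence the NMT model resembles at each training stage.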
Related papers
- Unified Model Learning for Various Neural Machine Translation [63.320005222549646]
Existing neural machine translation (NMT) studies mainly focus on developing dataset-specific models.
We propose a versatile model, i.e., Unified Model Learning for NMT (UMLNMT), that works with data from different tasks.
UMLNMT achieves substantial improvements over dataset-specific models with significantly reduced model deployment costs.
arXiv Detail & Related papers (2023-05-04T12:21:52Z)
- Exploring Unsupervised Pretraining Objectives for Machine Translation [99.5441395624651]
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT).
Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder.
We compare masking with alternative objectives that produce inputs resembling real (full) sentences, by reordering and replacing words based on their context.
arXiv Detail & Related papers (2021-06-10T10:18:23Z)
- Self-supervised and Supervised Joint Training for Resource-rich Machine Translation [30.502625878505732]
Self-supervised pre-training of text representations has been successfully applied to low-resource Neural Machine Translation (NMT).
We propose a joint training approach, $F^2$-XEnDec, to combine self-supervised and supervised learning to optimize NMT models.
arXiv Detail & Related papers (2021-06-08T02:35:40Z)
- An empirical analysis of phrase-based and neural machine translation [0.0]
Two popular types of machine translation (MT) are phrase-based and neural machine translation systems.
We study the behavior of important models in both phrase-based and neural MT systems.
arXiv Detail & Related papers (2021-03-04T15:28:28Z)
- Assessing the Bilingual Knowledge Learned by Neural Machine Translation Models [72.56058378313963]
We bridge the gap by assessing the bilingual knowledge learned by NMT models with a phrase table.
We find that NMT models learn patterns from simple to complex and distill essential bilingual knowledge from the training examples.
arXiv Detail & Related papers (2020-04-28T03:44:34Z)
- Neural Machine Translation: Challenges, Progress and Future [62.75523637241876]
Machine translation (MT) is a technique that leverages computers to translate human languages automatically.
Neural machine translation (NMT) models the direct mapping between source and target languages with deep neural networks.
This article reviews the NMT framework, discusses the challenges in NMT, and introduces some exciting recent progress.
arXiv Detail & Related papers (2020-04-13T07:53:57Z)
- Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation [59.191079800436114]
Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence.
We propose a new framework to model cross-sentence dependencies by training neural machine translation (NMT) to predict both the target translation and surrounding sentences of a source sentence.
arXiv Detail & Related papers (2020-03-30T03:38:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.