Assessing the Bilingual Knowledge Learned by Neural Machine Translation Models
- URL: http://arxiv.org/abs/2004.13270v1
- Date: Tue, 28 Apr 2020 03:44:34 GMT
- Title: Assessing the Bilingual Knowledge Learned by Neural Machine Translation Models
- Authors: Shilin He, Xing Wang, Shuming Shi, Michael R. Lyu, Zhaopeng Tu
- Abstract summary: We bridge the gap by assessing the bilingual knowledge learned by NMT models with a phrase table.
We find that NMT models learn patterns from simple to complex and distill essential bilingual knowledge from the training examples.
- Score: 72.56058378313963
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine translation (MT) systems translate text between different languages
by automatically learning in-depth knowledge of bilingual lexicons, grammar and
semantics from the training examples. Although neural machine translation (NMT)
has led the field of MT, we have a poor understanding of how and why it works.
In this paper, we bridge the gap by assessing the bilingual knowledge learned
by NMT models with a phrase table -- an interpretable table of bilingual
lexicons. We extract the phrase table from the training examples that an NMT
model correctly predicts. Extensive experiments on widely-used datasets show
that the phrase table is reasonable and consistent across language pairs and
random seeds. Equipped with the interpretable phrase table, we find that NMT
models learn patterns from simple to complex and distill essential bilingual
knowledge from the training examples. We also revisit some advances that
potentially affect the learning of bilingual knowledge (e.g.,
back-translation), and report some interesting findings. We believe this work
opens a new angle for interpreting NMT with statistical models, and provides
empirical support for recent advances in improving NMT models.
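The central mechanism is mining an interpretable phrase table from the training sentence pairs that the NMT model already predicts correctly. The abstract does not include code, so the following is only a minimal sketch of that idea under common SMT assumptions: word alignments come from an external aligner (e.g., fast_align), the tight-span variant of the classic phrase-extraction heuristic is used, and `correct_examples` is a hypothetical list of (source tokens, target tokens, alignment) triples for sentences the model got right.
```python
from collections import Counter
from itertools import product


def extract_phrase_pairs(src_tokens, tgt_tokens, alignment, max_len=4):
    """Enumerate phrase pairs consistent with the word alignment
    (tight-span variant of the classic SMT extraction heuristic)."""
    pairs = []
    for s_start, s_end in product(range(len(src_tokens)), repeat=2):
        if not s_start <= s_end < s_start + max_len:
            continue
        # Target positions aligned to the candidate source span.
        aligned = [j for i, j in alignment if s_start <= i <= s_end]
        if not aligned:
            continue
        t_start, t_end = min(aligned), max(aligned)
        if t_end - t_start >= max_len:
            continue
        # Consistency: no alignment point may cross the span boundary.
        if any(t_start <= j <= t_end and not s_start <= i <= s_end
               for i, j in alignment):
            continue
        pairs.append((" ".join(src_tokens[s_start:s_end + 1]),
                      " ".join(tgt_tokens[t_start:t_end + 1])))
    return pairs


def build_phrase_table(correct_examples):
    """Count phrase pairs over the sentence pairs the NMT model got right."""
    table = Counter()
    for src_tokens, tgt_tokens, alignment in correct_examples:
        table.update(extract_phrase_pairs(src_tokens, tgt_tokens, alignment))
    return table


# Toy usage: one correctly predicted German-English pair and its alignment.
examples = [(["das", "haus"], ["the", "house"], {(0, 0), (1, 1)})]
print(build_phrase_table(examples).most_common())
# [(('das', 'the'), 1), (('das haus', 'the house'), 1), (('haus', 'house'), 1)]
```
In practice the extracted counts would be aggregated over the whole training set and compared across language pairs and random seeds, as the paper does.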
Related papers
- MT-PATCHER: Selective and Extendable Knowledge Distillation from Large Language Models for Machine Translation [61.65537912700187]
Large Language Models (LLMs) have demonstrated strong ability in the field of machine translation (MT).
We propose a framework called MT-Patcher, which transfers knowledge from LLMs to existing MT models in a selective, comprehensive and proactive manner.
arXiv Detail & Related papers (2024-03-14T16:07:39Z)
- Unified Model Learning for Various Neural Machine Translation [63.320005222549646]
Existing neural machine translation (NMT) studies mainly focus on developing dataset-specific models.
We propose a "versatile" model, i.e., Unified Model Learning for NMT (UMLNMT), that works with data from different tasks.
UMLNMT achieves substantial improvements over dataset-specific models with significantly reduced model deployment costs.
arXiv Detail & Related papers (2023-05-04T12:21:52Z)
- Dict-NMT: Bilingual Dictionary based NMT for Extremely Low Resource Languages [1.8787713898828164]
We present a detailed analysis of the effects of the quality of dictionaries, training dataset size, language family, etc., on the translation quality.
Results on multiple low-resource test languages show a clear advantage of our bilingual dictionary-based method over the baselines.
arXiv Detail & Related papers (2022-06-09T12:03:29Z)
- Language Modeling, Lexical Translation, Reordering: The Training Process of NMT through the Lens of Classical SMT [64.1841519527504]
Neural machine translation uses a single neural network to model the entire translation process.
Despite neural machine translation being the de facto standard, it is still not clear how NMT models acquire different competences over the course of training.
arXiv Detail & Related papers (2021-09-03T09:38:50Z)
- Better Neural Machine Translation by Extracting Linguistic Information from BERT [4.353029347463806]
Adding linguistic information to neural machine translation (NMT) has mostly focused on using point estimates from pre-trained models.
We augment NMT by extracting dense fine-tuned vector-based linguistic information from BERT instead of using point estimates.
arXiv Detail & Related papers (2021-04-07T00:03:51Z)
- Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation [127.81351683335143]
Cross-lingual pretraining requires models to align the lexical- and high-level representations of the two languages.
Previous research has shown that this is because the representations are not sufficiently aligned.
In this paper, we enhance the bilingual masked language model pretraining with lexical-level information by using type-level cross-lingual subword embeddings.
arXiv Detail & Related papers (2021-03-18T21:17:58Z)
- Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information [72.2412707779571]
mRASP is an approach to pre-train a universal multilingual neural machine translation model.
We carry out experiments on 42 translation directions across diverse settings, including low-, medium-, and rich-resource scenarios, as well as transferring to exotic language pairs.
arXiv Detail & Related papers (2020-10-07T03:57:54Z)
- Multi-task Learning for Multilingual Neural Machine Translation [32.81785430242313]
We propose a multi-task learning framework that jointly trains the model with the translation task on bitext data and two denoising tasks on the monolingual data.
We show that the proposed approach can effectively improve the translation quality for both high-resource and low-resource languages.
arXiv Detail & Related papers (2020-10-06T06:54:12Z)
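For the multi-task entry above, the mechanism amounts to a weighted combination of a translation objective on bitext with denoising objectives on monolingual text. The sketch below is an illustration only, not the paper's code: the tensors are random placeholders standing in for decoder outputs and reference token ids, and the weights `lambda_src` / `lambda_tgt` are invented for the example.
```python
import torch
import torch.nn.functional as F

vocab_size, batch, seq_len = 1000, 8, 20


def token_loss(logits, targets):
    """Token-level cross-entropy over a batch of sequences."""
    return F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))


# Placeholders standing in for shared encoder-decoder outputs and references.
translation_logits = torch.randn(batch, seq_len, vocab_size, requires_grad=True)
src_denoise_logits = torch.randn(batch, seq_len, vocab_size, requires_grad=True)
tgt_denoise_logits = torch.randn(batch, seq_len, vocab_size, requires_grad=True)
bitext_refs = torch.randint(0, vocab_size, (batch, seq_len))
src_mono_refs = torch.randint(0, vocab_size, (batch, seq_len))
tgt_mono_refs = torch.randint(0, vocab_size, (batch, seq_len))

# Translation loss on bitext plus two denoising losses on monolingual data.
lambda_src, lambda_tgt = 0.5, 0.5  # illustrative weights, not from the paper
total_loss = (token_loss(translation_logits, bitext_refs)
              + lambda_src * token_loss(src_denoise_logits, src_mono_refs)
              + lambda_tgt * token_loss(tgt_denoise_logits, tgt_mono_refs))
total_loss.backward()  # in real training, gradients flow into the shared model
```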
This list is automatically generated from the titles and abstracts of the papers on this site.