Data Selection Curriculum for Neural Machine Translation
- URL: http://arxiv.org/abs/2203.13867v1
- Date: Fri, 25 Mar 2022 19:08:30 GMT
- Title: Data Selection Curriculum for Neural Machine Translation
- Authors: Tasnim Mohiuddin, Philipp Koehn, Vishrav Chaudhary, James Cross,
Shruti Bhosale, and Shafiq Joty
- Abstract summary: We introduce a two-stage curriculum training framework for NMT models.
We fine-tune a base NMT model on subsets of data, selected by both deterministic scoring using pre-trained methods and online scoring.
We have shown that our curriculum strategies consistently demonstrate better quality (up to +2.2 BLEU improvement) and faster convergence.
- Score: 31.55953464971441
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural Machine Translation (NMT) models are typically trained on
heterogeneous data that are concatenated and randomly shuffled. However, not
all of the training data are equally useful to the model. Curriculum training
aims to present the data to the NMT models in a meaningful order. In this work,
we introduce a two-stage curriculum training framework for NMT where we
fine-tune a base NMT model on subsets of data, selected by both deterministic
scoring using pre-trained methods and online scoring that considers prediction
scores of the emerging NMT model. Through comprehensive experiments on six
language pairs comprising low- and high-resource languages from WMT'21, we have
shown that our curriculum strategies consistently demonstrate better quality
(up to +2.2 BLEU improvement) and faster convergence (approximately 50% fewer
updates).
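The two-stage selection described in the abstract can be sketched schematically: a one-off deterministic filtering pass with a pre-trained scorer, followed by per-epoch online re-selection driven by the emerging model's own prediction scores. The sketch below is a simplified illustration, not the authors' implementation; `offline_score` and `online_score` are hypothetical stand-ins for the paper's pre-trained and model-based scoring functions.

```python
def select_top_fraction(pairs, score_fn, fraction):
    """Keep the highest-scoring fraction of sentence pairs."""
    ranked = sorted(pairs, key=score_fn, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return ranked[:k]

def two_stage_curriculum(pairs, offline_score, online_score, epochs=3,
                         offline_frac=0.5, online_frac=0.5):
    """Stage 1: one-off deterministic filtering with a pre-trained scorer.
    Stage 2: each epoch, re-select a subset using the current model's scores
    and fine-tune on it (fine-tuning itself is elided here)."""
    stage1 = select_top_fraction(pairs, offline_score, offline_frac)
    selected_per_epoch = []
    for _ in range(epochs):
        # In the paper, the online score would come from the emerging NMT
        # model's per-sentence prediction scores; here it is a stand-in.
        subset = select_top_fraction(stage1, online_score, online_frac)
        selected_per_epoch.append(subset)
        # ... fine-tune the base model on `subset` here ...
    return stage1, selected_per_epoch
```

Because stage 2 re-scores with the evolving model, the selected subset can change between epochs, which is what distinguishes online scoring from the deterministic stage-1 filter.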
Related papers
- Towards Zero-Shot Multimodal Machine Translation [64.9141931372384]
We propose a method to bypass the need for fully supervised data to train multimodal machine translation systems.
Our method, called ZeroMMT, consists in adapting a strong text-only machine translation (MT) model by training it on a mixture of two objectives.
To prove that our method generalizes to languages with no fully supervised training data available, we extend the CoMMuTE evaluation dataset to three new languages: Arabic, Russian and Chinese.
arXiv Detail & Related papers (2024-07-18T15:20:31Z)
- Unified Model Learning for Various Neural Machine Translation [63.320005222549646]
Existing neural machine translation (NMT) studies mainly focus on developing dataset-specific models.
We propose a "versatile" model, i.e., the Unified Model Learning for NMT (UMLNMT), that works with data from different tasks.
UMLNMT yields substantial improvements over dataset-specific models with significantly reduced model deployment costs.
arXiv Detail & Related papers (2023-05-04T12:21:52Z)
- Better Datastore, Better Translation: Generating Datastores from Pre-Trained Models for Nearest Neighbor Machine Translation [48.58899349349702]
Nearest Neighbor Machine Translation (kNN-MT) is a simple and effective method of augmenting neural machine translation (NMT) with a token-level nearest-neighbor retrieval mechanism.
In this paper, we propose PRED, a framework that leverages Pre-trained models for Datastores in kNN-MT.
arXiv Detail & Related papers (2022-12-17T08:34:20Z)
- Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation [50.54059385277964]
We present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT).
CsaNMT augments each training instance with an adjacency region that covers adequate variants of the literal expression under the same meaning.
arXiv Detail & Related papers (2022-04-14T08:16:28Z) - Confidence Based Bidirectional Global Context Aware Training Framework
for Neural Machine Translation [74.99653288574892]
We propose a Confidence Based Bidirectional Global Context Aware (CBBGCA) training framework for neural machine translation (NMT)
Our proposed CBBGCA training framework significantly improves the NMT model by +1.02, +1.30 and +0.57 BLEU scores on three large-scale translation datasets.
arXiv Detail & Related papers (2022-02-28T10:24:22Z)
- End-to-End Training for Back-Translation with Categorical Reparameterization Trick [0.0]
Back-translation is an effective semi-supervised learning framework in neural machine translation (NMT).
A pre-trained NMT model translates monolingual sentences into synthetic bilingual sentence pairs for training the other NMT model.
The discrete property of translated sentences prevents gradient information from flowing between the two NMT models.
arXiv Detail & Related papers (2022-02-17T06:31:03Z)
- Improving Neural Machine Translation by Denoising Training [95.96569884410137]
We present a simple and effective pre-training strategy, Denoising Training (DoT), for neural machine translation.
We update the model parameters with source- and target-side denoising tasks at the early stage and then tune the model normally.
Experiments show DoT consistently improves the neural machine translation performance across 12 bilingual and 16 multilingual directions.
arXiv Detail & Related papers (2022-01-19T00:11:38Z)
- Multi-task Learning for Multilingual Neural Machine Translation [32.81785430242313]
We propose a multi-task learning framework that jointly trains the model with the translation task on bitext data and two denoising tasks on the monolingual data.
We show that the proposed approach can effectively improve the translation quality for both high-resource and low-resource languages.
arXiv Detail & Related papers (2020-10-06T06:54:12Z)
- Reinforced Curriculum Learning on Pre-trained Neural Machine Translation Models [20.976165305749777]
We learn a curriculum for improving a pre-trained NMT model by re-selecting influential data samples from the original training set.
We propose a data selection framework based on Deterministic Actor-Critic, in which a critic network predicts the expected change of model performance.
arXiv Detail & Related papers (2020-04-13T03:40:44Z)
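The actor-critic data selection in the last entry above can be illustrated with a toy NumPy sketch: a critic estimates the performance gain from training on a sample, an actor greedily re-selects the samples with the highest predicted gain, and the critic is regressed toward the gain actually observed after fine-tuning. This is a simplified illustration only; `critic_predict`, `reselect_batch`, and `update_critic` are hypothetical names, and the actual method trains a Deterministic Actor-Critic with learned networks rather than a linear model and a greedy policy.

```python
import numpy as np

def critic_predict(weights, features):
    """Toy critic: linear estimate of the performance change (e.g., BLEU
    delta) from training on a sample with the given feature vector."""
    return features @ weights

def reselect_batch(weights, features, batch_size):
    """Toy actor: greedily pick the samples whose predicted gain is
    highest (a stand-in for the learned selection policy)."""
    gains = critic_predict(weights, features)
    return np.argsort(gains)[::-1][:batch_size]

def update_critic(weights, features, observed_gain, lr=0.1):
    """One regression step: move the critic's estimate toward the gain
    actually observed after fine-tuning on the sample."""
    pred = critic_predict(weights, features)
    return weights + lr * (observed_gain - pred) * features
```

Iterating re-selection and critic updates captures the core loop: the critic's improving gain estimates steer which training samples are re-selected from the original set.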
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.