Semi-supervised Neural Machine Translation with Consistency
Regularization for Low-Resource Languages
- URL: http://arxiv.org/abs/2304.00557v1
- Date: Sun, 2 Apr 2023 15:24:08 GMT
- Title: Semi-supervised Neural Machine Translation with Consistency
Regularization for Low-Resource Languages
- Authors: Viet H. Pham, Thang M. Pham, Giang Nguyen, Long Nguyen, Dien Dinh
- Abstract summary: This paper presents a simple yet effective method to tackle the data-scarcity problem for low-resource languages by augmenting high-quality sentence pairs and training NMT models in a semi-supervised manner.
Specifically, our approach combines the cross-entropy loss for supervised learning with a KL-divergence loss for unsupervised learning on pseudo and augmented target sentences.
Experimental results show that our approach significantly improves NMT baselines, especially on low-resource datasets, by 0.46--2.03 BLEU points.
- Score: 3.475371300689165
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The advent of deep learning has led to a significant gain in machine
translation. However, most studies require large parallel datasets, which are
scarce, expensive to construct, and for some languages simply unavailable. This
paper presents a simple yet effective method to tackle this
problem for low-resource languages by augmenting high-quality sentence pairs
and training NMT models in a semi-supervised manner. Specifically, our approach
combines the cross-entropy loss for supervised learning with a KL-divergence
loss for unsupervised learning on pseudo and augmented target sentences derived
from the model. We also introduce a SentenceBERT-based filter that enhances the
quality of the augmented data by retaining only semantically similar sentence pairs.
Experimental results show that our approach significantly improves NMT
baselines, especially on low-resource datasets, by 0.46--2.03 BLEU points. We
also demonstrate that using unsupervised training for augmented data is more
efficient than reusing the ground-truth target sentences for supervised
learning.
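
A minimal sketch of the training objective described above, assuming a standard
encoder-decoder model in PyTorch: a cross-entropy term on the parallel pair is
combined with a KL-divergence consistency term that pulls the prediction for an
augmented input towards the model's own (pseudo) prediction. The function
signature, the source-side augmentation, and the lambda_kl weight are
illustrative assumptions, not the authors' implementation.

import torch
import torch.nn.functional as F

def semi_supervised_loss(model, src, tgt, src_aug, pad_id, lambda_kl=1.0):
    # model(src, tgt_in) is assumed to return decoder logits of shape (B, T, V).
    tgt_in, tgt_out = tgt[:, :-1], tgt[:, 1:]

    # Supervised term: standard cross-entropy on the ground-truth parallel pair.
    logits_sup = model(src, tgt_in)
    ce = F.cross_entropy(
        logits_sup.reshape(-1, logits_sup.size(-1)),
        tgt_out.reshape(-1),
        ignore_index=pad_id,
    )

    # Unsupervised term: the model's own prediction on the original input serves
    # as a pseudo distribution; the prediction on the augmented input is pulled
    # towards it with KL divergence (consistency regularization).
    with torch.no_grad():
        pseudo = F.softmax(model(src, tgt_in), dim=-1)
    logits_aug = model(src_aug, tgt_in)
    kl = F.kl_div(F.log_softmax(logits_aug, dim=-1), pseudo, reduction="batchmean")

    return ce + lambda_kl * kl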
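
The SentenceBERT-based filter can be sketched with the sentence-transformers
library: embed both sides of each augmented pair and keep only the pairs whose
cosine similarity clears a threshold. The checkpoint name and the 0.8 threshold
below are assumptions for illustration; the abstract does not specify them.

from sentence_transformers import SentenceTransformer, util

# A multilingual SentenceBERT checkpoint (the exact choice is an assumption).
sbert = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

def filter_augmented_pairs(pairs, threshold=0.8):
    # Keep augmented (source, target) pairs that are semantically similar.
    sources = [src for src, _ in pairs]
    targets = [tgt for _, tgt in pairs]
    emb_src = sbert.encode(sources, convert_to_tensor=True)
    emb_tgt = sbert.encode(targets, convert_to_tensor=True)
    # Pairwise cosine similarities; only the diagonal (src_i vs tgt_i) is needed.
    sims = util.cos_sim(emb_src, emb_tgt).diagonal()
    return [pair for pair, sim in zip(pairs, sims.tolist()) if sim >= threshold]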
Related papers
- Relevance-guided Neural Machine Translation [5.691028372215281]
We propose an explainability-based training approach for Neural Machine Translation (NMT).
Our results show that our method can be promising, particularly when training in low-resource conditions.
arXiv Detail & Related papers (2023-11-30T21:52:02Z) - Boosting Unsupervised Machine Translation with Pseudo-Parallel Data [2.900810893770134]
We propose a training strategy that relies on pseudo-parallel sentence pairs mined from monolingual corpora and synthetic sentence pairs back-translated from monolingual corpora.
We reach an improvement of up to 14.5 BLEU points (English to Ukrainian) over a baseline trained on back-translated data only.
arXiv Detail & Related papers (2023-10-22T10:57:12Z) - Learning to Generalize to More: Continuous Semantic Augmentation for
Neural Machine Translation [50.54059385277964]
We present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT).
CsaNMT augments each training instance with an adjacency region that could cover adequate variants of literal expression under the same meaning.
arXiv Detail & Related papers (2022-04-14T08:16:28Z) - Improving Multilingual Translation by Representation and Gradient
Regularization [82.42760103045083]
We propose a joint approach to regularize NMT models at both representation-level and gradient-level.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
arXiv Detail & Related papers (2021-09-10T10:52:21Z) - Distributionally Robust Multilingual Machine Translation [94.51866646879337]
We propose a new learning objective for multilingual neural machine translation (MNMT) based on distributionally robust optimization.
We show how to practically optimize this objective for large translation corpora using an iterated best response scheme.
Our method consistently outperforms strong baseline methods in terms of average and per-language performance under both many-to-one and one-to-many translation settings.
arXiv Detail & Related papers (2021-09-09T03:48:35Z) - Exploring Unsupervised Pretraining Objectives for Machine Translation [99.5441395624651]
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT).
Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder.
We compare masking with alternative objectives that produce inputs resembling real (full) sentences, by reordering and replacing words based on their context.
arXiv Detail & Related papers (2021-06-10T10:18:23Z) - Self-Training Sampling with Monolingual Data Uncertainty for Neural
Machine Translation [98.83925811122795]
We propose to improve the sampling procedure by selecting the most informative monolingual sentences to complement the parallel data.
We compute the uncertainty of monolingual sentences using the bilingual dictionary extracted from the parallel data.
Experimental results on large-scale WMT English⇒German and English⇒Chinese datasets demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2021-06-02T05:01:36Z) - Rejuvenating Low-Frequency Words: Making the Most of Parallel Data in
Non-Autoregressive Translation [98.11249019844281]
Knowledge distillation (KD) is commonly used to construct synthetic data for training non-autoregressive translation (NAT) models.
We propose reverse KD to rejuvenate more alignments for low-frequency target words.
Results demonstrate that the proposed approach can significantly and universally improve translation quality.
arXiv Detail & Related papers (2021-06-02T02:41:40Z) - Fine-tuning BERT for Low-Resource Natural Language Understanding via
Active Learning [30.5853328612593]
In this work, we explore fine-tuning methods of BERT -- a pre-trained Transformer-based language model.
Our experimental results show an advantage in model performance by maximizing the approximate knowledge gain of the model.
We analyze the benefits of freezing layers of the language model during fine-tuning to reduce the number of trainable parameters.
arXiv Detail & Related papers (2020-12-04T08:34:39Z) - Unsupervised Neural Machine Translation for Low-Resource Domains via
Meta-Learning [27.86606560170401]
We present a novel meta-learning algorithm for unsupervised neural machine translation (UNMT).
We train the model to adapt to another domain by utilizing only a small amount of training data.
Our model surpasses a transfer learning-based approach by up to 2--4 BLEU points.
arXiv Detail & Related papers (2020-10-18T17:54:13Z)