Self-Paced Learning for Neural Machine Translation
- URL: http://arxiv.org/abs/2010.04505v2
- Date: Tue, 13 Oct 2020 09:02:09 GMT
- Title: Self-Paced Learning for Neural Machine Translation
- Authors: Yu Wan, Baosong Yang, Derek F. Wong, Yikai Zhou, Lidia S. Chao, Haibo Zhang, Boxing Chen
- Abstract summary: We propose self-paced learning for neural machine translation (NMT) training.
We show that the proposed model yields better performance than strong baselines.
- Score: 55.41314278859938
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies have proven that the training of neural machine translation
(NMT) can be facilitated by mimicking the learning process of humans.
Nevertheless, the gains of such curriculum learning depend on the quality of an
artificial schedule drawn up from handcrafted features, e.g. sentence length or
word rarity. We make this procedure more flexible by proposing self-paced
learning, in which the NMT model is allowed to 1) automatically quantify its
learning confidence over training examples; and 2) flexibly govern its learning
by regulating the loss at each iteration step.
Experimental results over multiple translation tasks demonstrate that the
proposed model yields better performance than strong baselines and those models
trained with human-designed curricula on both translation quality and
convergence speed.
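The core idea of the abstract can be sketched as a confidence-weighted training loss. Below is a minimal, illustrative PyTorch-style sketch, not the authors' implementation: the `model(src, tgt)` signature and the `confidence()` helper are assumptions, with confidence approximated here by the reference likelihood the model assigns under dropout noise.

```python
# Minimal sketch of confidence-weighted (self-paced) NMT training.
# Assumptions (not from the paper's code): `model(src, tgt)` returns per-token
# logits of shape (batch, tgt_len, vocab); `confidence()` is a hypothetical
# stand-in for the model-derived confidence described in the abstract.
import torch
import torch.nn.functional as F

def sentence_loss(logits, target, pad_id):
    # Per-sentence, token-averaged cross-entropy; returns shape (batch,).
    loss = F.cross_entropy(
        logits.transpose(1, 2), target, ignore_index=pad_id, reduction="none"
    )
    mask = (target != pad_id).float()
    return (loss * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)

def confidence(model, src, tgt, pad_id, n_samples=3):
    # Hypothetical confidence: average reference likelihood under dropout noise.
    model.train()  # keep dropout active so repeated passes differ
    with torch.no_grad():
        probs = [torch.exp(-sentence_loss(model(src, tgt), tgt, pad_id))
                 for _ in range(n_samples)]
    return torch.stack(probs).mean(dim=0)  # (batch,), values in (0, 1]

def self_paced_step(model, optimizer, src, tgt, pad_id):
    # Regulate the loss each step: confident examples contribute more.
    w = confidence(model, src, tgt, pad_id)
    loss = (w * sentence_loss(model(src, tgt), tgt, pad_id)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```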
Related papers
- TasTe: Teaching Large Language Models to Translate through Self-Reflection [82.83958470745381]
Large language models (LLMs) have exhibited remarkable performance in various natural language processing tasks.
We propose the TasTe framework, which stands for translating through self-reflection.
The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods.
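The "translate through self-reflection" idea can be illustrated with a small draft-then-refine prompting loop. The `llm` callable and the prompt wording below are placeholders for illustration only, not the prompts used in TasTe.

```python
# Illustrative self-reflection loop for LLM translation.
# `llm` is any text-in/text-out callable; prompts are hypothetical.
def self_reflective_translate(llm, src_text, src_lang="English", tgt_lang="German"):
    draft = llm(f"Translate the following {src_lang} text into {tgt_lang}:\n{src_text}")
    critique = llm(
        f"Source ({src_lang}): {src_text}\n"
        f"Draft translation ({tgt_lang}): {draft}\n"
        "List any adequacy or fluency problems in the draft."
    )
    final = llm(
        f"Source ({src_lang}): {src_text}\n"
        f"Draft: {draft}\nIssues: {critique}\n"
        f"Produce an improved {tgt_lang} translation."
    )
    return final
```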
arXiv Detail & Related papers (2024-06-12T17:21:21Z)
- Active Learning for Neural Machine Translation [0.0]
We incorporate a technique known as Active Learning into the NMT toolkit Joey NMT to reach sufficient accuracy and robust predictions for low-resource language translation.
This work uses transformer-based NMT systems for translating English to Hindi: a baseline model (BM), a fully trained model (FTM), an active learning least-confidence-based model (ALLCM), and an active learning margin-sampling-based model (ALMSM); a toy sketch of these two acquisition criteria follows.
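For context, the two acquisition criteria named above are standard: least confidence selects sentences whose best hypothesis the model is least sure about, while margin sampling selects sentences where the top two hypotheses are closest in probability. The sketch below is generic, assuming a hypothetical `hypothesis_probs(sent)` helper that returns the model's n-best hypothesis probabilities (with at least two entries).

```python
# Toy active-learning selection; `hypothesis_probs` is a hypothetical helper
# returning the model's probabilities for its n-best translations of `sent`.
def least_confidence(hypothesis_probs, sent):
    probs = sorted(hypothesis_probs(sent), reverse=True)
    return 1.0 - probs[0]          # higher score = more uncertain

def margin_sampling(hypothesis_probs, sent):
    probs = sorted(hypothesis_probs(sent), reverse=True)
    return probs[0] - probs[1]     # smaller margin = more uncertain

def select_batch(pool, hypothesis_probs, k=100, criterion="least_confidence"):
    if criterion == "least_confidence":
        ranked = sorted(pool, key=lambda s: least_confidence(hypothesis_probs, s),
                        reverse=True)
    else:  # margin sampling: smallest margins first
        ranked = sorted(pool, key=lambda s: margin_sampling(hypothesis_probs, s))
    return ranked[:k]   # the k sentences to send for human annotation
```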
arXiv Detail & Related papers (2022-12-30T17:04:01Z)
- Non-Parametric Online Learning from Human Feedback for Neural Machine Translation [54.96594148572804]
We study the problem of online learning with human feedback in human-in-the-loop machine translation.
Previous methods require online model updating or additional translation memory networks to achieve high-quality performance.
We propose a novel non-parametric online learning method without changing the model structure.
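The summary does not spell out the mechanism; one widely used non-parametric ingredient in this line of work is a token-level cache of (decoder hidden state, corrected token) pairs queried by nearest neighbour at decoding time, which leaves the model parameters untouched. The sketch below shows only that generic idea, with hypothetical shapes and helpers, and should not be read as the paper's exact method.

```python
# Generic token-level non-parametric cache (illustrative only).
# Human-corrected translations are stored as (hidden_state, next_token) pairs
# and retrieved by nearest neighbour at decoding time; NMT weights never change.
import torch

class FeedbackCache:
    def __init__(self):
        self.keys, self.values = [], []            # hidden states / token ids

    def add(self, hidden_states, token_ids):
        self.keys.append(hidden_states)            # (len, dim) per sentence
        self.values.append(token_ids)              # (len,)

    def query(self, h, k=4, temperature=10.0):
        # h: (dim,) decoder state at the current step.
        if not self.keys:
            return None
        keys = torch.cat(self.keys)                          # (N, dim)
        vals = torch.cat(self.values)                        # (N,)
        dists = torch.cdist(h.unsqueeze(0), keys).squeeze(0) # (N,)
        topk = torch.topk(-dists, k=min(k, len(vals)))
        weights = torch.softmax(topk.values / temperature, dim=0)
        return vals[topk.indices], weights   # candidate tokens and their weights
```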
arXiv Detail & Related papers (2021-09-23T04:26:15Z)
- Language Modeling, Lexical Translation, Reordering: The Training Process of NMT through the Lens of Classical SMT [64.1841519527504]
Neural machine translation uses a single neural network to model the entire translation process.
Despite neural machine translation being the de facto standard, it is still not clear how NMT models acquire different competences over the course of training.
arXiv Detail & Related papers (2021-09-03T09:38:50Z)
- Self-supervised and Supervised Joint Training for Resource-rich Machine Translation [30.502625878505732]
Self-supervised pre-training of text representations has been successfully applied to low-resource Neural Machine Translation (NMT).
We propose a joint training approach, $F$-XEnDec, to combine self-supervised and supervised learning to optimize NMT models.
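The summary only states that the two signals are combined; a generic way to express such joint training is a weighted sum of a supervised translation loss and a self-supervised (e.g. denoising) loss, as in the hedged sketch below. The mixing weight and the denoising objective are illustrative assumptions, not the paper's $F$-XEnDec construction.

```python
# Generic joint objective: supervised cross-entropy on parallel data plus a
# self-supervised denoising loss on monolingual data (illustrative only).
def joint_training_step(model, optimizer, parallel_batch, mono_batch,
                        translation_loss, denoising_loss, ssl_weight=1.0):
    src, tgt = parallel_batch
    loss_sup = translation_loss(model, src, tgt)     # standard NMT loss
    noisy, clean = mono_batch
    loss_ssl = denoising_loss(model, noisy, clean)   # reconstruct clean text
    loss = loss_sup + ssl_weight * loss_ssl          # assumed linear mixing
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```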
arXiv Detail & Related papers (2021-06-08T02:35:40Z)
- Self-Guided Curriculum Learning for Neural Machine Translation [25.870500301724128]
We propose a self-guided curriculum strategy to encourage the learning of neural machine translation (NMT) models.
Our approach consistently improves translation performance over a strong Transformer baseline.
arXiv Detail & Related papers (2021-05-10T16:12:14Z)
- Exploring Fine-tuning Techniques for Pre-trained Cross-lingual Models via Continual Learning [74.25168207651376]
Fine-tuning pre-trained language models to downstream cross-lingual tasks has shown promising results.
We leverage continual learning to preserve the cross-lingual ability of the pre-trained model when we fine-tune it to downstream tasks.
Our methods achieve better performance than other fine-tuning baselines on the zero-shot cross-lingual part-of-speech tagging and named entity recognition tasks.
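The summary does not name the specific continual-learning technique; one common regularizer for preserving a pre-trained model's abilities during fine-tuning is an L2 penalty pulling the weights back toward their pre-trained values. The sketch below shows only that generic idea, as an assumption rather than the paper's method.

```python
# Generic "stay close to the pre-trained weights" regularizer, a common
# continual-learning baseline (illustrative assumption, not the paper's method).
import torch

def make_anchor(model):
    # Frozen copy of the pre-trained parameters, taken before fine-tuning.
    return {n: p.detach().clone() for n, p in model.named_parameters()}

def l2_to_anchor(model, anchor, strength=0.01):
    penalty = sum(((p - anchor[n]) ** 2).sum() for n, p in model.named_parameters())
    return strength * penalty

def finetune_step(model, optimizer, task_loss, anchor):
    # task_loss(model) is the downstream objective (e.g. POS tagging or NER).
    loss = task_loss(model) + l2_to_anchor(model, anchor)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```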
arXiv Detail & Related papers (2020-04-29T14:07:18Z)
- Self-Induced Curriculum Learning in Self-Supervised Neural Machine Translation [20.718093208111547]
Self-supervised NMT (SSNMT) learns to identify and select suitable training data from comparable (rather than parallel) corpora.
In this study, we provide an in-depth analysis of the sampling choices the SSNMT model makes during training.
We show that in terms of the Gunning-Fog Readability index, SSNMT starts extracting and learning from Wikipedia data suitable for high school students.
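The Gunning-Fog index mentioned here is a simple readability formula: 0.4 × (average sentence length + 100 × the fraction of "complex" words, i.e. words with three or more syllables), which roughly estimates the years of schooling needed to read the text. A rough implementation is sketched below; the syllable counter is a crude vowel-group heuristic for illustration, not the tooling used in the paper.

```python
# Rough Gunning-Fog index: 0.4 * (words/sentences + 100 * complex_words/words).
# The syllable heuristic is a crude vowel-group count, given only for illustration.
import re

def count_syllables(word):
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def gunning_fog(text):
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    complex_words = [w for w in words if count_syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences) + 100 * len(complex_words) / len(words))

# Example usage: the returned score approximates the years of education needed.
print(round(gunning_fog("The proposed self-paced curriculum regulates the training "
                        "loss according to the confidence of the model."), 1))
```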
arXiv Detail & Related papers (2020-04-07T06:45:45Z)
- Learning to Multi-Task Learn for Better Neural Machine Translation [53.06405021125476]
Multi-task learning is an elegant approach to inject linguistic-related biases into neural machine translation models.
We propose a novel framework for learning the training schedule, i.e., learning to multi-task learn, for the biased-MTL setting of interest.
Experiments show the resulting automatically learned training schedulers are competitive with the best, and lead to up to +1.1 BLEU score improvements.
arXiv Detail & Related papers (2020-01-10T03:12:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.