Effective General-Domain Data Inclusion for the Machine Translation Task
by Vanilla Transformers
- URL: http://arxiv.org/abs/2209.14073v1
- Date: Wed, 28 Sep 2022 13:14:49 GMT
- Title: Effective General-Domain Data Inclusion for the Machine Translation Task
by Vanilla Transformers
- Authors: Hassan Soliman
- Abstract summary: We aim to build a Transformer-based system that translates a source sentence in German into its counterpart target sentence in English.
We perform experiments on the news commentary German-English parallel sentences from the WMT'13 dataset.
We find that including the IWSLT'16 dataset in training helps achieve a gain of 2 BLEU score points on the test set of the WMT'13 dataset.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One of the vital breakthroughs in the history of machine translation is the
development of the Transformer model. Not only is it revolutionary for various
translation tasks, but also for a majority of other NLP tasks. In this paper,
we aim to build a Transformer-based system that translates a source sentence
in German into its counterpart target sentence in English. We perform
experiments on the news commentary German-English parallel sentences from
the WMT'13 dataset. In addition, we investigate the effect of the inclusion of
additional general-domain data in training from the IWSLT'16 dataset to improve
the Transformer model performance. We find that including the IWSLT'16 dataset
in training helps achieve a gain of 2 BLEU score points on the test set of the
WMT'13 dataset. A qualitative analysis is presented to examine how the use of
general-domain data improves the quality of the produced translation
sentences.
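The recipe described in the abstract, training on the in-domain WMT'13 news-commentary pairs together with the general-domain IWSLT'16 pairs and measuring the effect in BLEU on the WMT'13 test set, can be sketched as follows. This is a minimal illustration rather than the authors' code: the file names are assumptions, the training and decoding steps are omitted, and the use of the sacreBLEU library for scoring is an assumption.

```python
# Minimal sketch of the data-inclusion setup described in the abstract.
# File names are assumptions for illustration; the paper's actual pipeline,
# tokenization, and training code may differ.
import random

import sacrebleu  # pip install sacrebleu


def load_parallel(src_path, tgt_path):
    """Load a parallel corpus as a list of (source, target) sentence pairs."""
    with open(src_path, encoding="utf-8") as f_src, \
         open(tgt_path, encoding="utf-8") as f_tgt:
        return [(s.strip(), t.strip()) for s, t in zip(f_src, f_tgt)]


# In-domain data: WMT'13 news-commentary German-English sentence pairs.
news = load_parallel("news-commentary.de", "news-commentary.en")
# General-domain data: IWSLT'16 German-English (TED-talk) sentence pairs.
iwslt = load_parallel("iwslt16.de", "iwslt16.en")

# "Including the IWSLT'16 dataset in training" amounts to training the
# vanilla Transformer on the concatenation of both corpora instead of the
# news-commentary data alone.
train_pairs = news + iwslt
random.shuffle(train_pairs)
# ... train a vanilla Transformer on train_pairs (omitted) ...

# Evaluation on the WMT'13 test set: corpus-level BLEU, where a gain of
# about 2 BLEU points is reported when IWSLT'16 is included in training.
test_pairs = load_parallel("newstest.de", "newstest.en")
test_src = [s for s, _ in test_pairs]
test_ref = [t for _, t in test_pairs]
# hypotheses = translate(model, test_src)          # decoding step, omitted
# bleu = sacrebleu.corpus_bleu(hypotheses, [test_ref])
# print(f"BLEU = {bleu.score:.2f}")
```

Note that sacreBLEU's corpus_bleu expects detokenized hypotheses and a list of reference streams, which is why the single reference list is wrapped in an outer list.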
Related papers
- Unified Model Learning for Various Neural Machine Translation [63.320005222549646]
Existing neural machine translation (NMT) studies mainly focus on developing dataset-specific models.
We propose a "versatile" model, i.e., Unified Model Learning for NMT (UMLNMT), which works with data from different tasks.
Our UMLNMT results in substantial improvements over dataset-specific models with significantly reduced model deployment costs.
arXiv Detail & Related papers (2023-05-04T12:21:52Z)
- Improved Data Augmentation for Translation Suggestion [28.672227843541656]
This paper introduces the system used in our submission to the WMT'22 Translation Suggestion shared task.
We use three strategies to construct synthetic data from parallel corpora to compensate for the lack of supervised data.
We rank second and third on the English-German and English-Chinese bidirectional tasks, respectively.
arXiv Detail & Related papers (2022-10-12T12:46:43Z)
- Domain-Specific Text Generation for Machine Translation [7.803471587734353]
We propose a novel approach to domain adaptation leveraging state-of-the-art pretrained language models (LMs) for domain-specific data augmentation.
We employ mixed fine-tuning to train models that significantly improve translation of in-domain texts.
arXiv Detail & Related papers (2022-08-11T16:22:16Z)
- Non-Parametric Domain Adaptation for End-to-End Speech Translation [72.37869362559212]
End-to-End Speech Translation (E2E-ST) has received increasing attention due to its potential for less error propagation, lower latency, and fewer parameters.
We propose a novel non-parametric method that leverages domain-specific text translation corpus to achieve domain adaptation for the E2E-ST system.
arXiv Detail & Related papers (2022-05-23T11:41:02Z)
- Improving Neural Machine Translation by Bidirectional Training [85.64797317290349]
We present a simple and effective pretraining strategy -- bidirectional training (BiT) for neural machine translation.
Specifically, we bidirectionally update the model parameters at the early stage and then tune the model normally.
Experimental results show that BiT significantly improves state-of-the-art neural machine translation performance across 15 translation tasks on 8 language pairs.
arXiv Detail & Related papers (2021-09-16T07:58:33Z)
- Netmarble AI Center's WMT21 Automatic Post-Editing Shared Task Submission [6.043109546012043]
This paper describes Netmarble's submission to WMT21 Automatic Post-Editing (APE) Shared Task for the English-German language pair.
Facebook FAIR's WMT19 news translation model was chosen to take advantage of large and powerful pre-trained neural networks.
For better performance, we leverage external translations as augmented machine translation (MT) during post-training and fine-tuning.
arXiv Detail & Related papers (2021-09-14T08:21:18Z)
- Meta Back-translation [111.87397401837286]
We propose a novel method to generate pseudo-parallel data from a pre-trained back-translation model.
Our method is a meta-learning algorithm which adapts a pre-trained back-translation model so that the pseudo-parallel data it generates would train a forward-translation model to do well on a validation set.
arXiv Detail & Related papers (2021-02-15T20:58:32Z)
- Explicit Reordering for Neural Machine Translation [50.70683739103066]
In Transformer-based neural machine translation (NMT), the positional encoding mechanism helps the self-attention networks to learn the source representation with order dependency.
We propose a novel reordering method to explicitly model this reordering information for the Transformer-based NMT.
The empirical results on the WMT14 English-to-German, WAT ASPEC Japanese-to-English, and WMT17 Chinese-to-English translation tasks show the effectiveness of the proposed approach.
arXiv Detail & Related papers (2020-04-08T05:28:46Z)
- A Simple Baseline to Semi-Supervised Domain Adaptation for Machine Translation [73.3550140511458]
State-of-the-art neural machine translation (NMT) systems are data-hungry and perform poorly on new domains with no supervised data.
We propose a simple but effective approach to the semi-supervised domain adaptation scenario of NMT.
This approach iteratively trains a Transformer-based NMT model via three training objectives: language modeling, back-translation, and supervised translation.
arXiv Detail & Related papers (2020-01-22T16:42:06Z)
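The last entry above describes iterating over three training objectives (language modeling, back-translation, and supervised translation). The scheduling idea can be sketched as below; the step functions are placeholders rather than the paper's actual losses, and the particular interleaving is an assumption for illustration.

```python
# Schematic of cycling through the three training objectives named in
# "A Simple Baseline to Semi-Supervised Domain Adaptation for Machine
# Translation". All three step functions are placeholders; a real system
# would compute the corresponding losses with a Transformer NMT model.

def language_modeling_step(model, in_domain_batch):
    """Placeholder: update the model on a language-modeling loss over
    in-domain monolingual text."""


def back_translation_step(model, in_domain_batch):
    """Placeholder: back-translate in-domain target-side text into the source
    language and update the model on the resulting synthetic parallel pairs."""


def supervised_step(model, parallel_batch):
    """Placeholder: update the model on genuine parallel data."""


def train(model, mono_batches, parallel_batches, num_iterations):
    # Each iteration touches all three objectives; the exact ordering and
    # weighting here are illustrative assumptions, not the paper's recipe.
    for step in range(num_iterations):
        language_modeling_step(model, mono_batches[step % len(mono_batches)])
        back_translation_step(model, mono_batches[step % len(mono_batches)])
        supervised_step(model, parallel_batches[step % len(parallel_batches)])
```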