Learning to Generalize to More: Continuous Semantic Augmentation for
Neural Machine Translation
- URL: http://arxiv.org/abs/2204.06812v1
- Date: Thu, 14 Apr 2022 08:16:28 GMT
- Title: Learning to Generalize to More: Continuous Semantic Augmentation for
Neural Machine Translation
- Authors: Xiangpeng Wei, Heng Yu, Yue Hu, Rongxiang Weng, Weihua Luo, Jun Xie,
Rong Jin
- Abstract summary: We present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT).
CsaNMT augments each training instance with an adjacency region that could cover adequate variants of literal expression under the same meaning.
- Score: 50.54059385277964
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The principal task in supervised neural machine translation (NMT) is to learn
to generate target sentences conditioned on the source inputs from a set of
parallel sentence pairs, and thus produce a model capable of generalizing to
unseen instances. However, it is commonly observed that the generalization
performance of the model is highly influenced by the amount of parallel data
used in training. Although data augmentation is widely used to enrich the
training data, conventional methods with discrete manipulations fail to
generate diverse and faithful training samples. In this paper, we present a
novel data augmentation paradigm termed Continuous Semantic Augmentation
(CsaNMT), which augments each training instance with an adjacency semantic
region that could cover adequate variants of literal expression under the same
meaning. We conduct extensive experiments on both rich-resource and
low-resource settings involving various language pairs, including WMT14
English-{German,French}, NIST Chinese-English and multiple low-resource IWSLT
translation tasks. The provided empirical evidence shows that CsaNMT sets a new
level of performance among existing augmentation techniques, improving on the
state of the art by a large margin. The core code is contained in Appendix E.
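As a rough illustration of the augmentation idea described in the abstract, the sketch below draws vectors from a region spanned by the sentence embeddings of a parallel pair and perturbs them within a bounded radius; each sampled vector can then condition an additional training pass of the NMT model. The interpolation-plus-perturbation sampling and the names sample_adjacent_semantics and radius are illustrative assumptions, not the paper's released implementation (see Appendix E of the paper for that).

```python
# Minimal sketch of continuous semantic augmentation, assuming sentence-level
# embeddings of a parallel pair are already available. Names and the sampling
# scheme are hypothetical stand-ins for the paper's adjacency-region sampling.
import torch


def sample_adjacent_semantics(src_vec: torch.Tensor,
                              tgt_vec: torch.Tensor,
                              radius: float = 0.1,
                              n_samples: int = 4) -> torch.Tensor:
    """Draw vectors from a region spanned by two semantically equivalent sentences.

    src_vec, tgt_vec: (batch, dim) sentence embeddings of a parallel pair.
    Returns: (n_samples, batch, dim) augmented semantic vectors.
    """
    samples = []
    for _ in range(n_samples):
        # Interpolate between the source and target sentence embeddings ...
        lam = torch.rand(src_vec.size(0), 1, device=src_vec.device)
        center = lam * src_vec + (1.0 - lam) * tgt_vec
        # ... then perturb within a bounded ball so each sample stays close
        # to the shared meaning of the pair.
        noise = torch.randn_like(center)
        noise = radius * noise / noise.norm(dim=-1, keepdim=True)
        samples.append(center + noise)
    return torch.stack(samples, dim=0)


if __name__ == "__main__":
    # Stand-ins for encoder sentence embeddings of a batch of parallel pairs.
    src = torch.randn(8, 512)
    tgt = torch.randn(8, 512)
    aug = sample_adjacent_semantics(src, tgt)
    print(aug.shape)  # torch.Size([4, 8, 512])
```

Each sampled vector effectively enlarges the training set without any discrete edits to the sentences themselves, which is the property the paper contrasts with conventional discrete augmentation.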
Related papers
- Relevance-guided Neural Machine Translation [5.691028372215281]
We propose an explainability-based training approach for Neural Machine Translation (NMT).
Our results show that our method is promising, particularly when training in low-resource conditions.
arXiv Detail & Related papers (2023-11-30T21:52:02Z)
- Extending Multilingual Machine Translation through Imitation Learning [60.15671816513614]
Imit-MNMT treats the task as an imitation learning process, which mimics the behavior of an expert.
We show that our approach significantly improves the translation performance between the new and the original languages.
We also demonstrate that our approach is capable of solving copy and off-target problems.
arXiv Detail & Related papers (2023-11-14T21:04:03Z)
- Unified Model Learning for Various Neural Machine Translation [63.320005222549646]
Existing neural machine translation (NMT) studies mainly focus on developing dataset-specific models.
We propose a "versatile" model, i.e., Unified Model Learning for NMT (UMLNMT), that works with data from different tasks.
UMLNMT yields substantial improvements over dataset-specific models with significantly reduced model deployment costs.
arXiv Detail & Related papers (2023-05-04T12:21:52Z)
- Semi-supervised Neural Machine Translation with Consistency Regularization for Low-Resource Languages [3.475371300689165]
This paper presents a simple yet effective method to tackle the problem for low-resource languages by augmenting high-quality sentence pairs and training NMT models in a semi-supervised manner.
Specifically, our approach combines the cross-entropy loss for supervised learning with a KL-divergence consistency loss for unsupervised learning on pseudo and augmented target sentences (a minimal sketch of this kind of objective follows the related-papers list).
Experimental results show that our approach significantly improves NMT baselines by 0.46-2.03 BLEU points, especially on low-resource datasets.
arXiv Detail & Related papers (2023-04-02T15:24:08Z)
- Improving Multilingual Translation by Representation and Gradient Regularization [82.42760103045083]
We propose a joint approach to regularize NMT models at both representation-level and gradient-level.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
arXiv Detail & Related papers (2021-09-10T10:52:21Z)
- SDA: Improving Text Generation with Self Data Augmentation [88.24594090105899]
We propose to improve the standard maximum likelihood estimation (MLE) paradigm by incorporating a self-imitation-learning phase for automatic data augmentation.
Unlike most existing sentence-level augmentation strategies, our method is more general and could be easily adapted to any MLE-based training procedure.
arXiv Detail & Related papers (2021-01-02T01:15:57Z)
- Synthetic Source Language Augmentation for Colloquial Neural Machine Translation [3.303435360096988]
We develop a novel colloquial Indonesian-English test set collected from YouTube transcripts and Twitter.
We perform synthetic style augmentation on the formal Indonesian source side and show that it improves the baseline Id-En models.
arXiv Detail & Related papers (2020-12-30T14:52:15Z)
- Uncertainty-Aware Semantic Augmentation for Neural Machine Translation [37.555675157198145]
We propose uncertainty-aware semantic augmentation, which explicitly captures the universal semantic information among multiple semantically-equivalent source sentences.
Our approach significantly outperforms the strong baselines and the existing methods.
arXiv Detail & Related papers (2020-10-09T07:48:09Z)
- Learning Source Phrase Representations for Neural Machine Translation [65.94387047871648]
We propose an attentive phrase representation generation mechanism which is able to generate phrase representations from corresponding token representations.
In our experiments, we obtain significant improvements on the WMT 14 English-German and English-French tasks on top of the strong Transformer baseline.
arXiv Detail & Related papers (2020-06-25T13:43:11Z)
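For the semi-supervised consistency-regularization entry above, here is a minimal sketch of an objective of that form: token-level cross-entropy on the labelled targets plus a KL-divergence consistency term between the model's output distributions for the original and the augmented (pseudo-target) views. The weight alpha, the function name semi_supervised_loss, and the choice to stop gradients through the clean view are assumptions for illustration; the cited paper's exact formulation may differ.

```python
# Sketch of a cross-entropy + KL-divergence consistency objective, assuming the
# decoder returns logits of shape (batch, seq_len, vocab) for both views.
import torch
import torch.nn.functional as F


def semi_supervised_loss(logits_clean: torch.Tensor,
                         logits_augmented: torch.Tensor,
                         target_ids: torch.Tensor,
                         alpha: float = 1.0) -> torch.Tensor:
    """Cross-entropy on labelled targets + KL consistency between two views."""
    vocab = logits_clean.size(-1)
    # Supervised term: standard token-level cross-entropy on the clean view.
    ce = F.cross_entropy(logits_clean.reshape(-1, vocab), target_ids.reshape(-1))
    # Unsupervised term: pull the augmented view's distribution toward the
    # clean view's distribution (gradients stopped through the clean view).
    log_p_clean = F.log_softmax(logits_clean, dim=-1).detach()
    log_p_aug = F.log_softmax(logits_augmented, dim=-1)
    kl = F.kl_div(log_p_aug, log_p_clean, log_target=True, reduction="batchmean")
    return ce + alpha * kl
```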