Semantically Consistent Data Augmentation for Neural Machine Translation
via Conditional Masked Language Model
- URL: http://arxiv.org/abs/2209.10875v1
- Date: Thu, 22 Sep 2022 09:19:08 GMT
- Title: Semantically Consistent Data Augmentation for Neural Machine Translation
via Conditional Masked Language Model
- Authors: Qiao Cheng, Jin Huang, Yitao Duan
- Abstract summary: This paper introduces a new data augmentation method for neural machine translation.
Our method is based on the Conditional Masked Language Model (CMLM).
We show that CMLM is capable of enforcing semantic consistency by conditioning on both source and target during substitution.
- Score: 5.756426081817803
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces a new data augmentation method for neural machine
translation that can enforce stronger semantic consistency both within and
across languages. Our method is based on the Conditional Masked Language
Model (CMLM), which is bidirectional and can condition on both left and right
context, as well as the label. We demonstrate that CMLM is a good technique for
generating context-dependent word distributions. In particular, we show that
CMLM is capable of enforcing semantic consistency by conditioning on both
source and target during substitution. In addition, to enhance diversity, we
incorporate the idea of soft word substitution for data augmentation, which
replaces a word with a probability distribution over the vocabulary.
Experiments on four translation datasets of different scales show that the
overall solution results in more realistic data augmentation and better
translation quality. Our approach consistently achieves the best performance in
comparison with strong and recent works and yields improvements of up to 1.90
BLEU points over the baseline.
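The soft word substitution step is straightforward to picture in code. Below is a minimal sketch, assuming a monolingual HuggingFace masked LM as a stand-in for the paper's CMLM (the actual model also conditions on the parallel sentence in the other language); the model name and the `soft_substitute` helper are illustrative, not from the paper.

```python
# Minimal sketch of soft word substitution. A monolingual masked LM
# stands in for the paper's CMLM, which also conditions on the
# parallel sentence in the other language.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
embeddings = mlm.get_input_embeddings()  # (vocab_size, hidden_size)

def soft_substitute(sentence: str, position: int) -> torch.Tensor:
    """Return a 'soft word': the expectation of word embeddings under
    the LM's distribution at a masked position (position indexes the
    tokenized sequence, including [CLS] at 0)."""
    batch = tokenizer(sentence, return_tensors="pt")
    input_ids = batch["input_ids"].clone()
    input_ids[0, position] = tokenizer.mask_token_id
    with torch.no_grad():
        logits = mlm(input_ids=input_ids,
                     attention_mask=batch["attention_mask"]).logits
    probs = torch.softmax(logits[0, position], dim=-1)  # P(w | context)
    return probs @ embeddings.weight  # (hidden_size,) mixture embedding

soft_vec = soft_substitute("the quick brown fox jumps over the lazy dog", 4)
print(soft_vec.shape)  # torch.Size([768])
```

During augmentation, this mixture vector replaces the hard word embedding at the chosen position, so the translation model trains on a distribution over plausible substitutes rather than a single sampled word.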
Related papers
- Deterministic Reversible Data Augmentation for Neural Machine Translation [36.10695293724949]
We propose Deterministic Reversible Data Augmentation (DRDA), a simple but effective data augmentation method for neural machine translation.
With no extra corpora or model changes required, DRDA outperforms strong baselines on several translation tasks by a clear margin.
DRDA also exhibits good robustness on noisy, low-resource, and cross-domain datasets.
arXiv Detail & Related papers (2024-06-04T17:39:23Z)
- Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing [68.47787275021567]
Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data.
We propose a new approach to cross-lingual semantic parsing by explicitly minimizing cross-lingual divergence between latent variables using Optimal Transport.
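The summary does not spell out the objective, but its core quantity, an optimal-transport divergence between two batches of latent vectors, can be sketched with the POT library; the batch size, dimension, and Sinkhorn regularizer below are arbitrary illustrations, not values from the paper.

```python
# Toy sketch: entropically regularized OT cost between latent variables
# from a high-resource and a low-resource encoder (POT: pip install pot).
import numpy as np
import ot  # Python Optimal Transport

rng = np.random.default_rng(0)
z_high = rng.normal(size=(32, 64))  # latent batch, high-resource language
z_low = rng.normal(size=(32, 64))   # latent batch, low-resource language

a = np.full(32, 1 / 32)             # uniform mass over each batch
b = np.full(32, 1 / 32)
M = ot.dist(z_high, z_low)          # pairwise squared Euclidean costs
M /= M.max()                        # normalize for numerical stability
loss = ot.sinkhorn2(a, b, M, reg=0.1)  # divergence to be minimized
print(float(loss))
```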
arXiv Detail & Related papers (2023-07-09T04:52:31Z)
- Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution [124.99894592871385]
We present a large-scale comparative study of lexical substitution methods employing both older and the most recent language models.
We show that the already competitive results achieved by SOTA LMs/MLMs can be further improved substantially if information about the target word is injected properly.
arXiv Detail & Related papers (2022-06-07T16:16:19Z)
- Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation [50.54059385277964]
We present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT).
CsaNMT augments each training instance with an adjacency region that covers adequate variants of the literal expression under the same meaning.
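CsaNMT's actual sampling procedure is more elaborate, but the idea of an adjacency region can be sketched as drawing perturbed sentence representations from a small ball around the original embedding; the radius and dimensions here are arbitrary placeholders.

```python
# Toy sketch of an "adjacency region": sample augmented sentence
# representations from a ball of given radius around the original one.
import torch

def sample_adjacency(rep: torch.Tensor, radius: float = 0.1,
                     n_samples: int = 4) -> torch.Tensor:
    """rep: (hidden,) sentence representation -> (n_samples, hidden)."""
    noise = torch.randn(n_samples, rep.size(0))
    noise = noise / noise.norm(dim=-1, keepdim=True)  # unit directions
    scale = radius * torch.rand(n_samples, 1)         # stay inside the ball
    return rep.unsqueeze(0) + scale * noise

augmented = sample_adjacency(torch.randn(512))
print(augmented.shape)  # torch.Size([4, 512])
```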
arXiv Detail & Related papers (2022-04-14T08:16:28Z)
- Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation [49.916963624249355]
A UNMT model is trained on pseudo parallel data with a translated source, yet translates natural source sentences at inference time.
This source discrepancy between training and inference hinders the translation performance of UNMT models.
We propose an online self-training approach, which simultaneously uses pseudo parallel data {natural source, translated target} to mimic the inference scenario.
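The training loop implied by this summary can be sketched with a hypothetical model interface; `translate` and `train_step` are illustrative names, not the paper's API.

```python
# Toy sketch of one online self-training step. The model translates a
# natural source sentence with its current parameters, then trains on
# the resulting {natural source, translated target} pseudo pair, which
# mimics what the model will see at inference time.
def online_self_training_step(model, natural_source: str) -> float:
    pseudo_target = model.translate(natural_source)  # current-model output
    return model.train_step(source=natural_source, target=pseudo_target)
```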
arXiv Detail & Related papers (2022-03-16T04:50:27Z)
- Improving Multilingual Translation by Representation and Gradient Regularization [82.42760103045083]
We propose a joint approach to regularize NMT models at both the representation level and the gradient level.
Our results demonstrate that our approach is highly effective in both reducing off-target translation occurrences and improving zero-shot translation performance.
arXiv Detail & Related papers (2021-09-10T10:52:21Z)
- On the Language Coverage Bias for Neural Machine Translation [81.81456880770762]
Language coverage bias is important for neural machine translation (NMT) because the target-original training data is not well exploited in current practice.
By carefully designing experiments, we provide comprehensive analyses of the language coverage bias in the training data.
We propose two simple and effective approaches to alleviate the language coverage bias problem.
arXiv Detail & Related papers (2021-06-07T01:55:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.