Beyond Noise: Mitigating the Impact of Fine-grained Semantic Divergences
on Neural Machine Translation
- URL: http://arxiv.org/abs/2105.15087v1
- Date: Mon, 31 May 2021 16:15:35 GMT
- Title: Beyond Noise: Mitigating the Impact of Fine-grained Semantic Divergences
on Neural Machine Translation
- Authors: Eleftheria Briakou and Marine Carpuat
- Abstract summary: We analyze the impact of different types of fine-grained semantic divergences on Transformer models.
We introduce a divergent-aware NMT framework that uses factors to help NMT recover from the degradation caused by naturally occurring divergences.
- Score: 14.645468999921961
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While it has been shown that Neural Machine Translation (NMT) is highly
sensitive to noisy parallel training samples, prior work treats all types of
mismatches between source and target as noise. As a result, it remains unclear
how samples that are mostly equivalent but contain a small number of
semantically divergent tokens impact NMT training. To close this gap, we
analyze the impact of different types of fine-grained semantic divergences on
Transformer models. We show that models trained on synthetic divergences output
degenerated text more frequently and are less confident in their predictions.
Based on these findings, we introduce a divergent-aware NMT framework that uses
factors to help NMT recover from the degradation caused by naturally occurring
divergences, improving both translation quality and model calibration on EN-FR
tasks.
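To make the factored approach concrete, here is a minimal sketch of how token-level divergence factors could be fed to a Transformer encoder: each source token carries a binary tag (equivalent vs. divergent) whose embedding is concatenated with the token embedding. The class name, dimensions, and the concatenation choice are illustrative assumptions, not the authors' implementation.

```python
# Sketch only: divergence factors as an extra input stream to the encoder.
# Assumes a binary tag per source token (0 = equivalent, 1 = divergent)
# produced upstream by a divergence tagger.
import torch
import torch.nn as nn

class FactoredEmbedding(nn.Module):
    def __init__(self, vocab_size: int, d_model: int = 512, factor_dim: int = 8):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model - factor_dim)
        self.div_emb = nn.Embedding(2, factor_dim)  # two factor values: EQ / DIV

    def forward(self, token_ids: torch.Tensor, div_tags: torch.Tensor) -> torch.Tensor:
        # token_ids, div_tags: (batch, src_len) -> (batch, src_len, d_model)
        return torch.cat([self.tok_emb(token_ids), self.div_emb(div_tags)], dim=-1)

# Example: a 5-token source sentence whose last two tokens are tagged divergent.
emb = FactoredEmbedding(vocab_size=32000)
tokens = torch.tensor([[11, 42, 7, 1305, 9]])
tags = torch.tensor([[0, 0, 0, 1, 1]])
print(emb(tokens, tags).shape)  # torch.Size([1, 5, 512])
```

The resulting embeddings can be passed to a standard Transformer encoder unchanged, which is what makes factors a low-overhead way to expose divergence information to the model.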
Related papers
- Towards Reliable Neural Machine Translation with Consistency-Aware Meta-Learning [24.64700139151659]
Current neural machine translation (NMT) systems suffer from a lack of reliability.
We present a consistency-aware meta-learning (CAML) framework, derived from the model-agnostic meta-learning (MAML) algorithm, to address this issue.
We conduct experiments on the NIST Chinese to English task, three WMT translation tasks, and the TED M2O task.
arXiv Detail & Related papers (2023-03-20T09:41:28Z)
- TransFool: An Adversarial Attack against Neural Machine Translation Models [49.50163349643615]
We investigate the vulnerability of Neural Machine Translation (NMT) models to adversarial attacks and propose a new attack algorithm called TransFool.
We generate fluent adversarial examples in the source language that maintain a high level of semantic similarity with the clean samples.
Based on automatic and human evaluations, TransFool leads to improvement in terms of success rate, semantic similarity, and fluency compared to the existing attacks.
arXiv Detail & Related papers (2023-02-02T08:35:34Z)
- Categorizing Semantic Representations for Neural Machine Translation [53.88794787958174]
We introduce categorization to the source contextualized representations.
The main idea is to enhance generalization by reducing sparsity and overfitting.
Experiments on a dedicated MT dataset show that our method reduces compositional generalization error rates by 24%.
arXiv Detail & Related papers (2022-10-13T04:07:08Z)
- Generating Authentic Adversarial Examples beyond Meaning-preserving with Doubly Round-trip Translation [64.16077929617119]
We propose a new criterion for NMT adversarial examples based on Doubly Round-Trip Translation (DRTT).
To enhance the robustness of the NMT model, we introduce masked language models to construct bilingual adversarial pairs.
arXiv Detail & Related papers (2022-04-19T06:15:27Z)
- Doubly-Trained Adversarial Data Augmentation for Neural Machine Translation [8.822338727711715]
We generate adversarial augmentation samples that attack the model and preserve the source-side semantic meaning.
The results from our experiments show that these adversarial samples improve the model robustness.
arXiv Detail & Related papers (2021-10-12T02:23:00Z)
- Alternated Training with Synthetic and Authentic Data for Neural Machine Translation [49.35605028467887]
We propose alternated training with synthetic and authentic data for neural machine translation (NMT).
Compared with previous work, we introduce authentic data as guidance to prevent the training of NMT models from being disturbed by noisy synthetic data.
Experiments on Chinese-English and German-English translation tasks show that our approach improves the performance over several strong baselines.
arXiv Detail & Related papers (2021-06-16T07:13:16Z)
- On Long-Tailed Phenomena in Neural Machine Translation [50.65273145888896]
State-of-the-art Neural Machine Translation (NMT) models struggle with generating low-frequency tokens.
We propose a new loss function, the Anti-Focal loss, to better adapt model training to the structural dependencies of conditional text generation.
We show the efficacy of the proposed technique on a number of Machine Translation (MT) datasets, demonstrating that it leads to significant gains over cross-entropy.
arXiv Detail & Related papers (2020-10-10T07:00:57Z)
- Uncertainty-Aware Semantic Augmentation for Neural Machine Translation [37.555675157198145]
We propose uncertainty-aware semantic augmentation, which explicitly captures the universal semantic information among multiple semantically-equivalent source sentences.
Our approach significantly outperforms the strong baselines and the existing methods.
arXiv Detail & Related papers (2020-10-09T07:48:09Z)
- The Unreasonable Volatility of Neural Machine Translation Models [5.44772285850031]
We investigate the unexpected volatility of NMT models where the input is semantically and syntactically correct.
We find that both RNN and Transformer models display volatile behavior in 26% and 19% of sentence variations, respectively.
arXiv Detail & Related papers (2020-05-25T20:54:23Z)
- On the Inference Calibration of Neural Machine Translation [54.48932804996506]
We study how calibration correlates with translation performance and which linguistic properties are associated with miscalibration.
We propose a new graduated label smoothing method, sketched below, that can improve both inference calibration and translation performance.
arXiv Detail & Related papers (2020-05-03T02:03:56Z)
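As a rough illustration of the graduated label smoothing idea from the last entry above, the sketch below applies a larger smoothing penalty to gold tokens the model already predicts with high confidence and little or none to low-confidence tokens. The thresholds and smoothing values are illustrative assumptions, not the paper's reported settings.

```python
# Sketch only: graduated label smoothing, where the smoothing weight depends
# on the model's confidence in the gold token. Thresholds/values are assumed.
import torch
import torch.nn.functional as F

def graduated_label_smoothing(logits, targets, low=0.3, high=0.7,
                              eps_low=0.0, eps_mid=0.1, eps_high=0.3):
    # logits: (num_tokens, vocab), targets: (num_tokens,)
    log_probs = F.log_softmax(logits, dim=-1)
    conf = log_probs.exp().gather(-1, targets.unsqueeze(-1)).squeeze(-1)

    # Pick a per-token smoothing weight from the gold-token confidence.
    eps = torch.full_like(conf, eps_mid)
    eps = torch.where(conf < low, torch.full_like(conf, eps_low), eps)
    eps = torch.where(conf > high, torch.full_like(conf, eps_high), eps)

    nll = -log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    uniform = -log_probs.mean(dim=-1)  # smoothing term toward the uniform distribution
    return ((1.0 - eps) * nll + eps * uniform).mean()

# Example usage on random logits for a 4-token batch over a 1000-word vocabulary.
loss = graduated_label_smoothing(torch.randn(4, 1000), torch.tensor([3, 7, 1, 42]))
print(loss.item())
```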