Addressing the Vulnerability of NMT in Input Perturbations
- URL: http://arxiv.org/abs/2104.09810v1
- Date: Tue, 20 Apr 2021 07:52:58 GMT
- Title: Addressing the Vulnerability of NMT in Input Perturbations
- Authors: Weiwen Xu, Ai Ti Aw, Yang Ding, Kui Wu, Shafiq Joty
- Abstract summary: We improve the robustness of NMT models by reducing the effect of noisy words through a Context-Enhanced Reconstruction (CER) approach.
CER trains the model to resist noise in two steps: (1) a perturbation step that breaks the naturalness of the input sequence with made-up words; (2) a reconstruction step that defends against noise propagation by generating better and more robust contextual representations.
- Score: 10.103375853643547
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural Machine Translation (NMT) has achieved significant breakthroughs in
performance but is known to be vulnerable to input perturbations. As
real input noise is difficult to predict during training, robustness is a serious
issue for system deployment. In this paper, we improve the robustness of NMT
models by reducing the effect of noisy words through a Context-Enhanced
Reconstruction (CER) approach. CER trains the model to resist noise in two
steps: (1) a perturbation step that breaks the naturalness of the input sequence
with made-up words; (2) a reconstruction step that defends against noise
propagation by generating better and more robust contextual representations. Experimental
results on Chinese-English (ZH-EN) and French-English (FR-EN) translation tasks
demonstrate robustness improvements on both news and social media text. Further
fine-tuning experiments on social media text show that our approach converges at
a higher performance level and provides better adaptation.
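To make the two-step training concrete, below is a minimal PyTorch-style sketch of a CER-like objective. The reserved made-up-token id, perturbation rate, reconstruction head, and loss weighting are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal sketch of CER-style robustness training (assumptions noted above).
import torch
import torch.nn as nn
import torch.nn.functional as F

MADE_UP_ID = 1  # hypothetical vocabulary id reserved for made-up noise tokens

def perturb(src_ids: torch.Tensor, p: float = 0.15) -> torch.Tensor:
    """Step 1 (perturbation): break the naturalness of the input by
    replacing a random fraction of tokens with a made-up word."""
    noisy = src_ids.clone()
    mask = torch.rand_like(src_ids, dtype=torch.float) < p
    noisy[mask] = MADE_UP_ID
    return noisy

class CEREncoder(nn.Module):
    def __init__(self, vocab_size: int, d_model: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        # Step 2 (reconstruction): predict the clean tokens from noisy context
        self.reconstruct = nn.Linear(d_model, vocab_size)

    def forward(self, noisy_ids: torch.Tensor) -> torch.Tensor:
        return self.reconstruct(self.encoder(self.embed(noisy_ids)))

def reconstruction_loss(model: CEREncoder, src_ids: torch.Tensor) -> torch.Tensor:
    """Recovering clean tokens from perturbed input pushes the encoder
    toward noise-robust contextual representations."""
    logits = model(perturb(src_ids))          # (batch, seq, vocab)
    return F.cross_entropy(logits.transpose(1, 2), src_ids)
```

In training, such a term would presumably be combined with the usual translation loss, e.g. `loss = nmt_loss + lambda_rec * reconstruction_loss(model, src_ids)`, where `lambda_rec` is a tunable weight (again an assumption, not the paper's reported setup).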
Related papers
- Improving the Robustness of Summarization Systems with Dual Augmentation [68.53139002203118]
A robust summarization system should be able to capture the gist of the document, regardless of the specific word choices or noise in the input.
We first explore the summarization models' robustness against perturbations including word-level synonym substitution and noise.
We propose SummAttacker, an efficient approach to generating adversarial samples based on language models.
arXiv Detail & Related papers (2023-06-01T19:04:17Z)
- Phrase-level Adversarial Example Generation for Neural Machine Translation [75.01476479100569]
We propose a phrase-level adversarial example generation (PAEG) method to enhance the robustness of the model.
We verify our method on three benchmarks, including LDC Chinese-English, IWSLT14 German-English, and WMT14 English-German tasks.
arXiv Detail & Related papers (2022-01-06T11:00:49Z)
- Frequency-Aware Contrastive Learning for Neural Machine Translation [24.336356651877388]
Low-frequency word prediction remains a challenge in modern neural machine translation (NMT) systems.
Inspired by the observation that low-frequency words form a more compact embedding space, we tackle this challenge from a representation learning perspective.
We propose a frequency-aware token-level contrastive learning method, in which the hidden state of each decoding step is pushed away from the counterparts of other target words; a hedged sketch of this idea follows this entry.
arXiv Detail & Related papers (2021-12-29T10:10:10Z)
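A hedged sketch of the token-level contrastive idea summarized above: each decoder hidden state is pulled toward states of the same target word and pushed away from those of other target words. The cosine similarity, temperature `tau`, and in-batch positive/negative definitions are assumptions for illustration, not the authors' exact frequency-aware formulation.

```python
# Sketch of a token-level contrastive term (assumptions noted above).
import torch
import torch.nn.functional as F

def token_contrastive_loss(hidden: torch.Tensor, targets: torch.Tensor,
                           tau: float = 0.1) -> torch.Tensor:
    """hidden: (N, d) decoder states gathered over a batch;
    targets: (N,) gold target token ids for those states."""
    h = F.normalize(hidden, dim=-1)
    sim = h @ h.t() / tau                                  # (N, N) similarities
    eye = torch.eye(len(targets), dtype=torch.bool, device=hidden.device)
    pos = (targets.unsqueeze(0) == targets.unsqueeze(1)) & ~eye
    # log-softmax over all other states; maximize probability of positives
    logp = sim.masked_fill(eye, float("-inf")).log_softmax(dim=-1)
    per_anchor = -logp.masked_fill(~pos, 0.0).sum(-1) / pos.sum(-1).clamp(min=1)
    has_pos = pos.any(-1)                                  # anchors with a positive
    return per_anchor[has_pos].mean() if has_pos.any() else hidden.new_zeros(())
```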
- Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction [109.44933866397123]
Noise robustness is essential for deploying automatic speech recognition systems in real-world environments.
We employ a noise-robust representation learned by a refined self-supervised framework for noisy speech recognition.
We achieve comparable performance to the best supervised approach reported with only 16% of labeled data.
arXiv Detail & Related papers (2021-10-28T20:39:02Z)
- Improving Translation Robustness with Visual Cues and Error Correction [58.97421756225425]
We introduce the idea of visual context to improve translation robustness against noisy texts.
We also propose a novel error correction training regime by treating error correction as an auxiliary task.
arXiv Detail & Related papers (2021-03-12T15:31:34Z)
- Revisiting Robust Neural Machine Translation: A Transformer Case Study [30.70732321809362]
We investigate how noise breaks Transformers and whether there exist solutions to deal with such issues.
We introduce a novel data-driven technique to incorporate noise during training.
We propose two new extensions to the original Transformer that modify the neural architecture as well as the training process to handle noise.
arXiv Detail & Related papers (2020-12-31T16:55:05Z)
- Modeling Homophone Noise for Robust Neural Machine Translation [23.022527815382862]
The framework consists of a homophone noise detector and a syllable-aware NMT model that is robust to homophone errors.
The detector identifies potential homophone errors in a textual sentence and converts them into syllables to form a mixed sequence that is then fed into the syllable-aware NMT model; a toy sketch of this mixed-sequence construction follows this entry.
arXiv Detail & Related papers (2020-12-15T16:12:04Z)
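A toy sketch of the mixed character/syllable input described above. The pinyin table is a tiny stand-in and the flagged position is hard-coded; in the paper the detector is presumably a learned component.

```python
# Toy sketch: convert flagged homophone-error tokens into syllables to form
# the mixed sequence fed to the syllable-aware NMT model (assumptions above).
TOY_PINYIN = {"他": "ta1", "她": "ta1", "在": "zai4", "再": "zai4"}  # stand-in table

def to_mixed_sequence(tokens, suspicious_idx):
    """Replace tokens flagged as potential homophone errors with their
    syllables, leaving all other tokens as characters."""
    return [TOY_PINYIN.get(t, t) if i in suspicious_idx else t
            for i, t in enumerate(tokens)]

# e.g. a detector flags position 2, where "再" appears where "在" was meant:
print(to_mixed_sequence(["我", "们", "再", "家"], {2}))
# -> ['我', '们', 'zai4', '家']
```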
- Sentence Boundary Augmentation For Neural Machine Translation Robustness [11.290581889247983]
We show that sentence boundary segmentation has the largest impact on quality, and we develop a simple data augmentation strategy to improve segmentation robustness; one plausible form of this augmentation is sketched below.
arXiv Detail & Related papers (2020-10-21T16:44:48Z)
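One plausible form of such segmentation-robustness augmentation: randomly shift a few words across the boundary between consecutive segments to simulate sentence-boundary errors. The shift range and word-level splitting are assumptions, not the paper's exact strategy.

```python
# Sketch of boundary-shift augmentation for segmentation robustness
# (assumed form, see note above).
import random

def shift_boundary(seg_a: str, seg_b: str, max_shift: int = 2):
    """Simulate a segmentation error by moving up to max_shift words
    across the boundary between two consecutive segments."""
    a, b = seg_a.split(), seg_b.split()
    k = random.randint(-max_shift, max_shift)
    if k > 0:    # pull the first k words of b back into a
        a, b = a + b[:k], b[k:]
    elif k < 0:  # push the last |k| words of a forward into b
        a, b = a[:k], a[k:] + b
    return " ".join(a), " ".join(b)

# Usage: perturb training segments on the source side to expose the model
# to segmentation errors (how targets are handled is omitted in this sketch).
```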
- On Long-Tailed Phenomena in Neural Machine Translation [50.65273145888896]
State-of-the-art Neural Machine Translation (NMT) models struggle with generating low-frequency tokens.
We propose a new loss function, the Anti-Focal loss, to better adapt model training to the structural dependencies of conditional text generation.
We show the efficacy of the proposed technique on a number of Machine Translation (MT) datasets, demonstrating that it leads to significant gains over cross-entropy.
arXiv Detail & Related papers (2020-10-10T07:00:57Z)
- Robust Unsupervised Neural Machine Translation with Adversarial Denoising Training [66.39561682517741]
Unsupervised neural machine translation (UNMT) has attracted great interest in the machine translation community.
The main advantage of UNMT lies in the ease of collecting the large amounts of training text it requires.
In this paper, we explicitly take noisy data into consideration, for the first time, to improve the robustness of UNMT-based systems.
arXiv Detail & Related papers (2020-02-28T05:17:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.