Revisiting Robust Neural Machine Translation: A Transformer Case Study
- URL: http://arxiv.org/abs/2012.15710v1
- Date: Thu, 31 Dec 2020 16:55:05 GMT
- Title: Revisiting Robust Neural Machine Translation: A Transformer Case Study
- Authors: Peyman Passban, Puneeth S.M. Saladi, Qun Liu
- Abstract summary: We investigate how noise breaks Transformers and if there exist solutions to deal with such issues.
We introduce a novel data-driven technique to incorporate noise during training.
We propose two new extensions to the original Transformer that modify both the neural architecture and the training process to handle noise.
- Score: 30.70732321809362
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformers (Vaswani et al., 2017) have brought a remarkable improvement in the performance of neural machine translation (NMT) systems, but they can be surprisingly vulnerable to noise. Accordingly, we investigate how noise breaks Transformers and whether there exist solutions to deal with such issues. There is a large body of work in the NMT literature analyzing how conventional models behave in the presence of noise, but Transformers seem understudied in this context. We therefore introduce a novel data-driven technique to incorporate noise during training, an idea comparable to the well-known fine-tuning strategy. Moreover, we propose two new extensions to the original Transformer that modify both the neural architecture and the training process to handle noise. We evaluate our techniques on English--German translation in both directions. Experimental results show that our models have a higher tolerance to noise: more specifically, they show no deterioration even when up to 10% of all test words are corrupted by noise.
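As a rough illustration of the noise-incorporation idea, the sketch below injects character-level noise into a fraction of the words in a training sentence before fine-tuning. The paper's actual noise model is data-driven, so the typo-style corruption and the `add_noise` helper here are assumptions for illustration only:

```python
import random

# Hypothetical helper: corrupt a fraction of the words in a training sentence.
# Swapping adjacent characters to simulate typos is a common proxy; the
# paper's own noise is derived from data rather than this rule.
def add_noise(sentence: str, noise_ratio: float = 0.1, seed: int = 0) -> str:
    rng = random.Random(seed)
    words = sentence.split()
    n_noisy = min(len(words), max(1, int(len(words) * noise_ratio)))
    for idx in rng.sample(range(len(words)), n_noisy):
        chars = list(words[idx])
        if len(chars) > 1:  # swap two adjacent characters
            j = rng.randrange(len(chars) - 1)
            chars[j], chars[j + 1] = chars[j + 1], chars[j]
        words[idx] = "".join(chars)
    return " ".join(words)

# Fine-tuning-style usage: mix noisy copies into the clean parallel corpus.
print(add_noise("the quick brown fox jumps over the lazy dog"))
```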
Related papers
- Pivotal Auto-Encoder via Self-Normalizing ReLU [20.76999663290342]
We formalize single hidden layer sparse auto-encoders as a transform learning problem.
We propose an optimization problem that leads to a predictive model invariant to the noise level at test time.
Our experimental results demonstrate that the trained models yield a significant improvement in stability against varying types of noise.
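A minimal sketch of a single-hidden-layer sparse auto-encoder in this transform-learning spirit. The paper's self-normalizing ReLU is not reproduced here; a plain ReLU plus an L1 sparsity penalty stands in for it, which is an assumption:

```python
import torch
import torch.nn as nn

# Single-hidden-layer sparse auto-encoder trained to reconstruct clean
# inputs from noisy ones. Dimensions and the noise level are placeholders.
class SparseAE(nn.Module):
    def __init__(self, dim: int = 64, hidden: int = 128):
        super().__init__()
        self.encoder = nn.Linear(dim, hidden)
        self.decoder = nn.Linear(hidden, dim)

    def forward(self, x):
        z = torch.relu(self.encoder(x))  # sparse code
        return self.decoder(z), z

model = SparseAE()
x = torch.randn(32, 64)
x_noisy = x + 0.1 * torch.randn_like(x)  # corrupt the input
recon, code = model(x_noisy)
loss = nn.functional.mse_loss(recon, x) + 1e-3 * code.abs().mean()
loss.backward()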
arXiv Detail & Related papers (2024-06-23T09:06:52Z) - Noisy Pair Corrector for Dense Retrieval [59.312376423104055]
We propose a novel approach called Noisy Pair Corrector (NPC)
NPC consists of a detection module and a correction module.
We conduct experiments on the text-retrieval benchmarks Natural Questions and TriviaQA, and on the code-search benchmarks StaQC and SO-DS.
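The abstract names only a detection module and a correction module, so the sketch below is a generic detect-then-correct stand-in rather than NPC itself: it flags training pairs whose scores are outliers and down-weights them in the loss. The threshold and weights are assumptions:

```python
import torch

# Hypothetical detect-then-correct pipeline: flag query-passage pairs whose
# relevance score is anomalously low (likely mislabeled), then soften their
# contribution instead of training on them at full strength.
def noisy_pair_weights(scores: torch.Tensor, z_thresh: float = -1.5) -> torch.Tensor:
    z = (scores - scores.mean()) / (scores.std() + 1e-8)
    detected = z < z_thresh            # detection: low-score outliers
    weights = torch.ones_like(scores)
    weights[detected] = 0.1            # correction: down-weight their loss
    return weights

scores = torch.tensor([0.90, 0.85, 0.88, 0.92, 0.87, 0.91, 0.89, 0.10])
print(noisy_pair_weights(scores))  # only the 0.10 pair is flagged
```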
arXiv Detail & Related papers (2023-11-07T08:27:14Z) - Learning Provably Robust Estimators for Inverse Problems via Jittering [51.467236126126366]
We investigate whether jittering, a simple regularization technique, is effective for learning worst-case robust estimators for inverse problems.
We show that jittering significantly enhances the worst-case robustness, but can be suboptimal for inverse problems beyond denoising.
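Jittering itself is simple to sketch: add random noise to the measurements the estimator sees during training. The noise level `sigma` below is an assumed hyperparameter:

```python
import torch

# Jittering: perturb the measurements during training so the estimator
# learns to be robust. sigma is tuned against the expected worst-case
# perturbation size (an assumption in this sketch).
def jitter(y: torch.Tensor, sigma: float = 0.05) -> torch.Tensor:
    return y + sigma * torch.randn_like(y)

# Inside a standard training step for an estimator f and clean target x:
#     loss = ((f(jitter(y)) - x) ** 2).mean()
```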
arXiv Detail & Related papers (2023-07-24T14:19:36Z) - Walking Noise: On Layer-Specific Robustness of Neural Architectures against Noisy Computations and Associated Characteristic Learning Dynamics [1.5184189132709105]
We discuss the implications of additive, multiplicative and mixed noise for different classification tasks and model architectures.
We propose a methodology called Walking Noise which injects layer-specific noise to measure the robustness.
We conclude with a discussion of this methodology in practice, including its use for tailored multi-execution in noisy environments.
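A hedged sketch of layer-specific noise injection using PyTorch forward hooks; the additive Gaussian mode, its scale, and the layer choice are illustrative assumptions, with the hook "walked" across layers to profile each one:

```python
import torch
import torch.nn as nn

# A forward hook perturbs the activations of one chosen layer while the
# rest of the network runs clean. Returning a value from a forward hook
# replaces that layer's output.
def make_noise_hook(scale: float):
    def hook(module, inputs, output):
        return output + scale * torch.randn_like(output)
    return hook

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 10))
target_layer = model[0]  # move this handle across layers to profile each one
handle = target_layer.register_forward_hook(make_noise_hook(scale=0.5))
logits = model(torch.randn(8, 16))  # activations of layer 0 are now noisy
handle.remove()  # restore clean computation before testing the next layer
```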
arXiv Detail & Related papers (2022-12-20T17:09:08Z) - Phrase-level Adversarial Example Generation for Neural Machine
Translation [75.01476479100569]
We propose a phrase-level adversarial example generation (PAEG) method to enhance the robustness of the model.
We verify our method on three benchmarks: the LDC Chinese-English, IWSLT14 German-English, and WMT14 English-German tasks.
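A loose sketch of phrase-level perturbation in this spirit: swap one source phrase for a candidate and keep the variant that most degrades the model. The phrase table, `loss_fn`, and random search below are stand-in assumptions; PAEG's actual candidate selection is gradient-guided and more involved:

```python
import random

# Hypothetical phrase table mapping source phrases to candidate substitutes.
phrase_table = {"machine translation": ["automatic translation", "MT"]}

def perturb(sentence: str, loss_fn, rng=random.Random(0)) -> str:
    best, best_loss = sentence, loss_fn(sentence)
    for phrase, candidates in phrase_table.items():
        if phrase in sentence:
            for cand in candidates:
                variant = sentence.replace(phrase, cand, 1)
                l = loss_fn(variant)
                if l > best_loss:  # adversarial: prefer the harder input
                    best, best_loss = variant, l
    return best

# Dummy usage with a toy loss; a real loss_fn would query the NMT model.
print(perturb("neural machine translation is robust", lambda s: len(s)))
```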
arXiv Detail & Related papers (2022-01-06T11:00:49Z) - Efficient Training of Audio Transformers with Patchout [7.073210405344709]
We propose a novel method to optimize and regularize transformers on audio spectrograms.
The proposed models achieve a new state-of-the-art performance on Audioset and can be trained on a single consumer-grade GPU.
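A simplified sketch of the idea: drop whole time and frequency patches of the spectrogram during training. The published method removes the dropped patch tokens entirely, which also shortens the transformer's input sequence; zeroing them out here is a simplification:

```python
import torch

# Drop random time frames and frequency bins from a spectrogram batch of
# shape (batch, freq_bins, time_frames). Patch counts are illustrative.
def patchout(spec: torch.Tensor, n_time: int = 2, n_freq: int = 2) -> torch.Tensor:
    out = spec.clone()
    for _ in range(n_time):
        t = int(torch.randint(0, out.shape[2], (1,)))
        out[:, :, t] = 0.0  # drop one time frame
    for _ in range(n_freq):
        f = int(torch.randint(0, out.shape[1], (1,)))
        out[:, f, :] = 0.0  # drop one frequency bin
    return out

augmented = patchout(torch.randn(4, 128, 100))
```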
arXiv Detail & Related papers (2021-10-11T08:07:50Z) - Rethinking Noise Synthesis and Modeling in Raw Denoising [75.55136662685341]
We introduce a new perspective to synthesize noise by directly sampling from the sensor's real noise.
It inherently generates accurate raw image noise for different camera sensors.
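A rough sketch of sampling real sensor noise, assuming dark frames (captures with no signal, so only noise remains) are available; the shapes and the synthetic dark-frame stand-in below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for real dark-frame captures from the target camera sensor.
dark_frames = rng.normal(0.0, 2.0, size=(10, 512, 512))

def sample_real_noise(clean_raw: np.ndarray) -> np.ndarray:
    # Crop a random noise patch from a random dark frame and add it to the
    # clean raw image: real noise statistics, no parametric model fitting.
    h, w = clean_raw.shape
    frame = dark_frames[rng.integers(len(dark_frames))]
    y = rng.integers(frame.shape[0] - h + 1)
    x = rng.integers(frame.shape[1] - w + 1)
    return clean_raw + frame[y:y + h, x:x + w]

noisy = sample_real_noise(np.zeros((256, 256)))
```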
arXiv Detail & Related papers (2021-10-10T10:45:24Z) - Addressing the Vulnerability of NMT in Input Perturbations [10.103375853643547]
We improve the robustness of NMT models by reducing the effect of noisy words through a Context-Enhanced Reconstruction (CER) approach.
CER trains the model to resist noise in two steps: (1) a step that breaks the naturalness of the input sequence with made-up words; (2) a reconstruction step that defends against noise propagation by generating better, more robust contextual representations.
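Step (1) is easy to sketch: replace a fraction of the words with made-up tokens to break the input's naturalness. The made-up-word generator below (letter shuffling) is an illustrative assumption, not CER's exact recipe:

```python
import random

# Fabricate made-up words by shuffling the letters of a few real words.
def corrupt(tokens, ratio=0.2, rng=random.Random(0)):
    out = list(tokens)
    n = max(1, int(len(out) * ratio))
    for i in rng.sample(range(len(out)), min(n, len(out))):
        if len(out[i]) > 1:
            out[i] = "".join(rng.sample(out[i], len(out[i])))  # made-up word
    return out

print(corrupt("robust translation needs noisy training data".split()))
```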
arXiv Detail & Related papers (2021-04-20T07:52:58Z) - What all do audio transformer models hear? Probing Acoustic
Representations for Language Delivery and its Structure [64.54208910952651]
We compare the audio transformer models Mockingjay and wav2vec 2.0.
We probe the audio models' understanding of textual surface, syntax, and semantic features.
We do this over exhaustive settings for native, non-native, synthetic, read and spontaneous speech datasets.
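A standard probing setup consistent with this summary: freeze the audio model, extract its representations, and train only a linear classifier on the target property. The 768-dim features and 5-class label set below are placeholder assumptions:

```python
import torch
import torch.nn as nn

# Frozen model's pooled representations and the property being probed
# (e.g. a syntactic label); both are random placeholders here.
features = torch.randn(200, 768)
labels = torch.randint(0, 5, (200,))

probe = nn.Linear(768, 5)
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(probe(features), labels)
    loss.backward()
    opt.step()
# High probe accuracy suggests the property is linearly decodable
# from the frozen model's representations.
```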
arXiv Detail & Related papers (2021-01-02T06:29:12Z) - Robust Unsupervised Neural Machine Translation with Adversarial
Denoising Training [66.39561682517741]
Unsupervised neural machine translation (UNMT) has attracted great interest in the machine translation community.
The main advantage of UNMT lies in the ease of collecting the large amounts of training text it requires.
In this paper, we are the first to explicitly take noisy data into consideration to improve the robustness of UNMT-based systems.
arXiv Detail & Related papers (2020-02-28T05:17:55Z)