Revisiting Robust Neural Machine Translation: A Transformer Case Study
- URL: http://arxiv.org/abs/2012.15710v1
- Date: Thu, 31 Dec 2020 16:55:05 GMT
- Title: Revisiting Robust Neural Machine Translation: A Transformer Case Study
- Authors: Peyman Passban, Puneeth S.M. Saladi, Qun Liu
- Abstract summary: We investigate how noise breaks Transformers and if there exist solutions to deal with such issues.
We introduce a novel data-driven technique to incorporate noise during training.
We propose two new extensions to the original Transformer that modify both the neural architecture and the training process to handle noise.
- Score: 30.70732321809362
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformers (Vaswani et al., 2017) have brought a remarkable improvement in the performance of neural machine translation (NMT) systems, but they can be surprisingly vulnerable to noise. Accordingly, we investigate how noise breaks Transformers and whether there exist solutions to deal with such issues. There is a large body of work in the NMT literature analyzing how conventional models behave in the presence of noise, but Transformers seem understudied in this context. We therefore introduce a novel data-driven technique to incorporate noise during training, an idea comparable to the well-known fine-tuning strategy. Moreover, we propose two new extensions to the original Transformer that modify both the neural architecture and the training process to handle noise. We evaluate our techniques on English--German translation in both directions. Experimental results show that our models have a higher tolerance to noise: more specifically, they show no deterioration even when up to 10% of all test words are corrupted by noise.
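As a rough illustration of the noise-incorporation idea, the sketch below injects character-level noise into a fraction of the words in a training sentence before fine-tuning. The paper's actual noise model is data-driven, so the typo-style corruption and the `add_noise` helper here are assumptions for illustration only:

```python
import random

# Hypothetical helper: corrupt a fraction of the words in a training sentence.
# Swapping adjacent characters to simulate typos is a common proxy; the
# paper's own noise is derived from data rather than this rule.
def add_noise(sentence: str, noise_ratio: float = 0.1, seed: int = 0) -> str:
    rng = random.Random(seed)
    words = sentence.split()
    n_noisy = min(len(words), max(1, int(len(words) * noise_ratio)))
    for idx in rng.sample(range(len(words)), n_noisy):
        chars = list(words[idx])
        if len(chars) > 1:  # swap two adjacent characters
            j = rng.randrange(len(chars) - 1)
            chars[j], chars[j + 1] = chars[j + 1], chars[j]
        words[idx] = "".join(chars)
    return " ".join(words)

# Fine-tuning-style usage: mix noisy copies into the clean parallel corpus.
print(add_noise("the quick brown fox jumps over the lazy dog"))
```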
Related papers
- Pivotal Auto-Encoder via Self-Normalizing ReLU [20.76999663290342]
We formalize single hidden layer sparse auto-encoders as a transform learning problem.
We propose an optimization problem that leads to a predictive model invariant to the noise level at test time.
Our experimental results demonstrate that the trained models yield a significant improvement in stability against varying types of noise.
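A minimal sketch of a single-hidden-layer sparse auto-encoder in this transform-learning spirit. The paper's self-normalizing ReLU is not reproduced here; a plain ReLU plus an L1 sparsity penalty stands in for it, which is an assumption:

```python
import torch
import torch.nn as nn

# Single-hidden-layer sparse auto-encoder trained to reconstruct clean
# inputs from noisy ones. Dimensions and the noise level are placeholders.
class SparseAE(nn.Module):
    def __init__(self, dim: int = 64, hidden: int = 128):
        super().__init__()
        self.encoder = nn.Linear(dim, hidden)
        self.decoder = nn.Linear(hidden, dim)

    def forward(self, x):
        z = torch.relu(self.encoder(x))  # sparse code
        return self.decoder(z), z

model = SparseAE()
x = torch.randn(32, 64)
x_noisy = x + 0.1 * torch.randn_like(x)  # corrupt the input
recon, code = model(x_noisy)
loss = nn.functional.mse_loss(recon, x) + 1e-3 * code.abs().mean()
loss.backward()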
arXiv Detail & Related papers (2024-06-23T09:06:52Z) - Noisy Pair Corrector for Dense Retrieval [59.312376423104055]
We propose a novel approach called Noisy Pair Corrector (NPC)
NPC consists of a detection module and a correction module.
We conduct experiments on the text-retrieval benchmarks Natural Questions and TriviaQA, and on the code-search benchmarks StaQC and SO-DS.
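The abstract names only a detection module and a correction module, so the sketch below is a generic detect-then-correct stand-in rather than NPC itself: it flags training pairs whose scores are outliers and down-weights them in the loss. The threshold and weights are assumptions:

```python
import torch

# Hypothetical detect-then-correct pipeline: flag query-passage pairs whose
# relevance score is anomalously low (likely mislabeled), then soften their
# contribution instead of training on them at full strength.
def noisy_pair_weights(scores: torch.Tensor, z_thresh: float = -1.5) -> torch.Tensor:
    z = (scores - scores.mean()) / (scores.std() + 1e-8)
    detected = z < z_thresh            # detection: low-score outliers
    weights = torch.ones_like(scores)
    weights[detected] = 0.1            # correction: down-weight their loss
    return weights

scores = torch.tensor([0.90, 0.85, 0.88, 0.92, 0.87, 0.91, 0.89, 0.10])
print(noisy_pair_weights(scores))  # only the 0.10 pair is flagged
```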
arXiv Detail & Related papers (2023-11-07T08:27:14Z) - Learning Provably Robust Estimators for Inverse Problems via Jittering [51.467236126126366]
We investigate whether jittering, a simple regularization technique, is effective for learning worst-case robust estimators for inverse problems.
We show that jittering significantly enhances the worst-case robustness, but can be suboptimal for inverse problems beyond denoising.
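Jittering itself is simple to sketch: add random noise to the measurements the estimator sees during training. The noise level `sigma` below is an assumed hyperparameter:

```python
import torch

# Jittering: perturb the measurements during training so the estimator
# learns to be robust. sigma is tuned against the expected worst-case
# perturbation size (an assumption in this sketch).
def jitter(y: torch.Tensor, sigma: float = 0.05) -> torch.Tensor:
    return y + sigma * torch.randn_like(y)

# Inside a standard training step for an estimator f and clean target x:
#     loss = ((f(jitter(y)) - x) ** 2).mean()
```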
arXiv Detail & Related papers (2023-07-24T14:19:36Z) - Walking Noise: On Layer-Specific Robustness of Neural Architectures against Noisy Computations and Associated Characteristic Learning Dynamics [1.5184189132709105]
We discuss the implications of additive, multiplicative and mixed noise for different classification tasks and model architectures.
We propose a methodology called Walking Noise which injects layer-specific noise to measure the robustness.
We conclude with a discussion of this methodology in practice, including its use for tailored multi-execution in noisy environments.
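A hedged sketch of layer-specific noise injection using PyTorch forward hooks; the additive Gaussian mode, its scale, and the layer choice are illustrative assumptions, with the hook "walked" across layers to profile each one:

```python
import torch
import torch.nn as nn

# A forward hook perturbs the activations of one chosen layer while the
# rest of the network runs clean. Returning a value from a forward hook
# replaces that layer's output.
def make_noise_hook(scale: float):
    def hook(module, inputs, output):
        return output + scale * torch.randn_like(output)
    return hook

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 10))
target_layer = model[0]  # move this handle across layers to profile each one
handle = target_layer.register_forward_hook(make_noise_hook(scale=0.5))
logits = model(torch.randn(8, 16))  # activations of layer 0 are now noisy
handle.remove()  # restore clean computation before testing the next layer
```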
arXiv Detail & Related papers (2022-12-20T17:09:08Z) - Phrase-level Adversarial Example Generation for Neural Machine
Translation [75.01476479100569]
We propose a phrase-level adversarial example generation (PAEG) method to enhance the robustness of the model.
We verify our method on three benchmarks: the LDC Chinese-English, IWSLT14 German-English, and WMT14 English-German tasks.
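A loose sketch of phrase-level perturbation in this spirit: swap one source phrase for a candidate and keep the variant that most degrades the model. The phrase table, `loss_fn`, and random search below are stand-in assumptions; PAEG's actual candidate selection is gradient-guided and more involved:

```python
import random

# Hypothetical phrase table mapping source phrases to candidate substitutes.
phrase_table = {"machine translation": ["automatic translation", "MT"]}

def perturb(sentence: str, loss_fn, rng=random.Random(0)) -> str:
    best, best_loss = sentence, loss_fn(sentence)
    for phrase, candidates in phrase_table.items():
        if phrase in sentence:
            for cand in candidates:
                variant = sentence.replace(phrase, cand, 1)
                l = loss_fn(variant)
                if l > best_loss:  # adversarial: prefer the harder input
                    best, best_loss = variant, l
    return best

# Dummy usage with a toy loss; a real loss_fn would query the NMT model.
print(perturb("neural machine translation is robust", lambda s: len(s)))
```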
arXiv Detail & Related papers (2022-01-06T11:00:49Z) - Efficient Training of Audio Transformers with Patchout [7.073210405344709]
We propose a novel method to optimize and regularize transformers on audio spectrograms.
The proposed models achieve a new state-of-the-art performance on Audioset and can be trained on a single consumer-grade GPU.
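A simplified sketch of the idea: drop whole time and frequency patches of the spectrogram during training. The published method removes the dropped patch tokens entirely, which also shortens the transformer's input sequence; zeroing them out here is a simplification:

```python
import torch

# Drop random time frames and frequency bins from a spectrogram batch of
# shape (batch, freq_bins, time_frames). Patch counts are illustrative.
def patchout(spec: torch.Tensor, n_time: int = 2, n_freq: int = 2) -> torch.Tensor:
    out = spec.clone()
    for _ in range(n_time):
        t = int(torch.randint(0, out.shape[2], (1,)))
        out[:, :, t] = 0.0  # drop one time frame
    for _ in range(n_freq):
        f = int(torch.randint(0, out.shape[1], (1,)))
        out[:, f, :] = 0.0  # drop one frequency bin
    return out

augmented = patchout(torch.randn(4, 128, 100))
```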
arXiv Detail & Related papers (2021-10-11T08:07:50Z) - Rethinking Noise Synthesis and Modeling in Raw Denoising [75.55136662685341]
We introduce a new perspective to synthesize noise by directly sampling from the sensor's real noise.
It inherently generates accurate raw image noise for different camera sensors.
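A rough sketch of sampling real sensor noise, assuming dark frames (captures with no signal, so only noise remains) are available; the shapes and the synthetic dark-frame stand-in below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for real dark-frame captures from the target camera sensor.
dark_frames = rng.normal(0.0, 2.0, size=(10, 512, 512))

def sample_real_noise(clean_raw: np.ndarray) -> np.ndarray:
    # Crop a random noise patch from a random dark frame and add it to the
    # clean raw image: real noise statistics, no parametric model fitting.
    h, w = clean_raw.shape
    frame = dark_frames[rng.integers(len(dark_frames))]
    y = rng.integers(frame.shape[0] - h + 1)
    x = rng.integers(frame.shape[1] - w + 1)
    return clean_raw + frame[y:y + h, x:x + w]

noisy = sample_real_noise(np.zeros((256, 256)))
```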
arXiv Detail & Related papers (2021-10-10T10:45:24Z) - Addressing the Vulnerability of NMT in Input Perturbations [10.103375853643547]
We improve the robustness of NMT models by reducing the effect of noisy words through a Context-Enhanced Reconstruction (CER) approach.
CER trains the model to resist noise in two steps: (1) a step that breaks the naturalness of the input sequence with made-up words; (2) a reconstruction step that defends against noise propagation by generating better, more robust contextual representations.
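Step (1) is easy to sketch: replace a fraction of the words with made-up tokens to break the input's naturalness. The made-up-word generator below (letter shuffling) is an illustrative assumption, not CER's exact recipe:

```python
import random

# Fabricate made-up words by shuffling the letters of a few real words.
def corrupt(tokens, ratio=0.2, rng=random.Random(0)):
    out = list(tokens)
    n = max(1, int(len(out) * ratio))
    for i in rng.sample(range(len(out)), min(n, len(out))):
        if len(out[i]) > 1:
            out[i] = "".join(rng.sample(out[i], len(out[i])))  # made-up word
    return out

print(corrupt("robust translation needs noisy training data".split()))
```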
arXiv Detail & Related papers (2021-04-20T07:52:58Z) - What all do audio transformer models hear? Probing Acoustic
Representations for Language Delivery and its Structure [64.54208910952651]
We compare the audio transformer models Mockingjay and wav2vec 2.0.
We probe the audio models' understanding of textual surface, syntax, and semantic features.
We do this over exhaustive settings for native, non-native, synthetic, read and spontaneous speech datasets.
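A standard probing setup consistent with this summary: freeze the audio model, extract its representations, and train only a linear classifier on the target property. The 768-dim features and 5-class label set below are placeholder assumptions:

```python
import torch
import torch.nn as nn

# Frozen model's pooled representations and the property being probed
# (e.g. a syntactic label); both are random placeholders here.
features = torch.randn(200, 768)
labels = torch.randint(0, 5, (200,))

probe = nn.Linear(768, 5)
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(probe(features), labels)
    loss.backward()
    opt.step()
# High probe accuracy suggests the property is linearly decodable
# from the frozen model's representations.
```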
arXiv Detail & Related papers (2021-01-02T06:29:12Z) - Robust Unsupervised Neural Machine Translation with Adversarial
Denoising Training [66.39561682517741]
Unsupervised neural machine translation (UNMT) has attracted great interest in the machine translation community.
The main advantage of UNMT lies in the ease of collecting the large amounts of training text it requires.
In this paper, we are the first to explicitly take noisy data into consideration to improve the robustness of UNMT-based systems.
arXiv Detail & Related papers (2020-02-28T05:17:55Z)