Adversarial Subword Regularization for Robust Neural Machine Translation
- URL: http://arxiv.org/abs/2004.14109v2
- Date: Thu, 8 Oct 2020 05:25:34 GMT
- Title: Adversarial Subword Regularization for Robust Neural Machine Translation
- Authors: Jungsoo Park, Mujeen Sung, Jinhyuk Lee, Jaewoo Kang
- Abstract summary: Exposing diverse subword segmentations to neural machine translation (NMT) models often improves the robustness of machine translation.
We present adversarial subword regularization (ADVSR) to study whether gradient signals during training can be a substitute criterion for exposing diverse subword segmentations.
- Score: 23.968624881678913
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Exposing diverse subword segmentations to neural machine translation (NMT) models often improves the robustness of machine translation, as NMT models encounter various subword candidates during training. However, the diversification of subword segmentations mostly relies on pre-trained subword language models, from which erroneous segmentations of unseen words are less likely to be sampled. In this paper, we present adversarial subword regularization (ADVSR) to study whether gradient signals during training can be a substitute criterion for exposing diverse subword segmentations. We experimentally show that our model-based adversarial samples effectively encourage NMT models to be less sensitive to segmentation errors and improve the performance of NMT models on low-resource and out-of-domain datasets.
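Since the abstract describes choosing subword segmentations adversarially, a minimal sketch may help. The snippet below is an illustration under simplifying assumptions, not the authors' implementation: it scores a handful of sampled segmentations with the current model's loss (a cheaper stand-in for the paper's gradient-signal criterion) and trains on the worst-case candidate. The names `adversarial_segmentation`, `sample_segmentations`, and `model_loss` are hypothetical hooks supplied by the surrounding training loop.

```python
from typing import Callable, List

def adversarial_segmentation(
    sentence: str,
    sample_segmentations: Callable[[str, int], List[List[str]]],
    model_loss: Callable[[List[str]], float],
    n_candidates: int = 8,
) -> List[str]:
    """Return, among sampled subword segmentations of `sentence`, the
    candidate the current model finds hardest (highest loss).

    A simplified loss-based stand-in for ADVSR's gradient-guided
    criterion; `sample_segmentations` and `model_loss` are hypothetical
    hooks, not part of the paper's released code.
    """
    candidates = sample_segmentations(sentence, n_candidates)
    if not candidates:
        # Fall back to whitespace tokens if the sampler returns nothing.
        return sentence.split()
    # Adversarial choice: train on the segmentation that currently
    # hurts the model the most, rather than a randomly sampled one.
    return max(candidates, key=model_loss)
```

With a trained SentencePiece unigram model, `sample_segmentations` could be approximated with SentencePiece's sampling API, e.g. `lambda s, n: [sp.sample_encode_as_pieces(s, -1, 0.1) for _ in range(n)]`.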
Related papers
- An Analysis of BPE Vocabulary Trimming in Neural Machine Translation [56.383793805299234]
Vocabulary trimming is a postprocessing step that replaces rare subwords with their component subwords; a sketch of the general idea appears after this list.
We show that vocabulary trimming fails to improve performance and can even cause severe degradation.
arXiv Detail & Related papers (2024-03-30T15:29:49Z)
- Subword Segmental Machine Translation: Unifying Segmentation and Target Sentence Generation [7.252933737829635]
Subword segmental machine translation (SSMT) learns to segment target sentence words while jointly learning to generate target sentences.
Experiments across 6 translation directions show that SSMT improves chrF scores for morphologically rich agglutinative languages.
arXiv Detail & Related papers (2023-05-11T17:44:29Z)
- Subword Segmental Language Modelling for Nguni Languages [7.252933737829635]
A subword segmental language model (SSLM) learns how to segment words while being trained for autoregressive language modelling.
We train our model on the 4 Nguni languages of South Africa.
Our results show that learning subword segmentation is an effective alternative to existing subword segmenters.
arXiv Detail & Related papers (2022-10-12T18:41:00Z)
- Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation [50.54059385277964]
We present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT).
CsaNMT augments each training instance with an adjacency region that covers adequate variants of literal expression under the same meaning.
arXiv Detail & Related papers (2022-04-14T08:16:28Z)
- Single Model Ensemble for Subword Regularized Models in Low-Resource Machine Translation [25.04086897886412]
Subword regularization uses multiple subword segmentations during training to improve the robustness of neural machine translation models, whereas inference typically uses only a single segmentation.
We propose an inference strategy to address this discrepancy.
Experimental results show that the proposed strategy improves the performance of models trained with subword regularization.
arXiv Detail & Related papers (2022-03-25T09:25:47Z)
- Exploring Unsupervised Pretraining Objectives for Machine Translation [99.5441395624651]
Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT).
Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence architectures, by masking parts of the input and reconstructing them in the decoder.
We compare masking with alternative objectives that produce inputs resembling real (full) sentences, by reordering and replacing words based on their context.
arXiv Detail & Related papers (2021-06-10T10:18:23Z)
- A Correspondence Variational Autoencoder for Unsupervised Acoustic Word Embeddings [50.524054820564395]
We propose a new unsupervised model for mapping a variable-duration speech segment to a fixed-dimensional representation.
The resulting acoustic word embeddings can form the basis of search, discovery, and indexing systems for low- and zero-resource languages.
arXiv Detail & Related papers (2020-12-03T19:24:42Z)
- Detecting Word Sense Disambiguation Biases in Machine Translation for Model-Agnostic Adversarial Attacks [84.61578555312288]
We introduce a method for the prediction of disambiguation errors based on statistical data properties.
We develop a simple adversarial attack strategy that minimally perturbs sentences in order to elicit disambiguation errors.
Our findings indicate that disambiguation robustness varies substantially between domains and that different models trained on the same data are vulnerable to different attacks.
arXiv Detail & Related papers (2020-11-03T17:01:44Z)
- A Comparative Study of Lexical Substitution Approaches based on Neural Language Models [117.96628873753123]
We present a large-scale comparative study of popular neural language and masked language models.
We show that the already competitive results achieved by state-of-the-art LMs/MLMs can be further improved if information about the target word is injected properly.
arXiv Detail & Related papers (2020-05-29T18:43:22Z)
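As referenced in the vocabulary-trimming entry above, here is a minimal sketch of the general trimming idea, assuming a BPE-style merge table. The `trim_vocabulary` name and the `merges`/`counts` inputs are hypothetical, and this is not necessarily the paper's exact procedure.

```python
from typing import Dict, List, Tuple

def trim_vocabulary(
    tokens: List[str],
    merges: Dict[str, Tuple[str, str]],
    counts: Dict[str, int],
    min_count: int,
) -> List[str]:
    """Replace rare merged subwords with their component subwords.

    `merges` maps each merged BPE symbol to the pair it was built from
    (recoverable from a BPE merge table); `counts` holds corpus
    frequencies. Both are assumed inputs for this sketch.
    """
    out: List[str] = []
    for tok in tokens:
        stack = [tok]
        while stack:
            t = stack.pop()
            if counts.get(t, 0) >= min_count or t not in merges:
                # Frequent enough, or atomic: keep the symbol as-is.
                out.append(t)
            else:
                # Rare merged symbol: split into its two components.
                left, right = merges[t]
                stack.append(right)  # pushed first so `left` pops first
                stack.append(left)
    return out
```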
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.