On Compositional Generalization of Neural Machine Translation
- URL: http://arxiv.org/abs/2105.14802v1
- Date: Mon, 31 May 2021 09:04:29 GMT
- Title: On Compositional Generalization of Neural Machine Translation
- Authors: Yafu Li, Yongjing Yin, Yulong Chen and Yue Zhang
- Abstract summary: We study NMT models from the perspective of compositional generalization.
We build a benchmark dataset, CoGnition, consisting of 216k clean and consistent sentence pairs.
We quantitatively analyze the effects of various factors using compound translation error rate, and demonstrate that the NMT model fails badly on compositional generalization.
- Score: 11.171958188127961
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern neural machine translation (NMT) models have achieved competitive
performance in standard benchmarks such as WMT. However, there still exist
significant issues such as robustness and domain generalization. In this
paper, we study NMT models from the perspective of compositional generalization
by building a benchmark dataset, CoGnition, consisting of 216k clean and
consistent sentence pairs. We quantitatively analyze the effects of various
factors using compound translation error rate, and demonstrate that the NMT model
fails badly on compositional generalization, although it performs remarkably
well under traditional metrics.
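The abstract reports failures in terms of compound translation error rate (CTER). Below is a minimal sketch of such a metric, assuming binary per-instance correctness judgments for each novel compound are already available; the function name, the instance-level/aggregate-level split, and the demo compounds are illustrative assumptions, not the paper's evaluation code.

```python
# Hedged sketch of a compound translation error rate (CTER) computation.
# Assumes judgments: iterable of (compound_id, translated_correctly) pairs,
# one per test instance. How correctness is judged (e.g., matching against
# reference compound translations) follows the paper and is not shown here.
from collections import defaultdict


def compound_translation_error_rate(judgments):
    """Return (instance_level_cter, aggregate_level_cter)."""
    by_compound = defaultdict(list)
    for compound, correct in judgments:
        by_compound[compound].append(correct)

    # Instance level: fraction of all test instances whose compound
    # was translated incorrectly.
    total = sum(len(v) for v in by_compound.values())
    errors = sum(sum(1 for c in v if not c) for v in by_compound.values())
    instance_cter = errors / total if total else 0.0

    # Aggregate level: a compound counts as an error if it is mistranslated
    # in at least one of the contexts it appears in.
    agg_errors = sum(1 for v in by_compound.values() if not all(v))
    aggregate_cter = agg_errors / len(by_compound) if by_compound else 0.0
    return instance_cter, aggregate_cter


if __name__ == "__main__":
    demo = [("the waiter's mother", True), ("the waiter's mother", False),
            ("his lost cat", True), ("his lost cat", True)]
    print(compound_translation_error_rate(demo))  # -> (0.25, 0.5)
```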
Related papers
- An Empirical study of Unsupervised Neural Machine Translation: analyzing
NMT output, model's behavior and sentences' contribution [5.691028372215281]
Unsupervised Neural Machine Translation (UNMT) focuses on improving NMT results under the assumption that there is no human-translated parallel data.
We focus on three very diverse languages, French, Gujarati, and Kazakh, and train bilingual NMT models, to and from English, with various levels of supervision.
arXiv Detail & Related papers (2023-12-19T20:35:08Z) - SLOG: A Structural Generalization Benchmark for Semantic Parsing [68.19511282584304]
The goal of compositional generalization benchmarks is to evaluate how well models generalize to new complex linguistic expressions.
Existing benchmarks often focus on lexical generalization, the interpretation of novel lexical items in syntactic structures familiar from training, while structural generalization, where models must interpret unfamiliar syntactic structures, is often underrepresented.
We introduce SLOG, a semantic parsing dataset that extends COGS with 17 structural generalization cases.
arXiv Detail & Related papers (2023-10-23T15:39:09Z) - Joint Dropout: Improving Generalizability in Low-Resource Neural Machine
Translation through Phrase Pair Variables [17.300004156754966]
We propose a method called Joint Dropout that addresses the challenge of low-resource neural machine translation by substituting phrases with variables.
We observe a substantial improvement in translation quality for language pairs with minimal resources, as seen in BLEU and Direct Assessment scores.
arXiv Detail & Related papers (2023-07-24T14:33:49Z) - Unified Model Learning for Various Neural Machine Translation [63.320005222549646]
Existing neural machine translation (NMT) studies mainly focus on developing dataset-specific models.
We propose a "versatile" model, i.e., the Unified Model Learning for NMT (UMLNMT), that works with data from different tasks.
UMLNMT yields substantial improvements over dataset-specific models with significantly reduced model deployment costs.
arXiv Detail & Related papers (2023-05-04T12:21:52Z) - Categorizing Semantic Representations for Neural Machine Translation [53.88794787958174]
We introduce categorization to the source contextualized representations.
The main idea is to enhance generalization by reducing sparsity and overfitting.
Experiments on a dedicated MT dataset show that our method reduces compositional generalization error rates by 24%.
arXiv Detail & Related papers (2022-10-13T04:07:08Z) - Generating Authentic Adversarial Examples beyond Meaning-preserving with
Doubly Round-trip Translation [64.16077929617119]
We propose a new criterion for NMT adversarial examples based on Doubly Round-Trip Translation (DRTT).
To enhance the robustness of the NMT model, we introduce masked language models to construct bilingual adversarial pairs.
arXiv Detail & Related papers (2022-04-19T06:15:27Z) - Learning to Generalize to More: Continuous Semantic Augmentation for
Neural Machine Translation [50.54059385277964]
We present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT).
CsaNMT augments each training instance with an adjacency region that could cover adequate variants of literal expression under the same meaning.
arXiv Detail & Related papers (2022-04-14T08:16:28Z) - Beyond Noise: Mitigating the Impact of Fine-grained Semantic Divergences
on Neural Machine Translation [14.645468999921961]
We analyze the impact of different types of fine-grained semantic divergences on Transformer models.
We introduce a divergent-aware NMT framework that uses factors to help NMT recover from the degradation caused by naturally occurring divergences.
arXiv Detail & Related papers (2021-05-31T16:15:35Z) - The Unreasonable Volatility of Neural Machine Translation Models [5.44772285850031]
We investigate the unexpected volatility of NMT models where the input is semantically and syntactically correct.
We find that both RNN and Transformer models display volatile behavior in 26% and 19% of sentence variations, respectively.
arXiv Detail & Related papers (2020-05-25T20:54:23Z) - Towards Making the Most of Context in Neural Machine Translation [112.9845226123306]
We argue that previous research did not make clear use of the global context.
We propose a new document-level NMT framework that deliberately models the local context of each sentence.
arXiv Detail & Related papers (2020-02-19T03:30:00Z)