Categorizing Semantic Representations for Neural Machine Translation
- URL: http://arxiv.org/abs/2210.06709v1
- Date: Thu, 13 Oct 2022 04:07:08 GMT
- Title: Categorizing Semantic Representations for Neural Machine Translation
- Authors: Yongjing Yin, Yafu Li, Fandong Meng, Jie Zhou, Yue Zhang
- Abstract summary: We introduce categorization to the source contextualized representations.
The main idea is to enhance generalization by reducing sparsity and overfitting.
Experiments on a dedicated MT dataset show that our method reduces the compositional generalization error rate by 24%.
- Score: 53.88794787958174
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern neural machine translation (NMT) models have achieved competitive
performance in standard benchmarks. However, they have recently been shown to
suffer limitations in compositional generalization, failing to effectively learn
the translation of atoms (e.g., words) and their semantic composition (e.g.,
modification) from seen compounds (e.g., phrases), and thus suffering from
significantly weakened translation performance on unseen compounds during
inference. We address this issue by introducing categorization to the source
contextualized representations. The main idea is to enhance generalization by
reducing sparsity and overfitting, which is achieved by finding prototypes of
token representations over the training set and integrating their embeddings
into the source encoding. Experiments on a dedicated MT dataset (i.e.,
CoGnition) show that our method reduces the compositional generalization error
rate by 24%. In addition, our conceptually simple method
gives consistently better results than the Transformer baseline on a range of
general MT datasets.
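The abstract describes the method only at a high level: find prototypes of token representations over the training set and integrate their embeddings into the source encoding. The sketch below is one plausible reading of that idea, not the authors' exact design; the module name, the hard nearest-prototype assignment, and the gated fusion are illustrative assumptions.

```python
# A minimal sketch, not the paper's exact architecture: categorize contextualized
# source states with a learned prototype table (e.g., initialized from k-means
# centroids over training-set encoder states) and fuse each token's nearest
# prototype embedding back into the source encoding with a gate.
import torch
import torch.nn as nn


class PrototypeCategorizer(nn.Module):
    def __init__(self, d_model: int, num_prototypes: int = 512):
        super().__init__()
        # Prototype (category) embeddings; in practice these could be set from
        # k-means centroids of encoder states collected over the training set.
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, d_model))
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, enc_out: torch.Tensor) -> torch.Tensor:
        # enc_out: (batch, src_len, d_model) contextualized source representations.
        flat = enc_out.reshape(-1, enc_out.size(-1))             # (batch*src_len, d_model)
        # Hard categorization: assign each token state to its nearest prototype.
        proto_ids = torch.cdist(flat, self.prototypes).argmin(dim=-1)
        proto_emb = self.prototypes[proto_ids].view_as(enc_out)  # (batch, src_len, d_model)
        # Gated integration of the prototype embedding into the source encoding.
        g = torch.sigmoid(self.gate(torch.cat([enc_out, proto_emb], dim=-1)))
        return g * enc_out + (1.0 - g) * proto_emb


# Example: fuse prototype information into dummy encoder states before decoding.
encoder_states = torch.randn(2, 7, 256)        # (batch=2, src_len=7, d_model=256)
fused = PrototypeCategorizer(d_model=256)(encoder_states)
assert fused.shape == encoder_states.shape
```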
Related papers
- Recursive Neural Networks with Bottlenecks Diagnose (Non-)Compositionality [65.60002535580298]
Quantifying compositionality of data is a challenging task, which has been investigated primarily for short utterances.
We show that comparing data's representations in models with and without a bottleneck can be used to produce a compositionality metric.
The procedure is applied to the evaluation of arithmetic expressions using synthetic data, and sentiment classification using natural language data.
arXiv Detail & Related papers (2023-01-31T15:46:39Z)
- Translate First Reorder Later: Leveraging Monotonicity in Semantic Parsing [4.396860522241306]
TPol is a two-step approach that translates input sentences monotonically and then reorders them to obtain the correct output.
We test our approach on two popular semantic parsing datasets.
arXiv Detail & Related papers (2022-10-10T17:50:42Z)
- Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation [50.54059385277964]
We present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT)
CsaNMT augments each training instance with an adjacency region that could cover adequate variants of literal expression under the same meaning.
arXiv Detail & Related papers (2022-04-14T08:16:28Z)
- Grounded Graph Decoding Improves Compositional Generalization in Question Answering [68.72605660152101]
Question answering models struggle to generalize to novel compositions of training patterns, such as longer sequences or more complex test structures.
We propose Grounded Graph Decoding, a method to improve compositional generalization of language representations by grounding structured predictions with an attention mechanism.
Our model significantly outperforms state-of-the-art baselines on the Compositional Freebase Questions (CFQ) dataset, a challenging benchmark for compositional generalization in question answering.
arXiv Detail & Related papers (2021-11-05T17:50:14Z)
- HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization [57.798070356553936]
HETFORMER is a Transformer-based pre-trained model with multi-granularity sparse attentions for extractive summarization.
Experiments on both single- and multi-document summarization tasks show that HETFORMER achieves state-of-the-art performance in ROUGE F1.
arXiv Detail & Related papers (2021-10-12T22:42:31Z)
- On Compositional Generalization of Neural Machine Translation [11.171958188127961]
We study NMT models from the perspective of compositional generalization.
We build a benchmark dataset, CoGnition, consisting of 216k clean and consistent sentence pairs.
We quantitatively analyze the effects of various factors using the compound translation error rate, and demonstrate that the NMT model fails badly on compositional generalization.
arXiv Detail & Related papers (2021-05-31T09:04:29Z)
- Modeling Coverage for Non-Autoregressive Neural Machine Translation [9.173385214565451]
We propose Coverage-NAT, which models coverage information directly through a token-level coverage iterative refinement mechanism and a sentence-level coverage agreement.
Experimental results on WMT14 En-De and WMT16 En-Ro translation tasks show that our method can alleviate those errors and achieve strong improvements over the baseline system.
arXiv Detail & Related papers (2021-04-24T07:33:23Z)
- Neural Inverse Text Normalization [11.240669509034298]
We propose an efficient and robust neural solution for inverse text normalization.
We show that this can be easily extended to other languages without the need for a linguistic expert to manually curate them.
A transformer-based model infused with pretraining consistently achieves a lower WER across several datasets.
arXiv Detail & Related papers (2021-02-12T07:53:53Z)
- A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation [53.8171136907856]
We introduce a set of simple yet effective data augmentation strategies dubbed cutoff.
cutoff relies on sampling consistency and thus adds little computational overhead; a token-level variant is sketched after this list.
cutoff consistently outperforms adversarial training and achieves state-of-the-art results on the IWSLT2014 German-English dataset.
arXiv Detail & Related papers (2020-09-29T07:08:35Z)
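For the cutoff entry above, the following is a minimal sketch of the token-level variant of the augmentation: the embeddings of randomly sampled tokens are zeroed to create augmented views of the same input. The cutoff_ratio value is an illustrative assumption, and the sampling-consistency objective mentioned in the summary (applied across the augmented views) is not shown.

```python
# A minimal sketch of token-level cutoff, one plausible variant of the strategy
# summarized above; cutoff_ratio is an illustrative assumption, and the
# consistency objective applied across the augmented views is not shown.
import torch


def token_cutoff(embeddings: torch.Tensor, cutoff_ratio: float = 0.1) -> torch.Tensor:
    """Zero out the embedding rows of randomly chosen tokens in each sentence.

    embeddings: (batch, seq_len, d_model) token embeddings of one input view.
    """
    batch, seq_len, _ = embeddings.shape
    keep = torch.rand(batch, seq_len, device=embeddings.device) > cutoff_ratio
    return embeddings * keep.unsqueeze(-1).to(embeddings.dtype)


# Example: create two augmented views of the same batch for consistency training.
emb = torch.randn(4, 12, 256)
view_a, view_b = token_cutoff(emb), token_cutoff(emb)
```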