Uncertainty-Aware Semantic Augmentation for Neural Machine Translation
- URL: http://arxiv.org/abs/2010.04411v1
- Date: Fri, 9 Oct 2020 07:48:09 GMT
- Title: Uncertainty-Aware Semantic Augmentation for Neural Machine Translation
- Authors: Xiangpeng Wei, Heng Yu, Yue Hu, Rongxiang Weng, Luxi Xing, Weihua Luo
- Abstract summary: We propose uncertainty-aware semantic augmentation, which explicitly captures the universal semantic information among multiple semantically-equivalent source sentences.
Our approach significantly outperforms strong baselines and existing methods.
- Score: 37.555675157198145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As a sequence-to-sequence generation task, neural machine translation (NMT)
naturally contains intrinsic uncertainty, where a single sentence in one
language has multiple valid counterparts in the other. However, the dominant
methods for NMT observe only one of them in the parallel corpora during model
training, yet must handle diverse valid variations of the same meaning at
inference. This leads to a discrepancy in data distribution between the
training and inference phases. To address this problem, we propose
uncertainty-aware semantic augmentation, which explicitly captures the
universal semantic information among multiple semantically-equivalent source
sentences and enhances the hidden representations with this information for
better translations. Extensive experiments on various translation tasks reveal
that our approach significantly outperforms strong baselines and existing
methods.
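To make the mechanism concrete, here is a minimal PyTorch sketch of the general idea, not the authors' exact model: `SemanticAugmenter` and the mean-pooling/gating choices are invented for illustration. Given encoder states for several semantically-equivalent sources (e.g., paraphrases or back-translations), it pools a shared semantic vector and gates it back into each sentence's hidden states.
```python
import torch
import torch.nn as nn

class SemanticAugmenter(nn.Module):
    """Illustrative only: fuse a shared semantic vector into hidden states."""
    def __init__(self, d_model: int):
        super().__init__()
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, variant_states: torch.Tensor) -> torch.Tensor:
        # variant_states: (K, seq_len, d_model), one row per equivalent source.
        # "Universal" semantics: pool over tokens, then average the K variants.
        universal = variant_states.mean(dim=1).mean(dim=0)        # (d_model,)
        expanded = universal.expand_as(variant_states)
        # Gated fusion: each position decides how much shared semantics to absorb.
        g = torch.sigmoid(self.gate(torch.cat([variant_states, expanded], dim=-1)))
        return variant_states + g * expanded

aug = SemanticAugmenter(d_model=8)
states = torch.randn(3, 5, 8)      # 3 paraphrases, 5 tokens, toy dimensions
print(aug(states).shape)           # torch.Size([3, 5, 8])
```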
Related papers
- Understanding and Addressing the Under-Translation Problem from the Perspective of Decoding Objective [72.83966378613238]
Under-translation and over-translation remain two challenging problems in state-of-the-art Neural Machine Translation (NMT) systems.
We conduct an in-depth analysis on the underlying cause of under-translation in NMT, providing an explanation from the perspective of decoding objective.
We propose employing the confidence of predicting End Of Sentence (EOS) as a detector for under-translation, and strengthening the confidence-based penalty to penalize candidates with a high risk of under-translation.
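A hedged sketch of that idea in Python (the exact penalty formula is an assumption, not the authors' formulation): hypotheses that terminate while EOS confidence is still low are treated as under-translation risks and pushed down the beam.
```python
import math

def rescore_finished(hyp_score: float, eos_prob: float,
                     threshold: float = 0.5, alpha: float = 2.0) -> float:
    """Penalize a finished hypothesis that ended with low EOS confidence."""
    if eos_prob >= threshold:
        return hyp_score                 # confident ending: keep the score
    # The lower the EOS confidence, the larger the penalty (alpha is a knob).
    return hyp_score - alpha * (math.log(threshold) - math.log(eos_prob))

print(rescore_finished(-3.2, eos_prob=0.9))    # -3.2, unchanged
print(rescore_finished(-3.2, eos_prob=0.05))   # ~-7.8, demoted in the beam
```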
arXiv Detail & Related papers (2024-05-29T09:25:49Z)
- Towards Effective Disambiguation for Machine Translation with Large Language Models [65.80775710657672]
We study the capabilities of large language models to translate "ambiguous sentences".
Experiments show that our methods can match or outperform state-of-the-art systems such as DeepL and NLLB in four out of five language directions.
arXiv Detail & Related papers (2023-09-20T22:22:52Z)
- Progressive Translation: Improving Domain Robustness of Neural Machine Translation with Intermediate Sequences [37.71415679778235]
We propose intermediate signals: intermediate sequences that move from a "source-like" structure toward a "target-like" structure.
Such intermediate sequences introduce an inductive bias that reflects a domain-agnostic principle of translation.
Experiments show that the introduced intermediate signals can effectively improve the domain robustness of NMT.
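As an illustration of what such an intermediate sequence might look like (the toy lexicon and separator token are invented; the paper's actual intermediates may differ), a word-for-word gloss keeps the source order while using target-language words:
```python
# Toy lexicon standing in for a real bilingual dictionary.
lexicon = {"ich": "I", "habe": "have", "das": "the",
           "buch": "book", "gelesen": "read"}

def make_intermediate(src_tokens):
    # "Source-like" intermediate: word-for-word gloss, source order preserved.
    return [lexicon.get(t, t) for t in src_tokens]

src = ["ich", "habe", "das", "buch", "gelesen"]
tgt = ["I", "read", "the", "book"]
# One possible progressive target: gloss, separator, then the fluent translation.
progressive_tgt = make_intermediate(src) + ["<sep>"] + tgt
print(progressive_tgt)
# ['I', 'have', 'the', 'book', 'read', '<sep>', 'I', 'read', 'the', 'book']
```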
arXiv Detail & Related papers (2023-05-16T04:15:25Z)
- Understanding and Bridging the Modality Gap for Speech Translation [11.13240570688547]
Multi-task learning is one of the effective ways to share knowledge between machine translation (MT) and end-to-end speech translation (ST).
However, due to the differences between speech and text, there is always a gap between ST and MT.
In this paper, we first aim to understand this modality gap from the target-side representation differences, and link the modality gap to another well-known problem in neural machine translation: exposure bias.
arXiv Detail & Related papers (2023-05-15T15:09:18Z)
- Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation [50.54059385277964]
We present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT).
CsaNMT augments each training instance with an adjacency region that covers adequate variants of literal expression under the same meaning.
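A minimal sketch of the adjacency-region idea (the sampling scheme and radius are assumptions, not the paper's exact definition): perturb a sentence-level semantic vector within a small ball so that one training instance covers nearby paraphrase semantics rather than the single observed surface form.
```python
import torch

def sample_adjacent(sent_repr: torch.Tensor, radius: float = 0.1,
                    n_samples: int = 4) -> torch.Tensor:
    """Draw n_samples vectors from a ball around a sentence representation."""
    noise = torch.randn(n_samples, sent_repr.size(-1))
    noise = noise / noise.norm(dim=-1, keepdim=True)   # random unit directions
    scale = torch.rand(n_samples, 1) * radius          # random radii in [0, r)
    return sent_repr.unsqueeze(0) + scale * noise      # (n_samples, d)

r = torch.randn(512)              # stand-in for an encoder sentence vector
print(sample_adjacent(r).shape)   # torch.Size([4, 512])
```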
arXiv Detail & Related papers (2022-04-14T08:16:28Z)
- Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation [49.916963624249355]
A UNMT model is trained on pseudo parallel data with translated source sentences, yet translates natural source sentences at inference.
This source discrepancy between training and inference hinders the translation performance of UNMT models.
We propose an online self-training approach, which simultaneously uses pseudo parallel data (natural source, translated target) to mimic the inference scenario.
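A rough sketch of that training-data construction (the `translate` API and `ToyModel` are stand-ins, not a real UNMT interface): alongside the usual back-translated pairs, the model's own translations of natural source sentences form pairs that look like the inference scenario.
```python
class ToyModel:
    """Stand-in for a UNMT model; a real one would decode here."""
    def translate(self, sent: str, direction: str) -> str:
        return f"<{direction}> {sent}"

def build_batches(model, natural_src, natural_tgt):
    # Standard UNMT pseudo-parallel data: translated source -> natural target.
    bt_pairs = [(model.translate(t, "tgt2src"), t) for t in natural_tgt]
    # Self-training pairs: natural source -> translated target, so the source
    # side seen in training matches the natural text seen at inference.
    st_pairs = [(s, model.translate(s, "src2tgt")) for s in natural_src]
    return bt_pairs + st_pairs

print(build_batches(ToyModel(), ["ein Beispiel"], ["an example"]))
```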
arXiv Detail & Related papers (2022-03-16T04:50:27Z)
- Uncertainty-Aware Balancing for Multilingual and Multi-Domain Neural Machine Translation Training [58.72619374790418]
MultiUAT dynamically adjusts the training data usage based on the model's uncertainty.
We analyze cross-domain transfer and show the deficiency of static and similarity-based methods.
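A hedged sketch of uncertainty-based balancing (the names and the softmax-over-uncertainty rule are illustrative assumptions): estimate per-corpus uncertainty, e.g., mean token entropy on held-out data, and sample more often from corpora where the model is least certain.
```python
import math
import random

def sampling_weights(uncertainty_per_corpus: dict, temperature: float = 1.0):
    # Softmax over uncertainties: higher uncertainty -> higher sampling rate.
    exps = {k: math.exp(u / temperature)
            for k, u in uncertainty_per_corpus.items()}
    z = sum(exps.values())
    return {k: v / z for k, v in exps.items()}

unc = {"de-en": 2.1, "fr-en": 1.3, "medical": 3.0}  # toy uncertainty estimates
weights = sampling_weights(unc)
corpus = random.choices(list(weights), weights=list(weights.values()))[0]
print(weights, "->", corpus)
```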
arXiv Detail & Related papers (2021-09-06T08:30:33Z)
- Modelling Latent Translations for Cross-Lingual Transfer [47.61502999819699]
We propose a new technique that integrates both steps of the traditional pipeline (translation and classification) into a single model.
We evaluate our novel latent translation-based model on a series of multilingual NLU tasks.
We report gains for both zero-shot and few-shot learning setups, up to 2.7 accuracy points on average.
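One way to picture the single-model integration (toy stand-ins below; the paper's actual parameterization differs): sample several translations, classify each, and average the class distributions, i.e., approximately marginalize over the latent translation instead of committing to one pipeline output.
```python
import torch

def classify_via_latent_translation(x, sample_translation, classifier, k=8):
    # Monte Carlo marginalization over latent translations of input x.
    probs = [torch.softmax(classifier(sample_translation(x)), dim=-1)
             for _ in range(k)]
    return torch.stack(probs).mean(dim=0)

# Toy stand-ins so the sketch runs: a "translator" that emits a random
# 16-dim representation, and a linear classifier over 3 classes.
sample_translation = lambda x: torch.randn(16)
classifier = torch.nn.Linear(16, 3)
print(classify_via_latent_translation("ein Beispiel", sample_translation,
                                      classifier))
```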
arXiv Detail & Related papers (2021-07-23T17:11:27Z)
- Beyond Noise: Mitigating the Impact of Fine-grained Semantic Divergences on Neural Machine Translation [14.645468999921961]
We analyze the impact of different types of fine-grained semantic divergences on Transformer models.
We introduce a divergent-aware NMT framework that uses factors to help NMT recover from the degradation caused by naturally occurring divergences.
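A minimal sketch of factor-based input (the binary equivalent/divergent labels are an assumed simplification of the paper's fine-grained factors): each source token carries a divergence factor whose embedding is added to the word embedding, letting the model learn to discount divergent spans.
```python
import torch
import torch.nn as nn

vocab_size, n_factors, d = 100, 2, 32    # factor 0 = equivalent, 1 = divergent
word_emb = nn.Embedding(vocab_size, d)
factor_emb = nn.Embedding(n_factors, d)

tokens  = torch.tensor([[5, 17, 42, 8]])          # toy source token ids
factors = torch.tensor([[0, 0, 1, 0]])            # third token diverges
encoder_input = word_emb(tokens) + factor_emb(factors)
print(encoder_input.shape)                        # torch.Size([1, 4, 32])
```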
arXiv Detail & Related papers (2021-05-31T16:15:35Z)