Progressive Translation: Improving Domain Robustness of Neural Machine
Translation with Intermediate Sequences
- URL: http://arxiv.org/abs/2305.09154v1
- Date: Tue, 16 May 2023 04:15:25 GMT
- Title: Progressive Translation: Improving Domain Robustness of Neural Machine
Translation with Intermediate Sequences
- Authors: Chaojun Wang, Yang Liu, Wai Lam
- Abstract summary: We propose intermediate signals which are intermediate sequences from the "source-like" structure to the "target-like" structure.
Such intermediate sequences introduce an inductive bias that reflects a domain-agnostic principle of translation.
Experiments show that the introduced intermediate signals can effectively improve the domain robustness of NMT.
- Score: 37.71415679778235
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Previous studies show that intermediate supervision signals benefit various
Natural Language Processing tasks. However, it is not clear whether there exist
intermediate signals that benefit Neural Machine Translation (NMT). Borrowing
techniques from Statistical Machine Translation, we propose intermediate
signals which are intermediate sequences from the "source-like" structure to
the "target-like" structure. Such intermediate sequences introduce an inductive
bias that reflects a domain-agnostic principle of translation, which reduces
spurious correlations that are harmful to out-of-domain generalisation.
Furthermore, we introduce a full-permutation multi-task learning scheme to
alleviate the spurious causal relations from intermediate sequences to the
target, which result from exposure bias. The Minimum Bayes Risk decoding algorithm is used
to pick the best candidate translation from all permutations to further improve
the performance. Experiments show that the introduced intermediate signals can
effectively improve the domain robustness of NMT and reduce the number of
hallucinations in out-of-domain translation. Further analysis shows that our
methods are especially promising in low-resource scenarios.
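To make the pipeline in the abstract concrete, the sketch below illustrates two of its steps: constructing full-permutation multi-task targets from an intermediate sequence and the final translation, and selecting among the resulting candidate translations with Minimum Bayes Risk decoding. This is a minimal sketch under stated assumptions, not the paper's implementation: the tag-based target format, the example sentences, and the n-gram overlap utility (standing in for a metric such as sentence-level BLEU) are all hypothetical.

```python
from collections import Counter
from itertools import permutations


def permutation_targets(seqs: dict[str, str]) -> list[str]:
    """Hypothetical target-side format for full-permutation multi-task
    learning: emit every ordering of the intermediate sequence(s) and the
    final translation, each prefixed with a tag (the format is an assumption)."""
    return [" ".join(f"<{name}> {text}" for name, text in order)
            for order in permutations(seqs.items())]


def ngram_f1(hyp: str, ref: str, max_n: int = 2) -> float:
    """Toy utility: average n-gram overlap F1, standing in for a proper
    metric such as sentence-level BLEU."""
    hyp_tok, ref_tok = hyp.split(), ref.split()
    total = 0.0
    for n in range(1, max_n + 1):
        h = Counter(tuple(hyp_tok[i:i + n]) for i in range(len(hyp_tok) - n + 1))
        r = Counter(tuple(ref_tok[i:i + n]) for i in range(len(ref_tok) - n + 1))
        overlap = sum((h & r).values())
        if overlap:
            prec, rec = overlap / sum(h.values()), overlap / sum(r.values())
            total += 2 * prec * rec / (prec + rec)
    return total / max_n


def mbr_select(candidates: list[str]) -> str:
    """Minimum Bayes Risk selection: return the candidate with the highest
    average utility against the other candidates (uniform weights)."""
    def expected_utility(hyp: str) -> float:
        others = [c for c in candidates if c is not hyp]
        return sum(ngram_f1(hyp, c) for c in others) / max(len(others), 1)
    return max(candidates, key=expected_utility)


# Illustrative training-target construction for one sentence pair.
print(permutation_targets({"intermediate": "committee approve budget new",
                           "target": "the committee approved the new budget"}))

# At test time, each permutation yields one candidate translation; MBR picks one.
candidates = [
    "the committee approved the new budget",
    "the committee has approved the new budget",
    "committee approved new budget",
]
print(mbr_select(candidates))
```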
Related papers
- Non-Parametric Domain Adaptation for End-to-End Speech Translation [72.37869362559212]
End-to-End Speech Translation (E2E-ST) has received increasing attention due to its potential for less error propagation, lower latency, and fewer parameters.
We propose a novel non-parametric method that leverages domain-specific text translation corpus to achieve domain adaptation for the E2E-ST system.
arXiv Detail & Related papers (2022-05-23T11:41:02Z)
- Understanding and Improving Sequence-to-Sequence Pretraining for Neural Machine Translation [48.50842995206353]
We study the impact of the jointly pretrained decoder, which is the main difference between Seq2Seq pretraining and previous encoder-based pretraining approaches for NMT.
We propose simple and effective strategies, named in-domain pretraining and input adaptation to remedy the domain and objective discrepancies.
arXiv Detail & Related papers (2022-03-16T07:36:28Z)
- Phrase-level Adversarial Example Generation for Neural Machine Translation [75.01476479100569]
We propose a phrase-level adversarial example generation (PAEG) method to enhance the robustness of the model.
We verify our method on three benchmarks, including LDC Chinese-English, IWSLT14 German-English, and WMT14 English-German tasks.
arXiv Detail & Related papers (2022-01-06T11:00:49Z)
- Modelling Latent Translations for Cross-Lingual Transfer [47.61502999819699]
We propose a new technique that integrates both steps of the traditional pipeline (translation and classification) into a single model.
We evaluate our novel latent translation-based model on a series of multilingual NLU tasks.
We report gains for both zero-shot and few-shot learning setups, up to 2.7 accuracy points on average.
arXiv Detail & Related papers (2021-07-23T17:11:27Z)
- Uncertainty-Aware Semantic Augmentation for Neural Machine Translation [37.555675157198145]
We propose uncertainty-aware semantic augmentation, which explicitly captures the universal semantic information among multiple semantically-equivalent source sentences.
Our approach significantly outperforms the strong baselines and the existing methods.
arXiv Detail & Related papers (2020-10-09T07:48:09Z)
- Learning Source Phrase Representations for Neural Machine Translation [65.94387047871648]
We propose an attentive phrase representation generation mechanism which is able to generate phrase representations from corresponding token representations.
In our experiments, we obtain significant improvements on the WMT14 English-German and English-French tasks on top of the strong Transformer baseline.
arXiv Detail & Related papers (2020-06-25T13:43:11Z)
- Study of Diffusion Normalized Least Mean M-estimate Algorithms [0.8749675983608171]
This work proposes a diffusion normalized least mean M-estimate algorithm based on the modified Huber function.
We analyze the transient, steady-state and stability behaviors of the algorithms in a unified framework.
Simulations in various impulsive noise scenarios show that the proposed algorithms are superior to some existing diffusion algorithms.
arXiv Detail & Related papers (2020-04-20T00:28:41Z)
- Robust Unsupervised Neural Machine Translation with Adversarial Denoising Training [66.39561682517741]
Unsupervised neural machine translation (UNMT) has attracted great interest in the machine translation community.
The main advantage of UNMT lies in the ease of collecting the large amounts of training text it requires.
In this paper, we explicitly take noisy data into consideration for the first time to improve the robustness of UNMT-based systems.
arXiv Detail & Related papers (2020-02-28T05:17:55Z)