Neural Machine Translation with Error Correction
- URL: http://arxiv.org/abs/2007.10681v1
- Date: Tue, 21 Jul 2020 09:41:07 GMT
- Title: Neural Machine Translation with Error Correction
- Authors: Kaitao Song, Xu Tan and Jianfeng Lu
- Abstract summary: We introduce an error correction mechanism into NMT, which corrects the error information in the previously generated tokens to better predict the next token.
Specifically, we introduce two-stream self-attention from XLNet into the NMT decoder, where the query stream is used to predict the next token.
We leverage scheduled sampling to simulate prediction errors during training.
- Score: 40.61399972202611
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural machine translation (NMT) generates the next target token
conditioned on the previous ground-truth target tokens during training, but on
the previously generated target tokens during inference. This discrepancy
between training and inference causes error propagation and hurts translation
accuracy. In this paper, we introduce an error correction mechanism into NMT,
which corrects the error information in the previously generated tokens to
better predict the next token. Specifically, we introduce two-stream
self-attention from XLNet into the NMT decoder, where the query stream is used
to predict the next token, while the content stream is used to correct the
error information from the previously predicted tokens. We leverage scheduled
sampling to simulate prediction errors during training. Experiments on three
IWSLT translation datasets and two WMT translation datasets demonstrate that
our method achieves improvements over the Transformer baseline and scheduled
sampling. Further experimental analyses also verify the effectiveness of the
proposed error correction mechanism in improving translation quality.
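The decoder mechanism described in the abstract can be pictured with a small code sketch. Below is a minimal, illustrative PyTorch sketch (not the authors' implementation) of a decoder self-attention layer with a content stream and a query stream, plus a scheduled-sampling helper that mixes gold tokens with model predictions. All class and function names, the masking details, and the hyper-parameters are assumptions for illustration; encoder cross-attention and the feed-forward sublayer are omitted.

```python
import torch
import torch.nn as nn


class TwoStreamDecoderLayer(nn.Module):
    """One decoder self-attention layer with two streams (hypothetical names).

    - Content stream: attends causally over the (possibly erroneous) previously
      predicted token states; its training target is the correct token at the
      same position, which realizes the error-correction loss.
    - Query stream: built from position embeddings only; it attends to the
      content stream up to the current position and predicts the next token.
    """

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.content_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.query_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, content, query, causal_mask):
        # Content stream: standard causal self-attention over token states.
        c, _ = self.content_attn(content, content, content, attn_mask=causal_mask)
        # Query stream: queries come from positions, keys/values from the
        # content stream, so the state at step t only sees tokens up to t
        # and is trained to emit token t+1.
        q, _ = self.query_attn(query, content, content, attn_mask=causal_mask)
        return c, q


def mix_tokens(gold, predicted, sample_prob):
    """Scheduled sampling: with probability `sample_prob`, replace a gold token
    with the model's own prediction to simulate inference-time errors."""
    use_pred = torch.rand(gold.shape, device=gold.device) < sample_prob
    return torch.where(use_pred, predicted, gold)


# Illustrative usage with toy dimensions.
B, T, D, V = 2, 6, 16, 100
tok_emb, pos_emb = nn.Embedding(V, D), nn.Embedding(T, D)
layer = TwoStreamDecoderLayer(D, n_heads=4)

gold = torch.randint(0, V, (B, T))
first_pass_pred = torch.randint(0, V, (B, T))         # stand-in for a model's own predictions
tokens = mix_tokens(gold, first_pass_pred, sample_prob=0.25)

positions = torch.arange(T).unsqueeze(0).expand(B, T)
content = tok_emb(tokens) + pos_emb(positions)        # content stream starts from token states
query = pos_emb(positions)                            # query stream starts from positions only
causal_mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)

content_out, query_out = layer(content, query, causal_mask)
# A shared output projection would map query_out[:, t] to token t+1 (next-token
# prediction) and content_out[:, t] to the correct token t (error correction).
```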
Related papers
- A Coin Has Two Sides: A Novel Detector-Corrector Framework for Chinese Spelling Correction [79.52464132360618]
Chinese Spelling Correction (CSC) stands as a foundational Natural Language Processing (NLP) task.
We introduce a novel approach based on an error detector-corrector framework.
Our detector is designed to yield two error detection results, each characterized by high precision and recall.
arXiv Detail & Related papers (2024-09-06T09:26:45Z)
- Enhancing Supervised Learning with Contrastive Markings in Neural Machine Translation Training [10.498938255717066]
Supervised learning in Neural Machine Translation (NMT) typically follows a teacher forcing paradigm.
We present a simple extension of standard maximum likelihood estimation by a contrastive marking objective.
We show that training with contrastive markings yields improvements on top of supervised learning.
arXiv Detail & Related papers (2023-07-17T11:56:32Z)
- Towards Fine-Grained Information: Identifying the Type and Location of Translation Errors [80.22825549235556]
Existing approaches cannot consider error position and type simultaneously.
We build an FG-TED model to predict addition and omission errors.
Experiments show that our model can identify both error type and position concurrently and achieves state-of-the-art results.
arXiv Detail & Related papers (2023-02-17T16:20:33Z)
- Towards Opening the Black Box of Neural Machine Translation: Source and Target Interpretations of the Transformer [1.8594711725515678]
In Neural Machine Translation (NMT), each token prediction is conditioned on the source sentence and the target prefix.
Previous work on interpretability in NMT has focused solely on source sentence token attributions.
We propose an interpretability method that tracks complete input token attributions.
arXiv Detail & Related papers (2022-05-23T20:59:14Z)
- Understanding and Improving Sequence-to-Sequence Pretraining for Neural Machine Translation [48.50842995206353]
We study the impact of the jointly pretrained decoder, which is the main difference between Seq2Seq pretraining and previous encoder-based pretraining approaches for NMT.
We propose simple and effective strategies, named in-domain pretraining and input adaptation, to remedy the domain and objective discrepancies.
arXiv Detail & Related papers (2022-03-16T07:36:28Z)
- Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction [49.25830718574892]
We present a new framework named Tail-to-Tail (TtT) non-autoregressive sequence prediction.
Most tokens are correct and can be conveyed directly from source to target, while the error positions can be estimated and corrected.
Experimental results on standard datasets, especially on the variable-length datasets, demonstrate the effectiveness of TtT in terms of sentence-level Accuracy, Precision, Recall, and F1-Measure.
arXiv Detail & Related papers (2021-06-03T05:56:57Z)
- On the Inference Calibration of Neural Machine Translation [54.48932804996506]
We study the correlation between calibration and translation performance, as well as the linguistic properties of miscalibration.
We propose a new graduated label smoothing method that can improve both inference calibration and translation performance.
arXiv Detail & Related papers (2020-05-03T02:03:56Z)
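As a rough illustration of the last related paper's idea, here is a hedged sketch of a confidence-dependent ("graduated") label smoothing loss, in which smoothing strength grows with the model's own confidence. The thresholds, smoothing values, and function name are illustrative assumptions, not necessarily that paper's exact formulation.

```python
import torch
import torch.nn.functional as F


def graduated_label_smoothing(logits, target, eps_levels=(0.0, 0.1, 0.3)):
    """Cross-entropy with label smoothing that increases with model confidence
    (illustrative choice: low/mid/high-confidence bins get eps 0.0/0.1/0.3)."""
    probs = F.softmax(logits, dim=-1)
    conf = probs.max(dim=-1).values                   # per-position confidence
    eps = torch.full_like(conf, eps_levels[1])
    eps = torch.where(conf < 1.0 / 3.0, torch.full_like(conf, eps_levels[0]), eps)
    eps = torch.where(conf > 2.0 / 3.0, torch.full_like(conf, eps_levels[2]), eps)

    log_probs = F.log_softmax(logits, dim=-1)
    nll = -log_probs.gather(-1, target.unsqueeze(-1)).squeeze(-1)   # standard NLL term
    smooth = -log_probs.mean(dim=-1)                                # uniform-smoothing term
    return ((1.0 - eps) * nll + eps * smooth).mean()


# Toy usage: 4 positions over a 10-word vocabulary.
logits = torch.randn(4, 10)
target = torch.randint(0, 10, (4,))
loss = graduated_label_smoothing(logits, target)
```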
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.