Easy Guided Decoding in Providing Suggestions for Interactive Machine
Translation
- URL: http://arxiv.org/abs/2211.07093v2
- Date: Fri, 2 Jun 2023 08:25:20 GMT
- Title: Easy Guided Decoding in Providing Suggestions for Interactive Machine
Translation
- Authors: Ke Wang, Xin Ge, Jiayi Wang, Yu Zhao, Yuqi Zhang
- Abstract summary: We propose a novel constrained decoding algorithm, namely Prefix Suffix Guided Decoding (PSGD).
PSGD improves translation quality by an average of $10.87$ BLEU and $8.62$ BLEU on the WeTS and the WMT 2022 Translation Suggestion datasets.
- Score: 14.615314828955288
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine translation technology has made great progress in recent years, but
it cannot guarantee error-free results. Human translators perform post-editing
on machine translations to correct errors in computer-aided translation
settings. To expedite the post-editing process, many works have
investigated machine translation in interactive modes, in which machines can
automatically refine the rest of a translation, constrained by the human's edits.
Translation Suggestion (TS), as an interactive mode to assist human
translators, requires machines to generate alternatives for specific incorrect
words or phrases selected by human translators. In this paper, we utilize the
parameterized objective function of neural machine translation (NMT) and
propose a novel constrained decoding algorithm, namely Prefix Suffix Guided
Decoding (PSGD), to deal with the TS problem without additional training.
Compared to the state-of-the-art lexically constrained decoding method, PSGD
improves translation quality by an average of $10.87$ BLEU and $8.62$ BLEU on
the WeTS and the WMT 2022 Translation Suggestion datasets, respectively, and
reduces decoding time overhead by an average of 63.4% when tested on the WMT
translation datasets. Furthermore, on both of the TS benchmark datasets, it
outperforms supervised learning systems trained with TS-annotated data.
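The Translation Suggestion setup described in the abstract can be illustrated with a small sketch: the text before and after the selected span (the prefix and suffix) stays fixed, and candidate replacements are ranked by a model score. The code below is not the PSGD algorithm from the paper. It is a toy brute-force search under stated assumptions: `BIGRAM_LOGP` is a hypothetical bigram table standing in for an NMT model's scores (a real system would also condition on the source sentence), and `sequence_logp` and `suggest` are illustrative names.

```python
import math
from itertools import product

# Hypothetical toy bigram log-probabilities (assumption, not from the paper).
# They stand in for the scores an NMT model would assign to target tokens.
BIGRAM_LOGP = {
    ("<s>", "the"): math.log(0.9),
    ("the", "cat"): math.log(0.5),
    ("the", "dog"): math.log(0.3),
    ("cat", "sat"): math.log(0.6),
    ("dog", "sat"): math.log(0.2),
    ("cat", "quietly"): math.log(0.1),
    ("quietly", "sat"): math.log(0.8),
    ("sat", "down"): math.log(0.7),
    ("down", "</s>"): math.log(0.9),
}
UNSEEN_LOGP = math.log(1e-6)  # penalty for bigrams missing from the table


def sequence_logp(tokens):
    """Sum of bigram log-probabilities over the padded token sequence."""
    padded = ["<s>", *tokens, "</s>"]
    return sum(BIGRAM_LOGP.get(pair, UNSEEN_LOGP) for pair in zip(padded, padded[1:]))


def suggest(prefix, suffix, vocab, max_len=2):
    """Return the infill (and its score) that best fits between prefix and suffix.

    Every candidate of 1..max_len tokens is scored by the length-normalised
    log-probability of prefix + infill + suffix; prefix and suffix never change.
    """
    best_infill, best_score = None, -math.inf
    for n in range(1, max_len + 1):
        for infill in product(vocab, repeat=n):
            tokens = [*prefix, *infill, *suffix]
            # len(tokens) + 1 is the number of bigram transitions scored.
            score = sequence_logp(tokens) / (len(tokens) + 1)
            if score > best_score:
                best_infill, best_score = list(infill), score
    return best_infill, best_score


if __name__ == "__main__":
    infill, score = suggest(["the"], ["sat", "down"], ["cat", "dog", "quietly"])
    print(infill, round(score, 3))
```

Exhaustive enumeration is only workable for a toy vocabulary; it is used here to keep the constraint (fixed prefix and suffix, free middle span) visible, not to suggest how an efficient decoder would search.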
Related papers
- A Data Selection Approach for Enhancing Low Resource Machine Translation Using Cross-Lingual Sentence Representations [0.4499833362998489]
This study focuses on the case of English-Marathi language pairs, where existing datasets are notably noisy.
To mitigate the impact of data quality issues, we propose a data filtering approach based on cross-lingual sentence representations.
Results demonstrate a significant improvement in translation quality over the baseline post-filtering with IndicSBERT.
arXiv Detail & Related papers (2024-09-04T13:49:45Z)
- An approach for mistranslation removal from popular dataset for Indic MT Task [5.4755933832880865]
We propose an algorithm to remove mistranslations from the training corpus and evaluate its performance and efficiency.
Two Indic languages (ILs), namely Hindi (HIN) and Odia (ODI), are chosen for the experiment.
The quality of the translations in the experiment is evaluated using standard metrics such as BLEU, METEOR, and RIBES.
arXiv Detail & Related papers (2024-01-12T06:37:19Z)
- On the Copying Problem of Unsupervised NMT: A Training Schedule with a Language Discriminator Loss [120.19360680963152]
Unsupervised neural machine translation (UNMT) has achieved success in many language pairs.
The copying problem, i.e., directly copying some parts of the input sentence as the translation, is common among distant language pairs.
We propose a simple but effective training schedule that incorporates a language discriminator loss.
arXiv Detail & Related papers (2023-05-26T18:14:23Z)
- The Best of Both Worlds: Combining Human and Machine Translations for Multilingual Semantic Parsing with Active Learning [50.320178219081484]
We propose an active learning approach that exploits the strengths of both human and machine translations.
An ideal utterance selection can significantly reduce the error and bias in the translated data.
arXiv Detail & Related papers (2023-05-22T05:57:47Z)
- ParroT: Translating during Chat using Large Language Models tuned with Human Translation and Feedback [90.20262941911027]
ParroT is a framework to enhance and regulate the translation abilities during chat.
Specifically, ParroT reformulates translation data into the instruction-following style.
We propose three instruction types for finetuning ParroT models, including translation instruction, contrastive instruction, and error-guided instruction.
arXiv Detail & Related papers (2023-04-05T13:12:00Z)
- Non-Parametric Online Learning from Human Feedback for Neural Machine Translation [54.96594148572804]
We study the problem of online learning with human feedback in human-in-the-loop machine translation.
Previous methods require online model updating or additional translation memory networks to achieve high-quality performance.
We propose a novel non-parametric online learning method without changing the model structure.
arXiv Detail & Related papers (2021-09-23T04:26:15Z)
- ChrEnTranslate: Cherokee-English Machine Translation Demo with Quality Estimation and Corrective Feedback [70.5469946314539]
ChrEnTranslate is an online machine translation demonstration system for translation between English and Cherokee, an endangered language.
It supports both statistical and neural translation models and provides quality estimation to inform users of reliability.
arXiv Detail & Related papers (2021-07-30T17:58:54Z)
- Computer Assisted Translation with Neural Quality Estimation and Automatic Post-Editing [18.192546537421673]
We propose an end-to-end deep learning framework for quality estimation and automatic post-editing of machine translation output.
Our goal is to provide error correction suggestions and to further relieve the burden of human translators through an interpretable model.
arXiv Detail & Related papers (2020-09-19T00:29:00Z)
- It's Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information [90.35685796083563]
Cross-mutual information (XMI) is an asymmetric information-theoretic metric of machine translation difficulty.
XMI exploits the probabilistic nature of most neural machine translation models.
We present the first systematic and controlled study of cross-lingual translation difficulties using modern neural translation systems.
arXiv Detail & Related papers (2020-05-05T17:38:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.