Leveraging GPT-4 for Automatic Translation Post-Editing
- URL: http://arxiv.org/abs/2305.14878v2
- Date: Mon, 23 Oct 2023 23:18:18 GMT
- Title: Leveraging GPT-4 for Automatic Translation Post-Editing
- Authors: Vikas Raunak, Amr Sharaf, Yiren Wang, Hany Hassan Awadallah, Arul Menezes
- Abstract summary: GPT-4 is adept at translation post-editing, producing meaningful and trustworthy edits to translations.
We improve upon state-of-the-art performance on WMT-22 English-Chinese, English-German, Chinese-English and German-English language pairs using GPT-4 based post-editing.
- Score: 23.65958978995292
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While Neural Machine Translation (NMT) represents the leading approach to
Machine Translation (MT), the outputs of NMT models still require translation
post-editing to rectify errors and enhance quality under critical settings. In
this work, we formalize the task of direct translation post-editing with Large
Language Models (LLMs) and explore the use of GPT-4 to automatically post-edit
NMT outputs across several language pairs. Our results demonstrate that GPT-4
is adept at translation post-editing, producing meaningful and trustworthy
edits to translations that help improve their general quality as well as remove
different classes of major errors in translations. In particular, human
evaluations of edit trustworthiness show that GPT-4 exhibits a large
improvement over the prior state-of-the-art LLM. Notably, we improve upon
state-of-the-art performance on WMT-22 English-Chinese, English-German,
Chinese-English and German-English language pairs using GPT-4 based
post-editing, as evaluated by state-of-the-art MT quality metrics. However, we
also show that GPT-4 could produce hallucinated edits, thereby urging caution
in its use as an expert translation post-editor.
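The direct post-editing task formalized above amounts, in its simplest form, to prompting the LLM with a source sentence and an NMT hypothesis and asking for an improved translation. A minimal sketch of that setup follows; the prompt wording, model name, and decoding settings are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of direct translation post-editing with GPT-4.
# Assumptions: the prompt template and temperature are illustrative;
# the paper's exact prompt is not reproduced here.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "You are an expert translation post-editor.\n"
    "Given a source sentence and a machine translation, output an improved "
    "translation. If the translation is already correct, return it unchanged.\n\n"
    "Source ({src_lang}): {source}\n"
    "Translation ({tgt_lang}): {hypothesis}\n"
    "Improved translation ({tgt_lang}):"
)

def post_edit(source: str, hypothesis: str, src_lang: str, tgt_lang: str) -> str:
    """Ask GPT-4 to post-edit a single NMT hypothesis."""
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0.0,  # favor deterministic, conservative edits
        messages=[{
            "role": "user",
            "content": PROMPT.format(
                src_lang=src_lang,
                tgt_lang=tgt_lang,
                source=source,
                hypothesis=hypothesis,
            ),
        }],
    )
    return response.choices[0].message.content.strip()

# Example: post-edit a German-English hypothesis.
print(post_edit("Er hat den Vertrag gekündigt.",
                "He has quit the contract.",
                "German", "English"))
```

Since the abstract notes that GPT-4 can produce hallucinated edits, outputs from a pipeline like this should still be checked against the source before use.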
Related papers
- GPT-4 vs. Human Translators: A Comprehensive Evaluation of Translation Quality Across Languages, Domains, and Expertise Levels [18.835573312027265]
This study comprehensively evaluates the translation quality of Large Language Models (LLMs) against human translators.
We find that GPT-4 performs comparably to junior translators in terms of total errors made but lags behind medium and senior translators.
arXiv Detail & Related papers (2024-07-04T05:58:04Z)
- Prompting Large Language Models with Human Error Markings for Self-Correcting Machine Translation [11.351365352611658]
Post-editing (PE) is still required to correct errors and to enhance term translation quality in specialized domains.
We present a pilot study of enhancing translation memories (TM) for the needs of correct and consistent term translation in technical domains.
arXiv Detail & Related papers (2024-06-04T12:43:47Z)
- Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations [14.149224539732913]
Machine Translation remains one of the last NLP tasks where large language models (LLMs) have not yet replaced dedicated supervised systems.
This work exploits the complementary strengths of LLMs and supervised MT by guiding LLMs to automatically post-edit MT with external feedback on its quality.
In experiments on Chinese-English, English-German, and English-Russian MQM data, we demonstrate that prompting LLMs to post-edit MT improves TER, BLEU, and COMET scores.
Fine-tuning helps integrate fine-grained feedback more effectively and further improves translation quality based on both automatic and human evaluation.
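A minimal sketch of how such external feedback might be folded into a post-editing prompt follows; the MQM record layout and template below are illustrative assumptions, not the paper's exact format.

```python
# Sketch: post-editing prompt that carries MQM-style error annotations.
# The field names and prompt template are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class MQMError:
    span: str       # erroneous substring in the hypothesis
    category: str   # e.g. "accuracy/mistranslation"
    severity: str   # e.g. "major" or "minor"

def build_feedback_prompt(source: str, hypothesis: str,
                          errors: list[MQMError]) -> str:
    """Fold fine-grained quality feedback into a post-editing prompt."""
    notes = "\n".join(
        f'- "{e.span}": {e.category} ({e.severity})' for e in errors
    )
    return (
        "Improve the translation by fixing the annotated errors.\n"
        f"Source: {source}\n"
        f"Translation: {hypothesis}\n"
        f"Annotated errors:\n{notes}\n"
        "Corrected translation:"
    )

print(build_feedback_prompt(
    "Der Bericht wurde gestern veröffentlicht.",
    "The report was released tomorrow.",
    [MQMError("tomorrow", "accuracy/mistranslation", "major")],
))
```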
arXiv Detail & Related papers (2024-04-11T15:47:10Z)
- Improving Cross-Domain Low-Resource Text Generation through LLM Post-Editing: A Programmer-Interpreter Approach [50.400999859808984]
Post-editing has proven effective in improving the quality of text generated by large language models (LLMs)
We propose a neural programmer-interpreter approach that preserves the domain generalization ability of LLMs when editing their output.
Experiments demonstrate that the programmer-interpreter significantly enhances GPT-3.5's performance in logical form-to-text conversion and low-resource machine translation.
arXiv Detail & Related papers (2024-02-07T06:13:14Z)
- Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation [50.00235162432848]
We train ALMA models with only 22K parallel sentences and 12M parameters.
The resulting model, called ALMA-R, can match or exceed the performance of the WMT competition winners and GPT-4.
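For context, the CPO objective named in the title pairs a reference-model-free, DPO-style preference term with a negative log-likelihood term on the preferred translation. The following is a summary rendering in standard notation (y_w preferred, y_l dispreferred, beta a scaling hyperparameter), not a verbatim quote from the paper:

```latex
% CPO objective, summarized: preference term plus NLL regularizer.
\mathcal{L}_{\mathrm{CPO}} = \mathcal{L}_{\mathrm{prefer}} + \mathcal{L}_{\mathrm{NLL}},
\quad
\mathcal{L}_{\mathrm{prefer}} =
  -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\Big[\log \sigma\big(
     \beta \log \pi_\theta(y_w \mid x) - \beta \log \pi_\theta(y_l \mid x)\big)\Big],
\quad
\mathcal{L}_{\mathrm{NLL}} =
  -\,\mathbb{E}_{(x,\,y_w)}\big[\log \pi_\theta(y_w \mid x)\big]
```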
arXiv Detail & Related papers (2024-01-16T15:04:51Z)
- Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-Editing [12.843274390224853]
Large Language Models (LLMs) have demonstrated considerable success in various Natural Language Processing tasks.
We show that they have yet to attain state-of-the-art performance in Neural Machine Translation.
We propose adapting LLMs as Automatic Post-Editors (APE) rather than direct translators.
arXiv Detail & Related papers (2023-10-23T12:22:15Z)
- ParroT: Translating during Chat using Large Language Models tuned with Human Translation and Feedback [90.20262941911027]
ParroT is a framework to enhance and regulate the translation abilities of LLMs during chat.
Specifically, ParroT reformulates translation data into the instruction-following style.
We propose three instruction types for finetuning ParroT models, including translation instruction, contrastive instruction, and error-guided instruction.
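These three instruction types might be rendered as instruction-following training records along the following lines; the field names and templates are illustrative assumptions, not ParroT's exact formats.

```python
# Sketch of ParroT-style instruction records (field names and wording
# are illustrative assumptions; see the paper for the exact templates).

def translation_instruction(src: str, tgt: str) -> dict:
    # Plain translation instruction.
    return {
        "instruction": "Translate the following sentence from German to English.",
        "input": src,
        "output": tgt,
    }

def contrastive_instruction(src: str, preferred: str, rejected: str) -> dict:
    # Pair a better and a worse hypothesis so the model learns a preference.
    return {
        "instruction": "Which of the two translations is better? Output it.",
        "input": f"{src}\n(a) {preferred}\n(b) {rejected}",
        "output": preferred,
    }

def error_guided_instruction(src: str, draft: str, error_note: str,
                             tgt: str) -> dict:
    # Point the model at a known error type in the draft translation.
    return {
        "instruction": f"Improve the draft; it contains a {error_note} error.",
        "input": f"{src}\nDraft: {draft}",
        "output": tgt,
    }

# Example record for error-guided tuning.
print(error_guided_instruction(
    "Die Sitzung wurde vertagt.",
    "The meeting was cancelled.",
    "mistranslation",
    "The meeting was adjourned."))
```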
arXiv Detail & Related papers (2023-04-05T13:12:00Z)
- Document-Level Machine Translation with Large Language Models [91.03359121149595]
Large language models (LLMs) can produce coherent, cohesive, relevant, and fluent answers for various natural language processing (NLP) tasks.
This paper provides an in-depth evaluation of LLMs' ability on discourse modeling.
arXiv Detail & Related papers (2023-04-05T03:49:06Z)
- DivEMT: Neural Machine Translation Post-Editing Effort Across Typologically Diverse Languages [5.367993194110256]
DivEMT is the first publicly available post-editing study of Neural Machine Translation (NMT) over a typologically diverse set of target languages.
We assess the impact of two state-of-the-art NMT systems, Google Translate and the open-source multilingual model mBART50, on translation productivity.
arXiv Detail & Related papers (2022-05-24T17:22:52Z)
- SJTU-NICT's Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation Task [111.91077204077817]
We participated in four translation directions of three language pairs: English-Chinese, English-Polish, and German-Upper Sorbian.
Based on different conditions of language pairs, we have experimented with diverse neural machine translation (NMT) techniques.
In our submissions, the primary systems won first place in the English-to-Chinese, Polish-to-English, and German-to-Upper-Sorbian translation directions.
arXiv Detail & Related papers (2020-10-11T00:40:05Z)
- Explicit Reordering for Neural Machine Translation [50.70683739103066]
In Transformer-based neural machine translation (NMT), the positional encoding mechanism helps the self-attention networks learn source representations with order dependency.
We propose a novel reordering method to explicitly model this reordering information for the Transformer-based NMT.
The empirical results on the WMT14 English-to-German, WAT ASPEC Japanese-to-English, and WMT17 Chinese-to-English translation tasks show the effectiveness of the proposed approach.
arXiv Detail & Related papers (2020-04-08T05:28:46Z)