Leveraging GPT-4 for Automatic Translation Post-Editing
- URL: http://arxiv.org/abs/2305.14878v2
- Date: Mon, 23 Oct 2023 23:18:18 GMT
- Title: Leveraging GPT-4 for Automatic Translation Post-Editing
- Authors: Vikas Raunak, Amr Sharaf, Yiren Wang, Hany Hassan Awadallah, Arul Menezes
- Abstract summary: GPT-4 is adept at translation post-editing, producing meaningful and trustworthy edits to translations.
We improve upon state-of-the-art performance on WMT-22 English-Chinese, English-German, Chinese-English and German-English language pairs using GPT-4 based post-editing.
- Score: 23.65958978995292
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While Neural Machine Translation (NMT) represents the leading approach to
Machine Translation (MT), the outputs of NMT models still require translation
post-editing to rectify errors and enhance quality under critical settings. In
this work, we formalize the task of direct translation post-editing with Large
Language Models (LLMs) and explore the use of GPT-4 to automatically post-edit
NMT outputs across several language pairs. Our results demonstrate that GPT-4
is adept at translation post-editing, producing meaningful and trustworthy
edits to translations that help improve their general quality as well as remove
different classes of major errors in translations. In particular, human
evaluations of edit trustworthiness show that GPT-4 exhibits a large
improvement over the prior state-of-the-art LLM. Notably, we improve upon
state-of-the-art performance on WMT-22 English-Chinese, English-German,
Chinese-English and German-English language pairs using GPT-4 based
post-editing, as evaluated by state-of-the-art MT quality metrics. However, we
also show that GPT-4 could produce hallucinated edits, thereby urging caution
in its use as an expert translation post-editor.
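The direct post-editing task formalized above amounts, in its simplest form, to prompting the LLM with a source sentence and an NMT hypothesis and asking for an improved translation. A minimal sketch of that setup follows; the prompt wording, model name, and decoding settings are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of direct translation post-editing with GPT-4.
# Assumptions: the prompt template and temperature are illustrative;
# the paper's exact prompt is not reproduced here.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "You are an expert translation post-editor.\n"
    "Given a source sentence and a machine translation, output an improved "
    "translation. If the translation is already correct, return it unchanged.\n\n"
    "Source ({src_lang}): {source}\n"
    "Translation ({tgt_lang}): {hypothesis}\n"
    "Improved translation ({tgt_lang}):"
)

def post_edit(source: str, hypothesis: str, src_lang: str, tgt_lang: str) -> str:
    """Ask GPT-4 to post-edit a single NMT hypothesis."""
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0.0,  # favor deterministic, conservative edits
        messages=[{
            "role": "user",
            "content": PROMPT.format(
                src_lang=src_lang,
                tgt_lang=tgt_lang,
                source=source,
                hypothesis=hypothesis,
            ),
        }],
    )
    return response.choices[0].message.content.strip()

# Example: post-edit a German-English hypothesis.
print(post_edit("Er hat den Vertrag gekündigt.",
                "He has quit the contract.",
                "German", "English"))
```

Since the abstract notes that GPT-4 can produce hallucinated edits, outputs from a pipeline like this should still be checked against the source before use.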
Related papers
- GPT-4 vs. Human Translators: A Comprehensive Evaluation of Translation Quality Across Languages, Domains, and Expertise Levels [18.835573312027265]
This study comprehensively evaluates the translation quality of Large Language Models (LLMs) against human translators.
We find that GPT-4 performs comparably to junior translators in terms of total errors made but lags behind medium and senior translators.
arXiv Detail & Related papers (2024-07-04T05:58:04Z)
- Prompting Large Language Models with Human Error Markings for Self-Correcting Machine Translation [11.351365352611658]
Post-editing (PE) is still required to correct errors and to enhance term translation quality in specialized domains.
We present a pilot study of enhancing translation memories (TM) for the needs of correct and consistent term translation in technical domains.
arXiv Detail & Related papers (2024-06-04T12:43:47Z)
- Guiding Large Language Models to Post-Edit Machine Translation with Error Annotations [14.149224539732913]
Machine Translation remains one of the last NLP tasks where large language models (LLMs) have not yet replaced dedicated supervised systems.
This work exploits the complementary strengths of LLMs and supervised MT by guiding LLMs to automatically post-edit MT with external feedback on its quality.
In experiments on Chinese-English, English-German, and English-Russian MQM data, we demonstrate that prompting LLMs to post-edit MT improves TER, BLEU, and COMET scores.
Fine-tuning helps integrate fine-grained feedback more effectively and further improves translation quality based on both automatic and human evaluation.
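A minimal sketch of how such external feedback might be folded into a post-editing prompt follows; the MQM record layout and template below are illustrative assumptions, not the paper's exact format.

```python
# Sketch: post-editing prompt that carries MQM-style error annotations.
# The field names and prompt template are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class MQMError:
    span: str       # erroneous substring in the hypothesis
    category: str   # e.g. "accuracy/mistranslation"
    severity: str   # e.g. "major" or "minor"

def build_feedback_prompt(source: str, hypothesis: str,
                          errors: list[MQMError]) -> str:
    """Fold fine-grained quality feedback into a post-editing prompt."""
    notes = "\n".join(
        f'- "{e.span}": {e.category} ({e.severity})' for e in errors
    )
    return (
        "Improve the translation by fixing the annotated errors.\n"
        f"Source: {source}\n"
        f"Translation: {hypothesis}\n"
        f"Annotated errors:\n{notes}\n"
        "Corrected translation:"
    )

print(build_feedback_prompt(
    "Der Bericht wurde gestern veröffentlicht.",
    "The report was released tomorrow.",
    [MQMError("tomorrow", "accuracy/mistranslation", "major")],
))
```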
arXiv Detail & Related papers (2024-04-11T15:47:10Z)
- Improving Cross-Domain Low-Resource Text Generation through LLM Post-Editing: A Programmer-Interpreter Approach [50.400999859808984]
Post-editing has proven effective in improving the quality of text generated by large language models (LLMs)
We propose a neural programmer-interpreter approach that preserves the domain generalization ability of LLMs when editing their output.
Experiments demonstrate that the programmer-interpreter significantly enhances GPT-3.5's performance in logical form-to-text conversion and low-resource machine translation.
arXiv Detail & Related papers (2024-02-07T06:13:14Z)
- Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation [50.00235162432848]
We train ALMA models with only 22K parallel sentences and 12M parameters.
The resulting model, called ALMA-R, can match or exceed the performance of the WMT competition winners and GPT-4.
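For context, the CPO objective named in the title pairs a reference-model-free, DPO-style preference term with a negative log-likelihood term on the preferred translation. The following is a summary rendering in standard notation (y_w preferred, y_l dispreferred, beta a scaling hyperparameter), not a verbatim quote from the paper:

```latex
% CPO objective, summarized: preference term plus NLL regularizer.
\mathcal{L}_{\mathrm{CPO}} = \mathcal{L}_{\mathrm{prefer}} + \mathcal{L}_{\mathrm{NLL}},
\quad
\mathcal{L}_{\mathrm{prefer}} =
  -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\Big[\log \sigma\big(
     \beta \log \pi_\theta(y_w \mid x) - \beta \log \pi_\theta(y_l \mid x)\big)\Big],
\quad
\mathcal{L}_{\mathrm{NLL}} =
  -\,\mathbb{E}_{(x,\,y_w)}\big[\log \pi_\theta(y_w \mid x)\big]
```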
arXiv Detail & Related papers (2024-01-16T15:04:51Z)
- Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-Editing [12.843274390224853]
Large Language Models (LLMs) have demonstrated considerable success in various Natural Language Processing tasks.
We show that they have yet to attain state-of-the-art performance in Neural Machine Translation.
We propose adapting LLMs as Automatic Post-Editors (APE) rather than direct translators.
arXiv Detail & Related papers (2023-10-23T12:22:15Z)
- ParroT: Translating during Chat using Large Language Models tuned with Human Translation and Feedback [90.20262941911027]
ParroT is a framework to enhance and regulate the translation abilities of LLMs during chat.
Specifically, ParroT reformulates translation data into the instruction-following style.
We propose three instruction types for finetuning ParroT models, including translation instruction, contrastive instruction, and error-guided instruction.
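These three instruction types might be rendered as instruction-following training records along the following lines; the field names and templates are illustrative assumptions, not ParroT's exact formats.

```python
# Sketch of ParroT-style instruction records (field names and wording
# are illustrative assumptions; see the paper for the exact templates).

def translation_instruction(src: str, tgt: str) -> dict:
    # Plain translation instruction.
    return {
        "instruction": "Translate the following sentence from German to English.",
        "input": src,
        "output": tgt,
    }

def contrastive_instruction(src: str, preferred: str, rejected: str) -> dict:
    # Pair a better and a worse hypothesis so the model learns a preference.
    return {
        "instruction": "Which of the two translations is better? Output it.",
        "input": f"{src}\n(a) {preferred}\n(b) {rejected}",
        "output": preferred,
    }

def error_guided_instruction(src: str, draft: str, error_note: str,
                             tgt: str) -> dict:
    # Point the model at a known error type in the draft translation.
    return {
        "instruction": f"Improve the draft; it contains a {error_note} error.",
        "input": f"{src}\nDraft: {draft}",
        "output": tgt,
    }

# Example record for error-guided tuning.
print(error_guided_instruction(
    "Die Sitzung wurde vertagt.",
    "The meeting was cancelled.",
    "mistranslation",
    "The meeting was adjourned."))
```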
arXiv Detail & Related papers (2023-04-05T13:12:00Z)
- Document-Level Machine Translation with Large Language Models [91.03359121149595]
Large language models (LLMs) can produce coherent, cohesive, relevant, and fluent answers for various natural language processing (NLP) tasks.
This paper provides an in-depth evaluation of LLMs' ability on discourse modeling.
arXiv Detail & Related papers (2023-04-05T03:49:06Z)
- DivEMT: Neural Machine Translation Post-Editing Effort Across Typologically Diverse Languages [5.367993194110256]
DivEMT is the first publicly available post-editing study of Neural Machine Translation (NMT) over a typologically diverse set of target languages.
We assess the impact of two state-of-the-art NMT systems, Google Translate and the open-source multilingual model mBART50, on translation productivity.
arXiv Detail & Related papers (2022-05-24T17:22:52Z)
- SJTU-NICT's Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation Task [111.91077204077817]
We participated in four translation directions of three language pairs: English-Chinese, English-Polish, and German-Upper Sorbian.
Based on different conditions of language pairs, we have experimented with diverse neural machine translation (NMT) techniques.
In our submissions, the primary systems won first place in the English-to-Chinese, Polish-to-English, and German-to-Upper-Sorbian translation directions.
arXiv Detail & Related papers (2020-10-11T00:40:05Z)
- Explicit Reordering for Neural Machine Translation [50.70683739103066]
In Transformer-based neural machine translation (NMT), the positional encoding mechanism helps the self-attention networks learn source representations with order dependency.
We propose a novel reordering method to explicitly model this reordering information for the Transformer-based NMT.
The empirical results on the WMT14 English-to-German, WAT ASPEC Japanese-to-English, and WMT17 Chinese-to-English translation tasks show the effectiveness of the proposed approach.
arXiv Detail & Related papers (2020-04-08T05:28:46Z)