(Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts
- URL: http://arxiv.org/abs/2405.11804v1
- Date: Mon, 20 May 2024 05:55:08 GMT
- Title: (Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts
- Authors: Minghao Wu, Yulin Yuan, Gholamreza Haffari, Longyue Wang
- Abstract summary: We introduce a novel multi-agent framework based on large language models (LLMs) for literary translation, implemented as a company called TransAgents.
To evaluate the effectiveness of our system, we propose two innovative evaluation strategies: Monolingual Human Preference (MHP) and Bilingual LLM Preference (BLP).
- Score: 52.18246881218829
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in machine translation (MT) have significantly enhanced translation quality across various domains. However, the translation of literary texts remains a formidable challenge due to their complex language, figurative expressions, and cultural nuances. In this work, we introduce a novel multi-agent framework based on large language models (LLMs) for literary translation, implemented as a company called TransAgents, which mirrors the traditional translation publication process by leveraging the collective capabilities of multiple agents to address the intricate demands of translating literary works. To evaluate the effectiveness of our system, we propose two innovative evaluation strategies: Monolingual Human Preference (MHP) and Bilingual LLM Preference (BLP). MHP assesses translations from the perspective of monolingual readers of the target language, while BLP uses advanced LLMs to compare translations directly with the original texts. Empirical findings indicate that despite lower d-BLEU scores, translations from TransAgents are preferred by both human evaluators and LLMs over human-written references, particularly in genres requiring domain-specific knowledge. We also highlight the strengths and limitations of TransAgents through case studies and suggest directions for future research.
Related papers
- How Good Are LLMs for Literary Translation, Really? Literary Translation Evaluation with Humans and LLMs [23.247387152595067]
LITEVAL-CORPUS is a parallel corpus comprising multiple verified human translations and outputs from 9 machine translation systems.
We find that Multidimensional Quality Metrics (MQM), as the de facto standard in non-literary human MT evaluation, is inadequate for literary translation.
arXiv Detail & Related papers (2024-10-24T12:48:03Z)
- LLM-based Translation Inference with Iterative Bilingual Understanding [45.00660558229326]
We propose a novel Iterative Bilingual Understanding Translation method based on the cross-lingual capabilities of large language models (LLMs).
The cross-lingual capability of LLMs enables the generation of contextual understanding for both the source and target languages separately.
The proposed IBUT outperforms several strong comparison methods.
arXiv Detail & Related papers (2024-10-16T13:21:46Z)
- DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory [96.35468670508476]
We introduce DelTA, a Document-levEL Translation Agent for large language models (LLMs).
DelTA features a multi-level memory structure that stores information across various granularities and spans.
Experimental results indicate that DelTA significantly outperforms strong baselines in terms of translation consistency and quality.
arXiv Detail & Related papers (2024-10-10T17:30:09Z)
- Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning [57.323716555996114]
Off-target translation remains an unsolved problem, especially for low-resource languages.
Recent works have either designed advanced prompting strategies to highlight the functionality of translation instructions or exploited the in-context learning ability of LLMs.
In this work, we design a two-stage fine-tuning algorithm to improve the instruction-following ability (especially the translation direction) of LLMs.
arXiv Detail & Related papers (2024-03-21T13:47:40Z)
- Towards Effective Disambiguation for Machine Translation with Large Language Models [65.80775710657672]
We study the capabilities of large language models to translate "ambiguous sentences".
Experiments show that our methods can match or outperform state-of-the-art systems such as DeepL and NLLB in four out of five language directions.
arXiv Detail & Related papers (2023-09-20T22:22:52Z)
- Exploring Human-Like Translation Strategy with Large Language Models [93.49333173279508]
Large language models (LLMs) have demonstrated impressive capabilities in general scenarios.
This work proposes the MAPS framework, which stands for Multi-Aspect Prompting and Selection.
We employ a selection mechanism based on quality estimation to filter out noisy and unhelpful knowledge.
arXiv Detail & Related papers (2023-05-06T19:03:12Z)
- Large language models effectively leverage document-level context for literary translation, but critical errors persist [32.54546652197316]
Large language models (LLMs) are competitive with the state of the art on a wide range of sentence-level translation datasets.
We show through a rigorous human evaluation that asking the GPT-3.5 (text-davinci-003) LLM to translate an entire literary paragraph results in higher-quality translations.
arXiv Detail & Related papers (2023-04-06T17:27:45Z)
- Exploring Document-Level Literary Machine Translation with Parallel Paragraphs from World Literature [35.1398797683712]
We show that literary translators prefer reference human translations over machine-translated paragraphs at a rate of 84%.
We train a post-editing model whose output is preferred over normal MT output at a rate of 69% by experts.
arXiv Detail & Related papers (2022-10-25T18:03:34Z)
- On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation [55.02832094101173]
Evaluation of cross-lingual encoders is usually performed either via zero-shot cross-lingual transfer in supervised downstream tasks or via unsupervised cross-lingual similarity.
This paper concerns reference-free machine translation (MT) evaluation, where we directly compare source texts to (sometimes low-quality) system translations.
We systematically investigate a range of metrics based on state-of-the-art cross-lingual semantic representations obtained with pretrained M-BERT and LASER.
We find that they perform poorly as semantic encoders for reference-free MT evaluation and identify their two key limitations.
arXiv Detail & Related papers (2020-05-03T22:10:23Z)