Source-primed Multi-turn Conversation Helps Large Language Models Translate Documents
- URL: http://arxiv.org/abs/2503.10494v1
- Date: Thu, 13 Mar 2025 15:57:50 GMT
- Title: Source-primed Multi-turn Conversation Helps Large Language Models Translate Documents
- Authors: Hanxu Hu, Jannis Vamvas, Rico Sennrich
- Abstract summary: We study a simple method for handling document-level machine translation, by leveraging previous contexts in a multi-turn conversational manner. This method ensures coherent translations without additional training, and can fully re-use the KV cache of previous turns. We empirically show this multi-turn method outperforms both translating entire documents in a single turn and translating each segment independently.
- Score: 47.34053408385208
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: LLMs have paved the way for truly simple document-level machine translation, but challenges such as omission errors remain. In this paper, we study a simple method for handling document-level machine translation by leveraging previous contexts in a multi-turn conversational manner. Specifically, by decomposing documents into segments and iteratively translating them while maintaining previous turns, this method ensures coherent translations without additional training, and can fully re-use the KV cache of previous turns, thus minimizing computational overhead. We further propose a `source-primed' method that provides the whole source document before the multi-turn translation begins. We empirically show that this multi-turn method outperforms both translating entire documents in a single turn and translating each segment independently, according to multiple automatic metrics and across representative LLMs, establishing a strong baseline for document-level translation using LLMs.
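The multi-turn setup described in the abstract can be expressed as an ordinary chat loop. The sketch below is a minimal illustration under stated assumptions, not the authors' code: it assumes a hypothetical chat_complete(messages) helper standing in for any chat-style LLM API, and it splits the document into segments at blank lines, which may differ from the paper's segmentation. The source-primed variant sends the whole source document in an initial turn, then translates segment by segment while keeping all previous turns in the message history, which is what allows a serving stack to reuse the KV cache of the shared prefix.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]

def chat_complete(messages: List[Message]) -> str:
    """Hypothetical helper: call a chat-style LLM API and return the reply text."""
    raise NotImplementedError("plug in your LLM client here")

def source_primed_translate(document: str, src: str = "English", tgt: str = "German",
                            complete: Callable[[List[Message]], str] = chat_complete) -> str:
    # Split the document into segments (here: paragraphs separated by blank lines).
    segments = [s for s in document.split("\n\n") if s.strip()]

    # Source priming: show the entire source document before any translation turn.
    messages: List[Message] = [
        {"role": "system", "content": f"You are a professional {src}-to-{tgt} translator."},
        {"role": "user", "content": f"Here is the full source document for context:\n\n{document}"},
        {"role": "assistant", "content": "Understood. Send the segments to translate one by one."},
    ]

    translations = []
    for segment in segments:
        # Each segment is a new user turn; all previous turns stay in the history,
        # so the prefix (and its KV cache) is shared across turns.
        messages.append({"role": "user", "content": f"Translate this segment into {tgt}:\n\n{segment}"})
        reply = complete(messages)
        messages.append({"role": "assistant", "content": reply})
        translations.append(reply)

    return "\n\n".join(translations)
```

In this framing, the single-turn baseline corresponds to sending the whole document in one user turn, and the segment-independent baseline corresponds to starting a fresh message list for every segment.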
Related papers
- Multilingual Contextualization of Large Language Models for Document-Level Machine Translation [30.005159724115824]
Large language models (LLMs) have demonstrated strong performance in sentence-level machine translation.
We propose a method to improve LLM-based long-document translation through targeted fine-tuning on high-quality document-level data.
Our approach supports multiple translation paradigms, including direct document-to-document and chunk-level translation.
arXiv Detail & Related papers (2025-04-16T14:52:22Z) - Improving LLM-based Document-level Machine Translation with Multi-Knowledge Fusion [21.533772761328656]
We propose an enhanced approach that incorporates multiple sources of knowledge, including both document summarization and entity translation.
Our approach achieves an average improvement of 0.8, 0.6, and 0.4 COMET scores over the baseline without extra knowledge.
arXiv Detail & Related papers (2025-03-15T14:18:45Z) - Instruction-Tuned LLMs Succeed in Document-Level MT Without Fine-Tuning -- But BLEU Turns a Blind Eye [15.987448306012167]
Large language models (LLMs) have excelled in various NLP tasks, including machine translation (MT).
This work investigates the inherent capability of instruction-tuned LLMs for document-level translation (docMT).
arXiv Detail & Related papers (2024-10-28T11:49:58Z) - Context-Aware or Context-Insensitive? Assessing LLMs' Performance in Document-Level Translation [10.174848090916669]
Large language models (LLMs) are increasingly strong contenders in machine translation. We focus on document-level translation, where some words cannot be translated without context from outside the sentence.
arXiv Detail & Related papers (2024-10-18T11:52:10Z) - DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory [96.35468670508476]
We introduce DelTA, a Document-levEL Translation Agent for large language models (LLMs). DelTA features a multi-level memory structure that stores information across various granularities and spans. Experimental results indicate that DelTA significantly outperforms strong baselines in terms of translation consistency and quality.
arXiv Detail & Related papers (2024-10-10T17:30:09Z) - Adapting Large Language Models for Document-Level Machine Translation [46.370862171452444]
Large language models (LLMs) have significantly advanced various natural language processing (NLP) tasks.
Recent research indicates that moderately-sized LLMs often outperform larger ones after task-specific fine-tuning.
This study focuses on adapting LLMs for document-level machine translation (DocMT) for specific language pairs.
arXiv Detail & Related papers (2024-01-12T09:29:13Z) - Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-Editing [12.843274390224853]
Large Language Models (LLMs) have demonstrated considerable success in various Natural Language Processing tasks.
We show that they have yet to attain state-of-the-art performance in Neural Machine Translation.
We propose adapting LLMs as Automatic Post-Editors (APE) rather than direct translators.
arXiv Detail & Related papers (2023-10-23T12:22:15Z) - Principled Paraphrase Generation with Parallel Corpora [52.78059089341062]
We formalize the implicit similarity function induced by round-trip Machine Translation.
We show that it is susceptible to non-paraphrase pairs sharing a single ambiguous translation.
We design an alternative similarity metric that mitigates this issue.
arXiv Detail & Related papers (2022-05-24T17:22:42Z) - Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task [95.06453182273027]
This report describes Microsoft's machine translation systems for the WMT21 shared task on large-scale multilingual machine translation.
Our model submissions to the shared task were initialized with DeltaLM (https://aka.ms/deltalm), a generic pre-trained multilingual encoder-decoder model.
Our final submissions ranked first on three tracks in terms of the automatic evaluation metric.
arXiv Detail & Related papers (2021-11-03T09:16:17Z) - Document-level Neural Machine Translation with Document Embeddings [82.4684444847092]
This work focuses on exploiting detailed document-level context in terms of multiple forms of document embeddings.
The proposed document-aware NMT is implemented to enhance the Transformer baseline by introducing both global and local document-level clues on the source end.
arXiv Detail & Related papers (2020-09-16T19:43:29Z) - Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation [59.191079800436114]
Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence.
We propose a new framework to model cross-sentence dependencies by training neural machine translation (NMT) to predict both the target translation and surrounding sentences of a source sentence.
arXiv Detail & Related papers (2020-03-30T03:38:01Z)