Improving LLM-based Document-level Machine Translation with Multi-Knowledge Fusion
- URL: http://arxiv.org/abs/2503.12152v1
- Date: Sat, 15 Mar 2025 14:18:45 GMT
- Title: Improving LLM-based Document-level Machine Translation with Multi-Knowledge Fusion
- Authors: Bin Liu, Xinglin Lyu, Junhui Li, Daimeng Wei, Min Zhang, Shimin Tao, Hao Yang
- Abstract summary: We propose an enhanced approach by incorporating multiple sources of knowledge, including both document summarization and entity translation. Our approach achieves an average improvement of 0.8, 0.6, and 0.4 COMET scores over the baseline without extra knowledge.
- Score: 21.533772761328656
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies in prompting large language models (LLMs) for document-level machine translation (DMT) primarily focus on the inter-sentence context by flattening the source document into a long sequence. This approach relies solely on the sequence of sentences within the document. However, the complexity of document-level sequences is greater than that of shorter sentence-level sequences, which may limit the LLM's ability in DMT when only this single source of knowledge is used. In this paper, we propose an enhanced approach that incorporates multiple sources of knowledge, including both document summarization and entity translation, to enhance the performance of LLM-based DMT. Given a source document, we first obtain its summarization and entity translations via an LLM as additional knowledge. We then utilize the LLM to generate two translations of the source document, each fused with one of these knowledge sources. Finally, recognizing that different sources of knowledge may aid or hinder the translation of different sentences, we refine and rank the translations by leveraging a multi-knowledge fusion strategy to ensure the best results. Experimental results on eight document-level translation tasks show that our approach achieves an average improvement of 0.8, 0.6, and 0.4 COMET scores over the baseline without extra knowledge for LLaMA3-8B-Instruct, Mistral-Nemo-Instruct, and GPT-4o-mini, respectively.
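A minimal sketch of the pipeline described in the abstract, assuming a generic text-in/text-out LLM callable; the function name, prompt wording, and the single-pass fusion step are illustrative assumptions, not the authors' exact prompts or their refine-and-rank procedure:

```python
from typing import Callable


def translate_with_multi_knowledge(
    llm: Callable[[str], str],  # assumed interface: prompt in, completion out
    src_doc: str,
    src_lang: str = "German",
    tgt_lang: str = "English",
) -> str:
    """Illustrative multi-knowledge DMT pipeline: summary + entity knowledge, then fusion."""
    # Knowledge source 1: a short summary of the source document.
    summary = llm(
        f"Summarize the following {src_lang} document in one short paragraph:\n{src_doc}"
    )
    # Knowledge source 2: translations of the document's named entities.
    entity_glossary = llm(
        f"List the named entities in the following {src_lang} document and give their "
        f"{tgt_lang} translations as 'source -> target' pairs:\n{src_doc}"
    )

    # One candidate translation per knowledge source.
    cand_summary = llm(
        f"Document summary:\n{summary}\n\nUsing the summary as context, translate the "
        f"following document from {src_lang} to {tgt_lang}:\n{src_doc}"
    )
    cand_entity = llm(
        f"Entity glossary:\n{entity_glossary}\n\nUsing the glossary for entity names, translate "
        f"the following document from {src_lang} to {tgt_lang}:\n{src_doc}"
    )

    # Fusion: each knowledge source may help or hurt different sentences, so ask the
    # LLM to keep the better rendering per sentence (a simplification of the paper's
    # refine-and-rank strategy).
    return llm(
        "Two candidate translations of the same document are given, one guided by a "
        "summary and one by an entity glossary. For each sentence, keep the better "
        f"rendering and output the final {tgt_lang} translation.\n\n"
        f"Source ({src_lang}):\n{src_doc}\n\n"
        f"Candidate A (summary-guided):\n{cand_summary}\n\n"
        f"Candidate B (entity-guided):\n{cand_entity}\n"
    )
```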
Related papers
- Two Intermediate Translations Are Better Than One: Fine-tuning LLMs for Document-level Translation Refinement [19.513243503109035]
Large language models (LLMs) can enhance translation quality through self-refinement.
We build on this idea by extending the refinement from sentence-level to document-level translation.
Since sentence-to-sentence (Sent2Sent) and Doc2Doc translation address different aspects of the translation process, we propose fine-tuning LLMs for translation refinement.
arXiv Detail & Related papers (2025-04-08T02:08:07Z)
- Source-primed Multi-turn Conversation Helps Large Language Models Translate Documents [47.34053408385208]
We study a simple method for handling document-level machine translation, by leveraging previous contexts in a multi-turn conversational manner. This method ensures coherent translations without additional training, and can fully re-use the KV cache of previous turns. We empirically show this multi-turn method outperforms both translating entire documents in a single turn and translating each segment independently.
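A rough sketch of what such a source-primed multi-turn setup could look like with a chat-style API; the message roles, prompt wording, and the `chat` callable are assumptions for illustration, not taken from the paper:

```python
from typing import Callable, Dict, List


def multiturn_translate(
    chat: Callable[[List[Dict[str, str]]], str],  # assumed chat API: messages in, reply out
    segments: List[str],
    tgt_lang: str = "English",
) -> List[str]:
    """Source-primed multi-turn translation: prime with the full document, then translate per segment."""
    # Prime the conversation with the whole source document so later turns can use
    # full-document context (and an engine can reuse the KV cache of earlier turns).
    messages: List[Dict[str, str]] = [
        {"role": "user", "content": "Full source document:\n" + "\n".join(segments)},
        {"role": "assistant", "content": "Ready. Send the segments to translate one at a time."},
    ]
    translations: List[str] = []
    for seg in segments:
        messages.append({"role": "user", "content": f"Translate this segment into {tgt_lang}:\n{seg}"})
        reply = chat(messages)
        # Keep the reply in context so later turns see both the document and prior translations.
        messages.append({"role": "assistant", "content": reply})
        translations.append(reply)
    return translations
```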
arXiv Detail & Related papers (2025-03-13T15:57:50Z)
- Retrieval-Augmented Machine Translation with Unstructured Knowledge [74.84236945680503]
Retrieval-augmented generation (RAG) introduces additional information to enhance large language models (LLMs). In machine translation (MT), previous work typically retrieves in-context examples from paired MT corpora, or domain-specific knowledge from knowledge graphs. In this paper, we study retrieval-augmented MT using unstructured documents.
arXiv Detail & Related papers (2024-12-05T17:00:32Z)
- Fine-Grained and Multi-Dimensional Metrics for Document-Level Machine Translation [15.987448306012167]
Large language models (LLMs) have excelled in various NLP tasks, including machine translation (MT). This work investigates the inherent capability of instruction-tuned LLMs for document-level translation (docMT).
arXiv Detail & Related papers (2024-10-28T11:49:58Z)
- Context-Aware or Context-Insensitive? Assessing LLMs' Performance in Document-Level Translation [10.174848090916669]
Large language models (LLMs) are increasingly strong contenders in machine translation.
We focus on document-level translation, where some words cannot be translated without context from outside the sentence.
arXiv Detail & Related papers (2024-10-18T11:52:10Z)
- Cool-Fusion: Fuse Large Language Models without Training [73.17551121242602]
Cool-Fusion is a method that does not require any training, unlike ensemble approaches.
Cool-Fusion increases accuracy over three strong source LLMs by a significant 8%-17.8%.
arXiv Detail & Related papers (2024-07-29T09:02:19Z)
- TasTe: Teaching Large Language Models to Translate through Self-Reflection [82.83958470745381]
Large language models (LLMs) have exhibited remarkable performance in various natural language processing tasks.
We propose the TasTe framework, which stands for translating through self-reflection.
The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods.
arXiv Detail & Related papers (2024-06-12T17:21:21Z)
- Adapting Large Language Models for Document-Level Machine Translation [46.370862171452444]
Large language models (LLMs) have significantly advanced various natural language processing (NLP) tasks.
Recent research indicates that moderately-sized LLMs often outperform larger ones after task-specific fine-tuning.
This study focuses on adapting LLMs for document-level machine translation (DocMT) for specific language pairs.
arXiv Detail & Related papers (2024-01-12T09:29:13Z)
- Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis [103.89753784762445]
Large language models (LLMs) have demonstrated remarkable potential in handling multilingual machine translation (MMT).
This paper systematically investigates the advantages and challenges of LLMs for MMT.
We thoroughly evaluate eight popular LLMs, including ChatGPT and GPT-4.
arXiv Detail & Related papers (2023-04-10T15:51:30Z)
- Document-level Neural Machine Translation with Document Embeddings [82.4684444847092]
This work focuses on exploiting detailed document-level context in terms of multiple forms of document embeddings.
The proposed document-aware NMT is implemented to enhance the Transformer baseline by introducing both global and local document-level clues on the source end.
arXiv Detail & Related papers (2020-09-16T19:43:29Z)