Multilingual Contextualization of Large Language Models for Document-Level Machine Translation
- URL: http://arxiv.org/abs/2504.12140v1
- Date: Wed, 16 Apr 2025 14:52:22 GMT
- Title: Multilingual Contextualization of Large Language Models for Document-Level Machine Translation
- Authors: Miguel Moura Ramos, Patrick Fernandes, Sweta Agrawal, André F. T. Martins
- Abstract summary: Large language models (LLMs) have demonstrated strong performance in sentence-level machine translation. We propose a method to improve LLM-based long-document translation through targeted fine-tuning on high-quality document-level data. Our approach supports multiple translation paradigms, including direct document-to-document and chunk-level translation.
- Score: 30.005159724115824
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have demonstrated strong performance in sentence-level machine translation, but scaling to document-level translation remains challenging, particularly in modeling long-range dependencies and discourse phenomena across sentences and paragraphs. In this work, we propose a method to improve LLM-based long-document translation through targeted fine-tuning on high-quality document-level data, which we curate and introduce as DocBlocks. Our approach supports multiple translation paradigms, including direct document-to-document and chunk-level translation, by integrating instructions both with and without surrounding context. This enables models to better capture cross-sentence dependencies while maintaining strong sentence-level translation performance. Experimental results show that incorporating multiple translation paradigms improves document-level translation quality and inference speed compared to prompting and agent-based methods.
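To make the chunk-level paradigm above concrete, here is a minimal Python sketch of context-conditioned prompting. The prompt template, the English-to-German setup, and the generic `translate_fn` hook are illustrative assumptions, not the paper's DocBlocks format:

```python
def chunk_prompt(chunk, prev_context=None, src_lang="English", tgt_lang="German"):
    """Hypothetical chunk-level translation prompt. With prev_context, the
    model is asked to translate consistently with the preceding document;
    without it, the instruction reduces to plain context-free translation."""
    if prev_context:
        return (
            f"Context (previous {src_lang} text and its {tgt_lang} translation):\n"
            f"{prev_context}\n\n"
            f"Translate the next {src_lang} chunk into {tgt_lang}, staying "
            f"consistent with the context above:\n{chunk}"
        )
    return f"Translate the following {src_lang} text into {tgt_lang}:\n{chunk}"


def translate_document(chunks, translate_fn):
    """Translate a document chunk by chunk, feeding each chunk's source and
    translation back in as context for the next chunk (a sliding window of
    size one, purely for illustration)."""
    context, outputs = None, []
    for chunk in chunks:
        translation = translate_fn(chunk_prompt(chunk, prev_context=context))
        outputs.append(translation)
        context = f"{chunk}\n{translation}"
    return outputs
```

Because each prompt carries the previous chunk's source and translation, cross-sentence phenomena such as pronouns and terminology can be handled without ever placing the full document in the context window.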
Related papers
- Two Intermediate Translations Are Better Than One: Fine-tuning LLMs for Document-level Translation Refinement [19.513243503109035]
Large language models (LLMs) can enhance translation quality through self-refinement. We build on this idea by extending the refinement from sentence-level to document-level translation. Since sentence-to-sentence (Sent2Sent) and Doc2Doc translation address different aspects of the translation process, we propose fine-tuning LLMs for translation refinement.
arXiv Detail & Related papers (2025-04-08T02:08:07Z) - Speech Translation Refinement using Large Language Models [8.602429274223693]
This paper investigates how large language models (LLMs) can improve the performance of speech translation by introducing a joint refinement process. Through the joint refinement of speech translation (ST) and automatic speech recognition (ASR) transcription via LLMs, the performance of the ST model is significantly improved. Experimental results on the MuST-C and CoVoST 2 datasets, which include seven translation tasks, demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2025-01-25T05:32:42Z) - Context-Aware or Context-Insensitive? Assessing LLMs' Performance in Document-Level Translation [10.174848090916669]
Large language models (LLMs) are increasingly strong contenders in machine translation.
We focus on document-level translation, where some words cannot be translated without context from outside the sentence.
arXiv Detail & Related papers (2024-10-18T11:52:10Z) - DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory [96.35468670508476]
We introduce DelTA, a Document-levEL Translation Agent for large language models (LLMs). DelTA features a multi-level memory structure that stores information across various granularities and spans. Experimental results indicate that DelTA significantly outperforms strong baselines in terms of translation consistency and quality.
arXiv Detail & Related papers (2024-10-10T17:30:09Z) - Enhancing Document-level Translation of Large Language Model via Translation Mixed-instructions [24.025242477280983]
Existing large language models (LLMs) for machine translation are typically fine-tuned on sentence-level translation instructions, which poses a challenge for document-level translation. The challenge arises from the issue of sentence-level coverage, where subsequent sentences in the document remain untranslated.
We propose an approach that combines sentence-level and document-level translation instructions of varying lengths to fine-tune LLMs (a data-mixing sketch appears after this list).
arXiv Detail & Related papers (2024-01-16T03:28:26Z) - Adapting Large Language Models for Document-Level Machine Translation [46.370862171452444]
Large language models (LLMs) have significantly advanced various natural language processing (NLP) tasks.
Recent research indicates that moderately-sized LLMs often outperform larger ones after task-specific fine-tuning.
This study focuses on adapting LLMs for document-level machine translation (DocMT) for specific language pairs.
arXiv Detail & Related papers (2024-01-12T09:29:13Z) - Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-Editing [12.843274390224853]
Large Language Models (LLMs) have demonstrated considerable success in various Natural Language Processing tasks.
We show that they have yet to attain state-of-the-art performance in Neural Machine Translation.
We propose adapting LLMs as Automatic Post-Editors (APE) rather than direct translators (a prompt sketch appears after this list).
arXiv Detail & Related papers (2023-10-23T12:22:15Z) - On Search Strategies for Document-Level Neural Machine Translation [51.359400776242786]
Document-level neural machine translation (NMT) models produce a more consistent output across a document.
In this work, we aim to answer the question of how best to utilize a context-aware translation model in decoding.
arXiv Detail & Related papers (2023-06-08T11:30:43Z) - HanoiT: Enhancing Context-aware Translation via Selective Context [95.93730812799798]
Context-aware neural machine translation aims to use the document-level context to improve translation quality.
Irrelevant or trivial words may introduce noise and distract the model from learning the relationship between the current sentence and the auxiliary context.
We propose a novel end-to-end encoder-decoder model with a layer-wise selection mechanism to sift and refine the long document context.
arXiv Detail & Related papers (2023-01-17T12:07:13Z) - Modeling Context With Linear Attention for Scalable Document-Level Translation [72.41955536834702]
We investigate the efficacy of a recent linear attention model on document translation and augment it with a sentential gate to promote a recency inductive bias.
We show that sentential gating further improves translation quality on IWSLT (a gating sketch appears after this list).
arXiv Detail & Related papers (2022-10-16T03:41:50Z) - Document-level Neural Machine Translation with Document Embeddings [82.4684444847092]
This work focuses on exploiting detailed document-level context in terms of multiple forms of document embeddings.
The proposed document-aware NMT is implemented to enhance the Transformer baseline by introducing both global and local document-level clues on the source end.
arXiv Detail & Related papers (2020-09-16T19:43:29Z) - Towards Making the Most of Context in Neural Machine Translation [112.9845226123306]
We argue that previous research did not make clear use of the global context.
We propose a new document-level NMT framework that deliberately models the local context of each sentence.
arXiv Detail & Related papers (2020-02-19T03:30:00Z)
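As a rough illustration of the mixed-instructions idea flagged above, the sketch below builds a fine-tuning mix of sentence-level and multi-sentence translation examples from aligned documents. The span-sampling scheme and the English-to-German prompt wording are assumptions, not the paper's recipe:

```python
import random


def build_mixed_instructions(doc_pairs, max_chunk_sents=8, seed=0):
    """Build a hypothetical fine-tuning mix of sentence-level and
    multi-sentence translation instructions from aligned documents.

    doc_pairs: list of documents, each a list of (src_sentence, tgt_sentence).
    Returns {"prompt", "response"} examples whose source spans range from a
    single sentence up to max_chunk_sents consecutive sentences.
    """
    rng = random.Random(seed)
    examples = []
    for doc in doc_pairs:
        i = 0
        while i < len(doc):
            # Sample a span length: 1 yields a sentence-level instruction,
            # larger values yield document/chunk-level instructions.
            n = rng.randint(1, max_chunk_sents)
            src = " ".join(s for s, _ in doc[i : i + n])
            tgt = " ".join(t for _, t in doc[i : i + n])
            examples.append({
                "prompt": f"Translate the following English text to German:\n{src}",
                "response": tgt,
            })
            i += n
    rng.shuffle(examples)
    return examples
```

Spans of length 1 preserve sentence-level ability, while longer spans train the model to keep translating past the first sentence, targeting the coverage issue described in that abstract.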
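A minimal sketch of the post-editing setup flagged above, assuming a hypothetical prompt template and an English-to-German pair rather than the paper's actual format:

```python
def ape_prompt(source, draft_translation, context=""):
    """Hypothetical automatic post-editing (APE) prompt: instead of
    translating directly, the LLM is asked to revise a draft produced by
    an MT system, optionally conditioned on document context."""
    ctx = f"Document context:\n{context}\n\n" if context else ""
    return (
        f"{ctx}Source (English):\n{source}\n\n"
        f"Draft German translation:\n{draft_translation}\n\n"
        "Improve the draft translation, fixing errors and making it consistent "
        "with the document context. Return only the revised translation."
    )
```

The draft can come from any MT system; the LLM only has to edit it, which is typically an easier task than translating from scratch.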
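The sentential gate flagged above admits a compact sketch. Treat the recurrence below as one plausible instantiation of gated linear attention with a recency bias, not that paper's exact equations:

```latex
% Linear attention as a token-level recurrence (assumed formulation):
% S_t accumulates key-value outer products, z_t the key mass.
S_t = g_t\, S_{t-1} + \phi(k_t)\, v_t^{\top}, \qquad
z_t = g_t\, z_{t-1} + \phi(k_t), \qquad
o_t = \frac{S_t^{\top} \phi(q_t)}{z_t^{\top} \phi(q_t)},
\qquad
g_t =
\begin{cases}
\lambda \in [0,1) & \text{if token } t \text{ opens a new sentence,} \\
1 & \text{otherwise.}
\end{cases}
```

With $\lambda < 1$, contributions from earlier sentences decay geometrically with the number of sentence boundaries crossed, which is exactly a recency inductive bias.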
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.