DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory
- URL: http://arxiv.org/abs/2410.08143v1
- Date: Thu, 10 Oct 2024 17:30:09 GMT
- Title: DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory
- Authors: Yutong Wang, Jiali Zeng, Xuebo Liu, Derek F. Wong, Fandong Meng, Jie Zhou, Min Zhang
- Abstract summary: We introduce DelTA, a Document-levEL Translation Agent for large language models (LLMs).
DelTA features a multi-level memory structure that stores information across various granularities and spans.
Experimental results indicate that DelTA significantly outperforms strong baselines in terms of translation consistency and quality.
- Score: 96.35468670508476
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have achieved reasonable quality improvements in machine translation (MT). However, most current research on MT-LLMs still faces significant challenges in maintaining translation consistency and accuracy when processing entire documents. In this paper, we introduce DelTA, a Document-levEL Translation Agent designed to overcome these limitations. DelTA features a multi-level memory structure that stores information across various granularities and spans, including Proper Noun Records, Bilingual Summary, Long-Term Memory, and Short-Term Memory, which are continuously retrieved and updated by auxiliary LLM-based components. Experimental results indicate that DelTA significantly outperforms strong baselines in terms of translation consistency and quality across four open/closed-source LLMs and two representative document translation datasets, achieving an increase in consistency scores by up to 4.58 percentage points and in COMET scores by up to 3.16 points on average. DelTA employs a sentence-by-sentence translation strategy, ensuring no sentence omissions and offering a memory-efficient solution compared to the mainstream method. Furthermore, DelTA improves pronoun translation accuracy, and the summary component of the agent also shows promise as a tool for query-based summarization tasks. We release our code and data at https://github.com/YutongWang1216/DocMTAgent.
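The abstract describes a sentence-by-sentence loop over a four-level memory. Below is a minimal sketch of how such a multi-level memory might be organized; all class, field, and prompt names are illustrative assumptions, not the released implementation (the repository linked above is the authoritative code).
```python
# Illustrative sketch of a DelTA-style multi-level memory and loop. Names are
# assumptions; see https://github.com/YutongWang1216/DocMTAgent for the real code.
from dataclasses import dataclass, field


@dataclass
class MultiLevelMemory:
    proper_nouns: dict = field(default_factory=dict)   # source term -> fixed translation
    bilingual_summary: str = ""                        # running summary of the document
    long_term: list = field(default_factory=list)      # older sentence pairs
    short_term: list = field(default_factory=list)     # most recent sentence pairs

    def context_for(self, src: str, k: int = 3) -> str:
        """Assemble prompt context for the next source sentence."""
        glossary = "; ".join(f"{s} -> {t}" for s, t in self.proper_nouns.items()
                             if s in src)
        recent = " ".join(self.short_term[-k:])
        return f"Glossary: {glossary}\nSummary: {self.bilingual_summary}\nRecent: {recent}"

    def update(self, src: str, tgt: str, window: int = 10) -> None:
        """Append the newest pair; spill old short-term entries into long-term."""
        self.short_term.append(f"{src} ||| {tgt}")
        while len(self.short_term) > window:
            self.long_term.append(self.short_term.pop(0))


def translate_document(sentences, llm, memory=None):
    """Translate sentence by sentence; `llm` is any prompt -> text callable."""
    memory = memory or MultiLevelMemory()
    out = []
    for src in sentences:
        tgt = llm(memory.context_for(src) + f"\nTranslate: {src}")
        memory.update(src, tgt)
        out.append(tgt)
    return out
```
Translating one sentence per step, as in the paper, avoids sentence omissions and keeps the prompt size bounded by the memory window rather than the document length.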
Related papers
- Multilingual Contextualization of Large Language Models for Document-Level Machine Translation [30.005159724115824]
Large language models (LLMs) have demonstrated strong performance in sentence-level machine translation.
We propose a method to improve LLM-based long-document translation through targeted fine-tuning on high-quality document-level data.
Our approach supports multiple translation paradigms, including direct document-to-document and chunk-level translation.
arXiv Detail & Related papers (2025-04-16T14:52:22Z)
- Two Intermediate Translations Are Better Than One: Fine-tuning LLMs for Document-level Translation Refinement [19.513243503109035]
Large language models (LLMs) can enhance translation quality through self-refinement.
We build on this idea by extending the refinement from sentence-level to document-level translation.
Since sentence-to-sentence (Sent2Sent) and document-to-document (Doc2Doc) translation address different aspects of the translation process, we propose fine-tuning LLMs for translation refinement using both intermediate translations.
arXiv Detail & Related papers (2025-04-08T02:08:07Z)
- Improving LLM-based Document-level Machine Translation with Multi-Knowledge Fusion [21.533772761328656]
We propose an enhanced approach by incorporating multiple sources of knowledge, including both document summarization and entity translation.
Our approach achieves average improvements of 0.8, 0.6, and 0.4 COMET points over the baseline without extra knowledge.
arXiv Detail & Related papers (2025-03-15T14:18:45Z)
- Source-primed Multi-turn Conversation Helps Large Language Models Translate Documents [47.34053408385208]
We study a simple method for handling document-level machine translation, by leveraging previous contexts in a multi-turn conversational manner.
This method ensures coherent translations without additional training, and can fully re-use the KV cache of previous turns.
We empirically show this multi-turn method outperforms both translating entire documents in a single turn and translating each segment independently.
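A minimal sketch of this source-primed, multi-turn setup follows. The message schema is the widely used OpenAI-style chat format, and the prompt wording is an assumption; only the overall protocol (prime with the full source, then translate segment by segment) comes from the summary above.
```python
# Sketch of source-primed multi-turn document translation. `chat` is any
# chat-LLM client: chat(messages) -> reply string. Prompt text is assumed.
def translate_multiturn(segments, chat):
    messages = [
        {"role": "system", "content": "You are a professional translator."},
        # Source-priming turn: the whole document goes in first, so every later
        # turn can attend to full context and reuse the KV cache of prior turns.
        {"role": "user", "content": "Full source document:\n" + "\n".join(segments)},
        {"role": "assistant", "content": "Received. Send segments one at a time."},
    ]
    translations = []
    for seg in segments:
        messages.append({"role": "user", "content": "Translate this segment:\n" + seg})
        reply = chat(messages)
        messages.append({"role": "assistant", "content": reply})
        translations.append(reply)
    return translations
```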
arXiv Detail & Related papers (2025-03-13T15:57:50Z)
- Retrieval-Augmented Machine Translation with Unstructured Knowledge [74.84236945680503]
Retrieval-augmented generation (RAG) introduces additional information to enhance large language models (LLMs).
In machine translation (MT), previous work typically retrieves in-context examples from paired MT corpora, or domain-specific knowledge from knowledge graphs.
In this paper, we study retrieval-augmented MT using unstructured documents.
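A minimal sketch of the retrieval-augmented MT pattern described above: fetch passages relevant to the source sentence from unstructured documents and prepend them to the translation prompt. Token-overlap scoring stands in for a real retriever (BM25 or dense embeddings), and the prompt wording is an assumption.
```python
# Toy retriever: rank passages by word overlap with the query, keep the top k.
def retrieve(passages, query, k=2):
    q = set(query.lower().split())
    ranked = sorted(passages, key=lambda p: len(q & set(p.lower().split())),
                    reverse=True)
    return ranked[:k]


def rag_translate(src, passages, llm):
    """`llm` is any prompt -> text callable."""
    refs = "\n".join(retrieve(passages, src))
    return llm(f"Reference material:\n{refs}\n\n"
               f"Using the references where relevant, translate:\n{src}")
```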
arXiv Detail & Related papers (2024-12-05T17:00:32Z)
- LLM-based Translation Inference with Iterative Bilingual Understanding [52.46978502902928]
We propose a novel Iterative Bilingual Understanding Translation (IBUT) method based on the cross-lingual capabilities of large language models (LLMs).
The cross-lingual capability of LLMs enables the generation of contextual understanding for both the source and target languages separately.
The proposed IBUT outperforms several strong baseline methods.
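A loose sketch of such an iterative bilingual-understanding loop is below. The concrete prompts, number of rounds, and stopping rule are assumptions; IBUT's actual procedure may differ.
```python
# Assumed shape of an iterative bilingual-understanding loop: build separate
# source- and target-language understandings, translate, then refine.
def ibut_translate(src, llm, rounds=2):
    """`llm` is any prompt -> text callable."""
    src_notes = llm(f"Explain the meaning and context of this sentence:\n{src}")
    tgt_notes = llm(f"Explain, in the target language, what this sentence conveys:\n{src}")
    draft = llm(f"Source: {src}\nSource-side notes: {src_notes}\n"
                f"Target-side notes: {tgt_notes}\nTranslate the source sentence.")
    for _ in range(rounds - 1):
        # Cross-check the draft against both understandings and refine it.
        draft = llm(f"Source: {src}\nNotes:\n{src_notes}\n{tgt_notes}\n"
                    f"Current translation: {draft}\n"
                    f"Revise the translation wherever it conflicts with the notes.")
    return draft
```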
arXiv Detail & Related papers (2024-10-16T13:21:46Z)
- Efficiently Exploring Large Language Models for Document-Level Machine Translation with In-context Learning [38.89119606657543]
In contrast to sentence-level translation, document-level translation (DOCMT) with large language models (LLMs) based on in-context learning faces two major challenges.
We propose a Context-Aware Prompting method (CAP) to generate more accurate, cohesive, and coherent translations via in-context learning.
We conduct extensive experiments across various DOCMT tasks, and the results demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-06-11T09:11:17Z)
- LexMatcher: Dictionary-centric Data Collection for LLM-based Machine Translation [67.24113079928668]
We present LexMatcher, a method for data curation driven by the coverage of senses found in bilingual dictionaries.
Our approach outperforms the established baselines on the WMT2022 test sets.
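A sketch of dictionary-coverage data selection in the spirit of LexMatcher: greedily keep sentence pairs that cover not-yet-covered (word, translation) dictionary senses. The exact selection criterion is an assumption made for illustration.
```python
# Greedy sense-coverage selection over a parallel corpus.
def select_by_sense_coverage(pairs, dictionary):
    """pairs: [(src, tgt)]; dictionary: {source_word: [sense_translation, ...]}."""
    uncovered = {(w, t) for w, senses in dictionary.items() for t in senses}
    selected = []
    for src, tgt in pairs:
        src_tokens, tgt_tokens = set(src.split()), set(tgt.split())
        hits = {(w, t) for (w, t) in uncovered
                if w in src_tokens and t in tgt_tokens}
        if hits:                       # the pair covers at least one new sense
            selected.append((src, tgt))
            uncovered -= hits
    return selected
```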
arXiv Detail & Related papers (2024-06-03T15:30:36Z)
- CT-Eval: Benchmarking Chinese Text-to-Table Performance in Large Language Models [36.82189550072201]
Existing text-to-table datasets are typically oriented toward English.
Large language models (LLMs) have shown great success as general task solvers in multi-lingual settings.
We propose a Chinese text-to-table dataset, CT-Eval, to benchmark LLMs on this task.
arXiv Detail & Related papers (2024-05-20T16:58:02Z)
- (Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts [52.18246881218829]
We introduce a novel multi-agent framework based on large language models (LLMs) for literary translation, implemented as a company called TransAgents.
To evaluate the effectiveness of our system, we propose two innovative evaluation strategies: Monolingual Human Preference (MHP) and Bilingual LLM Preference (BLP).
arXiv Detail & Related papers (2024-05-20T05:55:08Z)
- Adapting Large Language Models for Document-Level Machine Translation [46.370862171452444]
Large language models (LLMs) have significantly advanced various natural language processing (NLP) tasks.
Recent research indicates that moderately-sized LLMs often outperform larger ones after task-specific fine-tuning.
This study focuses on adapting LLMs for document-level machine translation (DocMT) for specific language pairs.
arXiv Detail & Related papers (2024-01-12T09:29:13Z)
- Soft Prompt Decoding for Multilingual Dense Retrieval [30.766917713997355]
We show that applying state-of-the-art approaches developed for cross-lingual information retrieval to MLIR tasks leads to sub-optimal performance.
This is due to the heterogeneous and imbalanced nature of multilingual collections.
We present KD-SPD, a novel soft prompt decoding approach for MLIR that implicitly "translates" the representation of documents in different languages into the same embedding space.
arXiv Detail & Related papers (2023-05-15T21:17:17Z)
- Understanding Translationese in Cross-Lingual Summarization [106.69566000567598]
Cross-lingual summarization (CLS) aims to generate a concise summary in a different target language.
To obtain large-scale CLS data, existing datasets typically involve translation in their creation.
In this paper, we first confirm that different approaches of constructing CLS datasets will lead to different degrees of translationese.
arXiv Detail & Related papers (2022-12-14T13:41:49Z)
- On Cross-Lingual Retrieval with Multilingual Text Encoders [51.60862829942932]
We study the suitability of state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks.
We benchmark their performance in unsupervised ad-hoc sentence- and document-level CLIR experiments.
We evaluate multilingual encoders fine-tuned in a supervised fashion (i.e., we learn to rank) on English relevance data in a series of zero-shot language and domain transfer CLIR experiments.
arXiv Detail & Related papers (2021-12-21T08:10:27Z)
- mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs [51.67970832510462]
We improve the multilingual text-to-text transfer Transformer with translation pairs, yielding mT6.
We explore three cross-lingual text-to-text pre-training tasks, namely, machine translation, translation pair span corruption, and translation span corruption.
Experimental results show that the proposed mT6 improves cross-lingual transferability over mT5.
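Of the three pre-training tasks named above, translation pair span corruption is the most mechanical: concatenate a translation pair and mask random spans with T5-style sentinels. A sketch is below; the span count, span length, and separator token are illustrative, not mT6's actual settings.
```python
# Sketch of translation pair span corruption on a tokenized sentence pair.
import random


def corrupt_pair(src_tokens, tgt_tokens, n_spans=2, span_len=3, seed=0):
    rng = random.Random(seed)
    tokens = src_tokens + ["</s>"] + tgt_tokens   # concatenated pair
    starts = sorted(rng.sample(range(len(tokens) - span_len), n_spans))
    inputs, targets, cursor, sid = [], [], 0, 0
    for start in starts:
        if start < cursor:             # drop spans overlapping a previous one
            continue
        inputs += tokens[cursor:start] + [f"<extra_id_{sid}>"]
        targets += [f"<extra_id_{sid}>"] + tokens[start:start + span_len]
        cursor, sid = start + span_len, sid + 1
    inputs += tokens[cursor:]
    return inputs, targets               # encoder input, decoder target
```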
arXiv Detail & Related papers (2021-04-18T03:24:07Z)