Enhancing Document-level Translation of Large Language Model via Translation Mixed-instructions
- URL: http://arxiv.org/abs/2401.08088v1
- Date: Tue, 16 Jan 2024 03:28:26 GMT
- Title: Enhancing Document-level Translation of Large Language Model via Translation Mixed-instructions
- Authors: Yachao Li, Junhui Li, Jing Jiang and Min Zhang
- Abstract summary: Existing large language models (LLMs) for machine translation are typically fine-tuned on sentence-level translation instructions.
When applied to document-level translation, however, these models face a significant challenge arising from the issue of sentence-level coverage, where subsequent sentences in the document remain untranslated.
We propose an approach that combines sentence-level and document-level translation instructions of varying lengths to fine-tune LLMs.
- Score: 24.025242477280983
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing large language models (LLMs) for machine translation are typically
fine-tuned on sentence-level translation instructions and achieve satisfactory
performance at the sentence level. However, when applied to document-level
translation, these models face a significant challenge, particularly when
dealing with documents containing over 512 tokens. This challenge arises from
the issue of sentence-level coverage, where subsequent sentences in the
document remain untranslated. As a result, the document-level translation
capability of LLMs fine-tuned on sentence-level translation instructions is
significantly limited. We conjecture that the primary cause of LLMs' weak
document-level translation performance is the absence of document-to-document
mapping ability. To address the issue, we propose an approach that combines
sentence-level and document-level translation instructions of varying lengths
to fine-tune LLMs. Our proposed translation mixed-instructions enable LLMs
(Llama-2 7B and 13B) to maintain consistent translation performance from the
sentence level to documents containing as many as 2048 tokens. Extensive
experimental results show that the proposed approach significantly enhances the
document-level translation capabilities of LLMs on 10 language pairs,
effectively mitigating the sentence-level coverage issue in document-level
translation. Experiments on discourse phenomena demonstrate that our
document-level translation approach significantly improves translation quality,
in terms of both BLEU score and discourse coherence.
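The abstract describes the approach only at a high level. As a rough sketch of how such translation mixed-instructions might be assembled, the following minimal Python example mixes sentence-level and document-level examples of varying lengths; the instruction template, field names, the fixed English target, and the word-count proxy for the 2048-token budget are our assumptions, not the paper's exact recipe.

```python
import random

def build_mixed_instructions(parallel_docs, max_len=2048, seed=0):
    """Mix sentence-level and document-level translation instructions of
    varying lengths into one fine-tuning set. parallel_docs is a list of
    documents, each a list of (source_sentence, target_sentence) pairs.
    The prompt wording and field names are illustrative assumptions."""
    rng = random.Random(seed)
    examples = []
    for doc in parallel_docs:
        # Sentence-level instructions: one example per aligned sentence pair.
        for src, tgt in doc:
            examples.append({
                "instruction": "Translate the following text into English.",
                "input": src,
                "output": tgt,
            })
        # Document-level instructions: contiguous spans of random length,
        # kept under a crude word-count proxy for the token budget.
        i = 0
        while i < len(doc):
            span = rng.randint(1, len(doc) - i)
            src_span = " ".join(s for s, _ in doc[i:i + span])
            tgt_span = " ".join(t for _, t in doc[i:i + span])
            if len(src_span.split()) + len(tgt_span.split()) <= max_len:
                examples.append({
                    "instruction": "Translate the following document into English.",
                    "input": src_span,
                    "output": tgt_span,
                })
            i += span
    rng.shuffle(examples)
    return examples
```

The point of mixing both regimes, per the abstract, is that the model sees inputs ranging from a single sentence up to roughly 2048 tokens during fine-tuning, so at inference it keeps translating to the end of a long document instead of stopping after the first sentences (the sentence-level coverage issue).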
Related papers
- Instruction-Tuned LLMs Succeed in Document-Level MT Without Fine-Tuning -- But BLEU Turns a Blind Eye [15.987448306012167]
Large language models (LLMs) have excelled in various NLP tasks, including machine translation (MT)
This work investigates the inherent capability of instruction-tuned LLMs for document-level translation (docMT)
arXiv Detail & Related papers (2024-10-28T11:49:58Z)
- Analyzing Context Utilization of LLMs in Document-Level Translation [10.174848090916669]
Large language models (LLMs) are increasingly strong contenders in machine translation.
We study document-level translation, where some words cannot be translated without context from outside the sentence.
We find that LLMs' improved document-translation performance is not always reflected in pronoun translation performance.
arXiv Detail & Related papers (2024-10-18T11:52:10Z)
- Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning [57.323716555996114]
Off-target translation remains an unsolved problem, especially for low-resource languages.
Recent works have either designed advanced prompting strategies to highlight the functionality of translation instructions or exploited the in-context learning ability of LLMs.
In this work, we design a two-stage fine-tuning algorithm to improve the instruction-following ability (especially the translation direction) of LLMs.
arXiv Detail & Related papers (2024-03-21T13:47:40Z)
- Adapting Large Language Models for Document-Level Machine Translation [46.370862171452444]
Large language models (LLMs) have significantly advanced various natural language processing (NLP) tasks.
Recent research indicates that moderately-sized LLMs often outperform larger ones after task-specific fine-tuning.
This study focuses on adapting LLMs for document-level machine translation (DocMT) for specific language pairs.
arXiv Detail & Related papers (2024-01-12T09:29:13Z)
- Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-Editing [12.843274390224853]
Large Language Models (LLMs) have demonstrated considerable success in various Natural Language Processing tasks.
We show that they have yet to attain state-of-the-art performance in Neural Machine Translation.
We propose adapting LLMs as Automatic Post-Editors (APE) rather than direct translators.
arXiv Detail & Related papers (2023-10-23T12:22:15Z)
- Large language models effectively leverage document-level context for literary translation, but critical errors persist [32.54546652197316]
Large language models (LLMs) are competitive with the state of the art on a wide range of sentence-level translation datasets.
We show through a rigorous human evaluation that asking the GPT-3.5 (text-davinci-003) LLM to translate an entire literary paragraph results in higher-quality translations.
arXiv Detail & Related papers (2023-04-06T17:27:45Z)
- Dictionary-based Phrase-level Prompting of Large Language Models for Machine Translation [91.57514888410205]
Large language models (LLMs) demonstrate remarkable machine translation (MT) abilities via prompting.
LLMs can struggle to translate inputs with rare words, which are common in low resource or domain transfer scenarios.
We show that LLM prompting can provide an effective solution for rare words as well, by using prior knowledge from bilingual dictionaries to provide control hints in the prompts (see the toy sketch after this list).
arXiv Detail & Related papers (2023-02-15T18:46:42Z)
- Modeling Context With Linear Attention for Scalable Document-Level Translation [72.41955536834702]
We investigate the efficacy of a recent linear attention model on document translation and augment it with a sentential gate to promote a recency inductive bias.
We show that sentential gating further improves translation quality on IWSLT.
arXiv Detail & Related papers (2022-10-16T03:41:50Z)
- Leveraging Discourse Rewards for Document-Level Neural Machine Translation [46.006636555165414]
We propose a training approach that explicitly optimizes two established discourse metrics, lexical cohesion (LC) and coherence (COH).
Our training approach achieves more cohesive and coherent document translations than other competitive approaches.
arXiv Detail & Related papers (2020-10-08T02:26:22Z)
- Document-level Neural Machine Translation with Document Embeddings [82.4684444847092]
This work focuses on exploiting detailed document-level context in terms of multiple forms of document embeddings.
The proposed document-aware NMT is implemented to enhance the Transformer baseline by introducing both global and local document-level clues on the source end.
arXiv Detail & Related papers (2020-09-16T19:43:29Z)
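For the dictionary-based phrase-level prompting paper listed above, the following is a minimal, illustrative sketch of the general idea of supplying bilingual-dictionary control hints in the prompt; the prompt wording, function name, and toy dictionary entry are our assumptions, not that paper's exact template.

```python
# Toy sketch: attach bilingual-dictionary entries for rare source words
# as control hints before asking an LLM to translate. Illustrative only.
def build_hinted_prompt(source: str, bilingual_dict: dict[str, str]) -> str:
    hints = [
        f'"{word}" translates to "{translation}".'
        for word, translation in bilingual_dict.items()
        if word in source
    ]
    hint_block = " ".join(hints)
    return (
        f"{hint_block}\n"
        "Translate the following sentence into English, using the hints above:\n"
        f"{source}"
    )

# Example with a single rare term supplied by a toy dictionary.
print(build_hinted_prompt("Der Föhn wehte über die Alpen.", {"Föhn": "foehn wind"}))
```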