Related papers: From Utterance to Vividity: Training Expressive Subtitle Translation LLM via Adaptive Local Preference Optimization

From Utterance to Vividity: Training Expressive Subtitle Translation LLM via Adaptive Local Preference Optimization

URL: http://arxiv.org/abs/2602.01068v1
Date: Sun, 01 Feb 2026 07:24:06 GMT
Title: From Utterance to Vividity: Training Expressive Subtitle Translation LLM via Adaptive Local Preference Optimization
Authors: Chaoqun Cui, Shijing Wang, Liangbin Huang, Qingqing Gu, Zhaolong Huang, Xiao Zeng, Wenji Mao,
Abstract summary: We focus on how to construct translation LLMs that meet the needs of domain customization.<n>We take visual media subtitle translation as our topic and explore how to train expressive and vivid translation LLMs.
Score: 12.547838537411215
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The rapid development of Large Language Models (LLMs) has significantly enhanced the general capabilities of machine translation. However, as application scenarios become more complex, the limitations of LLMs in vertical domain translations are gradually becoming apparent. In this study, we focus on how to construct translation LLMs that meet the needs of domain customization. We take visual media subtitle translation as our topic and explore how to train expressive and vivid translation LLMs. We investigated the situations of subtitle translation and other domains of literal and liberal translation, verifying the reliability of LLM as reward model and evaluator for translation. Additionally, to train an expressive translation LLM, we constructed and released a multidirectional subtitle parallel corpus dataset and proposed the Adaptive Local Preference Optimization (ALPO) method to address fine-grained preference alignment. Experimental results demonstrate that ALPO achieves outstanding performance in multidimensional evaluation of translation quality.

Related papers

PEGRL: Improving Machine Translation by Post-Editing Guided Reinforcement Learning [54.19784655270799]
We introduce textbfPEGRL, a textittwo-stage RL framework that uses post-editing as an auxiliary task to stabilize training and guide overall optimization.<n>Experiments on English$to$Finnish, English$to$Turkish, and English$leftrightarrow$Chinese show consistent gains over RL baselines.
arXiv Detail & Related papers (2026-02-03T10:22:55Z)
Lost in Literalism: How Supervised Training Shapes Translationese in LLMs [51.04435855143767]
Large language models (LLMs) have achieved remarkable success in machine translation.<n>However, translationese, characterized by overly literal and unnatural translations, remains a persistent challenge.<n>We introduce methods to mitigate these biases, including polishing golden references and filtering unnatural training instances.
arXiv Detail & Related papers (2025-03-06T12:14:45Z)
Adaptive Few-shot Prompting for Machine Translation with Pre-trained Language Models [25.88443566366613]
Large language models (LLMs) with in-context learning have demonstrated remarkable potential in handling neural machine translation.<n>Existing evidence shows that LLMs are prompt-sensitive and it is sub-optimal to apply the fixed prompt to any input for downstream machine translation tasks.<n>We propose an adaptive few-shot prompting framework to automatically select suitable translation demonstrations for various source input sentences.
arXiv Detail & Related papers (2025-01-03T07:47:59Z)
Context-Aware or Context-Insensitive? Assessing LLMs' Performance in Document-Level Translation [10.174848090916669]
Large language models (LLMs) are increasingly strong contenders in machine translation.<n>We focus on document-level translation, where some words cannot be translated without context from outside the sentence.
arXiv Detail & Related papers (2024-10-18T11:52:10Z)
LLM-based Translation Inference with Iterative Bilingual Understanding [52.46978502902928]
We propose a novel Iterative Bilingual Understanding Translation method based on the cross-lingual capabilities of large language models (LLMs)<n>The cross-lingual capability of LLMs enables the generation of contextual understanding for both the source and target languages separately.<n>The proposed IBUT outperforms several strong comparison methods.
arXiv Detail & Related papers (2024-10-16T13:21:46Z)
A Preference-driven Paradigm for Enhanced Translation with Large Language Models [33.51585908894444]
Large language models (LLMs) can achieve remarkable translation performance using only a small amount of parallel data. SFT simply instructs the model to imitate the reference translations at the token level, making it vulnerable to the noise present in the references. We propose a preference-based approach built upon the Plackett-Luce model to overcome this plateau.
arXiv Detail & Related papers (2024-04-17T11:52:47Z)
Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning [57.323716555996114]
Off-target translation remains an unsolved problem, especially for low-resource languages. Recent works have either designed advanced prompting strategies to highlight the functionality of translation instructions or exploited the in-context learning ability of LLMs. In this work, we design a two-stage fine-tuning algorithm to improve the instruction-following ability (especially the translation direction) of LLMs.
arXiv Detail & Related papers (2024-03-21T13:47:40Z)
Adapting Large Language Models for Document-Level Machine Translation [46.370862171452444]
Large language models (LLMs) have significantly advanced various natural language processing (NLP) tasks. Recent research indicates that moderately-sized LLMs often outperform larger ones after task-specific fine-tuning. This study focuses on adapting LLMs for document-level machine translation (DocMT) for specific language pairs.
arXiv Detail & Related papers (2024-01-12T09:29:13Z)
Speech Translation with Large Language Models: An Industrial Practice [64.5419534101104]
We introduce LLM-ST, a novel and effective speech translation model constructed upon a pre-trained large language model (LLM) By integrating the large language model (LLM) with a speech encoder and employing multi-task instruction tuning, LLM-ST can produce accurate timestamped transcriptions and translations. Through rigorous experimentation on English and Chinese datasets, we showcase the exceptional performance of LLM-ST.
arXiv Detail & Related papers (2023-12-21T05:32:49Z)
Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-Editing [12.843274390224853]
Large Language Models (LLM's) have demonstrated considerable success in various Natural Language Processing tasks. We show that they have yet to attain state-of-the-art performance in Neural Machine Translation. We propose adapting LLM's as Automatic Post-Editors (APE) rather than direct translators.
arXiv Detail & Related papers (2023-10-23T12:22:15Z)
VoLTA: Vision-Language Transformer with Weakly-Supervised Local-Feature Alignment [52.489874804051304]
VoLTA is a new vision-language pre-training paradigm that only utilizes image-caption data but fine-grained region-level image understanding. VoLTA pushes multi-modal fusion deep into the uni-modal backbones during pre-training. Experiments on a wide range of vision- and vision-language downstream tasks demonstrate the effectiveness of VoLTA.
arXiv Detail & Related papers (2022-10-09T01:49:58Z)

This list is automatically generated from the titles and abstracts of the papers in this site.