DeepTrans: Deep Reasoning Translation via Reinforcement Learning
- URL: http://arxiv.org/abs/2504.10187v2
- Date: Fri, 29 Aug 2025 09:58:42 GMT
- Title: DeepTrans: Deep Reasoning Translation via Reinforcement Learning
- Authors: Jiaan Wang, Fandong Meng, Jie Zhou
- Abstract summary: We introduce DeepTrans, a deep reasoning translation model that learns free translation via reinforcement learning (RL). Using Qwen2.5-7B as the backbone, DeepTrans improves performance by 16.3% in literature translation. We summarize the failures and interesting findings during our RL exploration.
- Score: 65.96268429761842
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, deep reasoning LLMs (e.g., OpenAI o1 and DeepSeek-R1) have shown promising performance in various downstream tasks. Free translation is an important and interesting task in the multilingual world, which requires going beyond word-for-word translation. However, the task is still under-explored in deep reasoning LLMs. In this paper, we introduce DeepTrans, a deep reasoning translation model that learns free translation via reinforcement learning (RL). Specifically, we carefully build a reward model with pre-defined scoring criteria on both the translation results and the thought processes. The reward model teaches DeepTrans how to think and free-translate the given sentences during RL. Besides, our RL training does not need any labeled translations, avoiding the human-intensive annotation or resource-intensive data synthesis. Experimental results show the effectiveness of DeepTrans. Using Qwen2.5-7B as the backbone, DeepTrans improves performance by 16.3% in literature translation, and outperforms strong deep reasoning LLMs. Moreover, we summarize the failures and interesting findings during our RL exploration. We hope this work could inspire other researchers in free translation.
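The abstract describes a reward model that scores both the thought process and the final translation against pre-defined criteria, with no reference translations required. A minimal sketch of that reward shape is below; the two rubric functions and the weighting are hypothetical stand-ins, not the paper's actual scoring criteria.

```python
# Sketch of a DeepTrans-style reward: two rubric scores (hypothetical
# implementations) combined into one scalar used for RL updates.
# No labeled reference translation is needed at any point.

def score_thought(thought: str) -> float:
    """Toy rubric: reward a non-trivial reasoning trace about the source."""
    return 1.0 if len(thought.split()) >= 5 else 0.0

def score_translation(translation: str) -> float:
    """Toy rubric: reward a fluent, non-empty free translation."""
    return 1.0 if translation and not translation.isspace() else 0.0

def reward(thought: str, translation: str, w_thought: float = 0.5) -> float:
    # Combined reward fed to the policy optimizer (e.g., a PPO-style
    # update); the 50/50 weighting here is an assumption.
    return w_thought * score_thought(thought) + (1 - w_thought) * score_translation(translation)

print(reward("The metaphor here should be rendered idiomatically.",
             "Elle avait le coeur gros."))
```

In the paper the rubrics are implemented by an LLM judge; the point of the sketch is only that the reward depends on both the thought and the output, so degenerate reasoning is penalized even when the translation looks acceptable.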
Related papers
- Please Translate Again: Two Simple Experiments on Whether Human-Like Reasoning Helps Translation [18.00698389204074]
We show no clear evidence that performance gains stem from explicitly decomposing the translation process via Chain-of-Thought reasoning. While the decomposition influences translation behaviour, faithfulness to the decomposition has both positive and negative effects on translation.
arXiv Detail & Related papers (2025-06-05T00:04:39Z) - Compensating for Data with Reasoning: Low-Resource Machine Translation with LLMs [0.0]
Fragment-Shot Prompting is a novel in-context learning method that segments input and retrieves translation examples based on syntactic coverage. Pivoted Fragment-Shot is an extension that enables translation without direct parallel data. We evaluate these methods using GPT-3.5, GPT-4o, o1-mini, LLaMA-3.3, and DeepSeek-R1 for translation between Italian and two Ladin variants.
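The retrieval step above can be sketched as follows, with a large simplifying assumption: the paper's syntactic-coverage scoring is approximated here by plain word overlap, and the example pool is a toy in-memory list rather than a real parallel corpus.

```python
# Sketch of Fragment-Shot-style prompting: segment the input, retrieve
# one example per fragment, and assemble a few-shot translation prompt.

POOL = [
    ("la montagna è alta", "the mountain is high"),
    ("il lago è profondo", "the lake is deep"),
]

def fragments(sentence: str, size: int = 2):
    words = sentence.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(fragment: str, pool):
    # Pick the example whose source side shares the most words with the
    # fragment (a stand-in for the paper's syntactic-coverage scoring).
    frag_words = set(fragment.split())
    return max(pool, key=lambda pair: len(frag_words & set(pair[0].split())))

def build_prompt(sentence: str) -> str:
    lines = []
    for frag in fragments(sentence):
        src, tgt = retrieve(frag, POOL)
        lines.append(f"{src} => {tgt}")
    lines.append(f"{sentence} =>")
    return "\n".join(lines)

print(build_prompt("la montagna è profonda"))
```

The pivoted variant would apply the same retrieval through a bridge language when no direct Italian-Ladin examples exist; that step is omitted here.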
arXiv Detail & Related papers (2025-05-28T12:29:05Z) - TAT-R1: Terminology-Aware Translation with Reinforcement Learning and Word Alignment [18.162673576513836]
We propose TAT-R1, a terminology-aware translation model trained with reinforcement learning and word alignment. Our model significantly improves terminology translation accuracy compared to the baseline models.
arXiv Detail & Related papers (2025-05-27T13:26:02Z) - ExTrans: Multilingual Deep Reasoning Translation via Exemplar-Enhanced Reinforcement Learning [77.41383117199227]
We design a new reward modeling method that compares the translation results of the policy MT model with a strong LRM. Using Qwen2.5-7B-Instruct as the backbone, the trained model achieves the new state-of-the-art performance in literary translation. We extend our method to the multilingual settings with 11 languages.
arXiv Detail & Related papers (2025-05-19T11:34:47Z) - Lost in Literalism: How Supervised Training Shapes Translationese in LLMs [51.04435855143767]
Large language models (LLMs) have achieved remarkable success in machine translation. However, translationese, characterized by overly literal and unnatural translations, remains a persistent challenge. We introduce methods to mitigate these biases, including polishing golden references and filtering unnatural training instances.
arXiv Detail & Related papers (2025-03-06T12:14:45Z) - DRT: Deep Reasoning Translation via Long Chain-of-Thought [89.48208612476068]
In this paper, we introduce DRT, an attempt to bring the success of long CoT to neural machine translation (MT). We first mine sentences containing similes or metaphors from existing literature books, and then develop a multi-agent framework to translate these sentences via long thought. Using Qwen2.5 and Llama-3.1 as the backbones, DRT models can learn the thought process during machine translation.
arXiv Detail & Related papers (2024-12-23T11:55:33Z) - Multi-perspective Alignment for Increasing Naturalness in Neural Machine Translation [11.875491080062233]
Neural machine translation (NMT) systems amplify lexical biases present in their training data, leading to artificially impoverished language in output translations. We introduce a novel method that rewards both naturalness and content preservation. We evaluate our method on English-to-Dutch literary translation, and find that our best model produces translations that are lexically richer and exhibit more properties of human-written language, without loss in translation accuracy.
arXiv Detail & Related papers (2024-12-11T15:42:22Z) - TasTe: Teaching Large Language Models to Translate through Self-Reflection [82.83958470745381]
Large language models (LLMs) have exhibited remarkable performance in various natural language processing tasks.
We propose the TasTe framework, which stands for translating through self-reflection.
The evaluation results in four language directions on the WMT22 benchmark reveal the effectiveness of our approach compared to existing methods.
arXiv Detail & Related papers (2024-06-12T17:21:21Z) - Fine-Tuning Large Language Models to Translate: Will a Touch of Noisy Data in Misaligned Languages Suffice? [33.376648335299116]
Large language models (LLMs) display strong translation capability after being fine-tuned on as few as 32 parallel sentences.
Fine-tuning LLMs with only English on the target side can lead to task misinterpretation, which hinders translation into non-English languages. Synthesized data in an under-represented language has a less pronounced effect.
arXiv Detail & Related papers (2024-04-22T12:21:12Z) - Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning [57.323716555996114]
Off-target translation remains an unsolved problem, especially for low-resource languages.
Recent works have either designed advanced prompting strategies to highlight the functionality of translation instructions or exploited the in-context learning ability of LLMs.
In this work, we design a two-stage fine-tuning algorithm to improve the instruction-following ability (especially the translation direction) of LLMs.
arXiv Detail & Related papers (2024-03-21T13:47:40Z) - Machine Translation Models are Zero-Shot Detectors of Translation Direction [46.41883195574249]
Detecting the translation direction of parallel text has applications for machine translation training and evaluation, but also has forensic applications such as resolving plagiarism or forgery allegations. In this work, we explore an unsupervised approach to translation direction detection based on the simple hypothesis that $p(\text{translation} \mid \text{original}) > p(\text{original} \mid \text{translation})$, motivated by the well-known simplification effect in translationese or machine-translationese.
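The hypothesis reduces to comparing two conditional probabilities from an MT model. A minimal sketch, assuming access to a scoring function log p(target | source): the `logprob` below is a toy stand-in that rewards shorter, more repetitive text, mimicking the simplification effect the paper relies on.

```python
# Sketch of the unsupervised direction test: score both conditional
# directions and pick the one where the "translation" side is more
# probable given the "original" side.

def logprob(target: str, source: str) -> float:
    # Toy stand-in for an MT model's log p(target | source): simpler
    # (fewer distinct words, shorter) targets score higher, a crude
    # proxy for translationese simplification.
    words = target.split()
    return -len(set(words)) - 0.1 * len(words)

def detect_direction(a: str, b: str) -> str:
    # If p(b | a) > p(a | b), b is more likely the translation of a.
    return "a->b" if logprob(b, a) > logprob(a, b) else "b->a"

direction = detect_direction(
    "the intricate tapestry of medieval commerce flourished",
    "medieval trade grew",
)
print(direction)  # the simpler side is judged to be the translation
```

In the paper the scores come from real NMT models in both directions; the toy `logprob` only illustrates how the inequality is applied.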
arXiv Detail & Related papers (2024-01-12T18:59:02Z) - Crossing the Threshold: Idiomatic Machine Translation through Retrieval Augmentation and Loss Weighting [66.02718577386426]
We provide a simple characterization of idiomatic translation and related issues.
We conduct a synthetic experiment revealing a tipping point at which transformer-based machine translation models correctly default to idiomatic translations.
To improve translation of natural idioms, we introduce two straightforward yet effective techniques.
arXiv Detail & Related papers (2023-10-10T23:47:25Z) - Towards Debiasing Translation Artifacts [15.991970288297443]
We propose a novel approach to reducing translationese by extending an established bias-removal technique.
We use the Iterative Null-space Projection (INLP) algorithm, and show by measuring classification accuracy before and after debiasing, that translationese is reduced at both sentence and word level.
To the best of our knowledge, this is the first study to debias translationese as represented in latent embedding space.
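One INLP step projects embeddings onto the null space of a learned classifier direction and repeats with a freshly trained classifier. A NumPy sketch of the projection is below; in the paper each direction comes from a trained translationese classifier, whereas here the bias direction is a fixed toy vector, which is an assumption.

```python
# Sketch of INLP-style debiasing: for each bias direction w, apply the
# projection P = I - w w^T / ||w||^2, which zeroes the component of
# every embedding along w.
import numpy as np

def nullspace_projection(w: np.ndarray) -> np.ndarray:
    w = w / np.linalg.norm(w)
    return np.eye(w.size) - np.outer(w, w)

def inlp(embeddings: np.ndarray, directions) -> np.ndarray:
    # Iteratively remove each bias direction from the embeddings;
    # real INLP retrains the classifier between iterations.
    for w in directions:
        embeddings = embeddings @ nullspace_projection(w).T
    return embeddings

X = np.array([[1.0, 2.0, 3.0],
              [0.5, 0.0, 1.0]])
w = np.array([1.0, 0.0, 0.0])      # toy "translationese" direction
X_debiased = inlp(X, [w])
print(X_debiased @ w)              # components along w are now ~0
```

Measuring classifier accuracy on `X_debiased` before and after each iteration is how the paper verifies that translationese information has been removed from the representation.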
arXiv Detail & Related papers (2022-05-16T21:46:51Z) - Translation Artifacts in Cross-lingual Transfer Learning [51.66536640084888]
We show that machine translation can introduce subtle artifacts that have a notable impact in existing cross-lingual models.
In natural language inference, translating the premise and the hypothesis independently can reduce the lexical overlap between them.
We also improve the state-of-the-art in XNLI for the translate-test and zero-shot approaches by 4.3 and 2.8 points, respectively.
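The lexical-overlap effect described above is directly measurable. A small sketch, using Jaccard overlap over word sets on toy sentence pairs (the example strings are illustrative, not from the paper's data):

```python
# Sketch of the NLI artifact measurement: translating premise and
# hypothesis independently tends to lower the word overlap that
# entailment pairs normally share.

def overlap(premise: str, hypothesis: str) -> float:
    p = set(premise.lower().split())
    h = set(hypothesis.lower().split())
    return len(p & h) / len(p | h)

original = overlap("a man is playing a guitar",
                   "a man plays a guitar")
translated = overlap("a man is playing a guitar",
                     "a person performs on a guitar")
print(original > translated)  # independent translation reduced overlap
```

A drop in this statistic between original and round-trip-translated pairs is one symptom of the artifacts the paper shows cross-lingual models pick up on.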
arXiv Detail & Related papers (2020-04-09T17:54:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.