Self-Improving Multilingual Long Reasoning via Translation-Reasoning Integrated Training
- URL: http://arxiv.org/abs/2602.05940v1
- Date: Thu, 05 Feb 2026 17:55:09 GMT
- Title: Self-Improving Multilingual Long Reasoning via Translation-Reasoning Integrated Training
- Authors: Junxiao Liu, Zhijun Wang, Yixiao Li, Zhejian Lai, Liqian Huang, Xin Huang, Xue Han, Junlan Feng, Shujian Huang,
- Abstract summary: Long reasoning models often struggle in multilingual settings.<n>We propose TRIT (Translation-Reasoning Integrated Training), a self-improving framework that integrates the training of translation into multilingual reasoning.
- Score: 50.177839592528294
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Long reasoning models often struggle in multilingual settings: they tend to reason in English for non-English questions; when constrained to reasoning in the question language, accuracies drop substantially. The struggle is caused by the limited abilities for both multilingual question understanding and multilingual reasoning. To address both problems, we propose TRIT (Translation-Reasoning Integrated Training), a self-improving framework that integrates the training of translation into multilingual reasoning. Without external feedback or additional multilingual data, our method jointly enhances multilingual question understanding and response generation. On MMATH, our method outperforms multiple baselines by an average of 7 percentage points, improving both answer correctness and language consistency. Further analysis reveals that integrating translation training improves cross-lingual question alignment by over 10 percentage points and enhances translation quality for both mathematical questions and general-domain text, with gains up to 8.4 COMET points on FLORES-200.
Related papers
- When Meanings Meet: Investigating the Emergence and Quality of Shared Concept Spaces during Multilingual Language Model Training [57.230355403478995]
We investigate the development of language-agnostic concept spaces during pretraining of EuroLLM.<n>We find that shared concept spaces emerge early and continue to refine, but that alignment with them is language-dependent.<n>In contrast to prior work, our fine-grained manual analysis reveals that some apparent gains in translation quality reflect shifts in behavior.
arXiv Detail & Related papers (2026-01-30T11:23:01Z) - Align to the Pivot: Dual Alignment with Self-Feedback for Multilingual Math Reasoning [71.4175109189942]
We present Pivot-Aligned Self-Feedback Multilingual Reasoning (PASMR)<n>This approach designates the model's primary language as the pivot language.<n>It establishes a cross-lingual self-feedback mechanism without relying on external correct answers or reward models.
arXiv Detail & Related papers (2026-01-25T03:20:00Z) - Do Language Models Reason Across Languages? [19.660512783888016]
We find that language models are more sensitive to language variation in answer-span documents than in those providing bridging information.<n>We propose a simple three-stage SUBQ prompting method to guide the multi-step reasoning with sub-questions.
arXiv Detail & Related papers (2026-01-10T17:59:34Z) - Think Natively: Unlocking Multilingual Reasoning with Consistency-Enhanced Reinforcement Learning [85.7304930030649]
We propose M-Thinker, which is trained by a Language Consistency reward and a Cross-lingual Thinking Alignment reward.<n>M-Thinker achieves nearly 100% language consistency and superior performance on two multilingual benchmarks.
arXiv Detail & Related papers (2025-10-08T17:55:02Z) - Multilinguality Does not Make Sense: Investigating Factors Behind Zero-Shot Transfer in Sense-Aware Tasks [3.274367403737527]
Cross-lingual transfer is central to modern NLP, enabling models to perform tasks in languages different from those they were trained on.<n>A common assumption is that training on more languages improves zero-shot transfer.<n>We test this on sense-aware tasks-polysemy and lexical semantic change-and find that multilinguality is not necessary for effective transfer.
arXiv Detail & Related papers (2025-05-30T17:36:20Z) - Demystifying Multilingual Chain-of-Thought in Process Reward Modeling [86.98098988779809]
We tackle the challenge of extending process reward models (PRMs) to multilingual settings.<n>We train multilingual PRMs on a dataset spanning seven languages, which is translated from English.<n>Our results highlight the sensitivity of multilingual PRMs to both the number of training languages and the volume of English data.
arXiv Detail & Related papers (2025-02-18T09:11:44Z) - Delving Deeper into Cross-lingual Visual Question Answering [115.16614806717341]
We show that simple modifications to the standard training setup can substantially reduce the transfer gap to monolingual English performance.
We analyze cross-lingual VQA across different question types of varying complexity for different multilingual multimodal Transformers.
arXiv Detail & Related papers (2022-02-15T18:22:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.