Fugu-MT Paper Translation (Abstract): CodePivot: Bootstrapping Multilingual Transpilation in LLMs via Reinforcement Learning without Parallel Corpora

Paper overview: CodePivot: Bootstrapping Multilingual Transpilation in LLMs via Reinforcement Learning without Parallel Corpora

arxiv url: http://arxiv.org/abs/2604.18027v1
Date: Mon, 20 Apr 2026 09:52:50 GMT
Status: Translation complete
System update date: 2026-04-21 21:52:52.79661
Title: CodePivot: Bootstrapping Multilingual Transpilation in LLMs via Reinforcement Learning without Parallel Corpora
Title (reference translation): CodePivot: Bootstrapping Multilingual Transpilation in LLMs via Reinforcement Learning without Parallel Corpora
Authors: Shangyu Li, Juyong Jiang, Meibo Ren, Sizhe Zhong, Huiri Tan, Yunhao Gou, Xu Han, Chun Yong Chong, Yun Peng, Jiasi Shen
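Abstract summary: Transpilation, or code translation, aims to convert source code from one language to another. Recent large language model (LLM)-based approaches have shown great potential for code translation. We propose CodePivot, a training framework that bootstraps a model's multilingual transpilation ability without requiring parallel corpora.
Reference score (independently computed attention level): 12.250493747181459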
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Transpilation, or code translation, aims to convert source code from one programming language (PL) to another. It is beneficial for many downstream applications, from modernizing large legacy codebases to augmenting data for low-resource PLs. Recent large language model (LLM)-based approaches have demonstrated immense potential for code translation. Among these approaches, training-based methods are particularly important because LLMs currently do not effectively adapt to domain-specific settings that suffer from a lack of knowledge without targeted training. This limitation is evident in transpilation tasks involving low-resource PLs. However, existing training-based approaches rely on a pairwise transpilation paradigm, making it impractical to support a diverse range of PLs. This limitation is particularly prominent for low-resource PLs due to a scarcity of training data. Furthermore, these methods suffer from suboptimal reinforcement learning (RL) reward formulations. To address these limitations, we propose CodePivot, a training framework that leverages Python as an intermediate representation (IR), augmented by a novel RL reward mechanism, Aggressive-Partial-Functional reward, to bootstrap the model's multilingual transpilation ability without requiring parallel corpora. Experiments involving 10 PLs show that the resulting 7B model, trained on Python-to-Others tasks, consistently improves performance across both general and low-resource PL-related transpilation tasks. It outperforms substantially larger mainstream models with hundreds of billions more parameters, such as Deepseek-R1 and Qwen3-235B-A22B-Instruct-2507, on Python-to-Others tasks and Others-to-All tasks, respectively. In addition, it outperforms its counterpart trained directly on Any-to-Any tasks on general transpilation tasks. The code and data are available at https://github.com/lishangyu-hkust/CodePivot.
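Abstract (reference translation): Transpilation (code translation) aims to convert source code from one programming language (PL) to another. It benefits many downstream applications, from modernizing large legacy codebases to augmenting data for low-resource PLs. Recent large language model (LLM)-based approaches have shown great potential for code translation. Among these approaches, training-based methods are particularly important because, without targeted training, LLMs currently do not adapt effectively to domain-specific settings that suffer from a lack of knowledge. This limitation is evident in transpilation tasks involving low-resource PLs. However, existing training-based approaches rely on a pairwise transpilation paradigm, which makes supporting a diverse range of PLs impractical. The problem is especially pronounced for low-resource PLs because training data is scarce. In addition, these methods suffer from suboptimal reinforcement learning (RL) reward formulations. To address these limitations, we propose CodePivot, a training framework that uses Python as an intermediate representation (IR), augmented by a novel RL reward mechanism, the Aggressive-Partial-Functional reward, to bootstrap the model's multilingual transpilation ability without requiring parallel corpora. Experiments involving 10 PLs show that the resulting 7B model, trained on Python-to-Others tasks, consistently improves performance on both general and low-resource PL-related transpilation tasks. It outperforms substantially larger mainstream models with hundreds of billions more parameters, such as Deepseek-R1 and Qwen3-235B-A22B-Instruct-2507, on Python-to-Others tasks and Others-to-All tasks, respectively. It also outperforms its counterpart trained directly on Any-to-Any tasks on general transpilation tasks. The code and data are available at https://github.com/lishangyu-hkust/CodePivot.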
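
The abstract's point that a pairwise paradigm scales poorly can be made concrete with a small count. The sketch below is illustrative arithmetic only, not code from the CodePivot repository: the function names are hypothetical, and it assumes that the 10 PLs in the experiments include Python itself.

```python
def pairwise_directions(n_languages: int) -> int:
    """Ordered (source, target) pairs when every language pair is trained directly."""
    return n_languages * (n_languages - 1)


def pivot_training_directions(n_languages: int) -> int:
    """Directions when training only from a single pivot language to each other language."""
    return n_languages - 1


# Assumption: 10 PLs in total, with Python counted as one of them.
n = 10
print("pairwise (Any-to-Any):", pairwise_directions(n))            # 90
print("pivot (Python-to-Others):", pivot_training_directions(n))   # 9
```

Under these assumptions, direct Any-to-Any training covers 90 ordered language pairs, whereas Python-to-Others training covers 9, which is why a pivot language reduces the amount of per-pair training data required.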

Related papers