Fugu-MT Paper Translation (Abstract): Learning Reasoning World Models for Parallel Code

Paper overview: Learning Reasoning World Models for Parallel Code

arxiv url: http://arxiv.org/abs/2604.20926v1
Date: Wed, 22 Apr 2026 07:29:23 GMT
Status: Translation complete
Last updated in system: 2026-04-24 14:40:06.098511
Title: Learning Reasoning World Models for Parallel Code
Title (reference translation): Learning Reasoning World Models for Parallel Code
Authors: Gautam Singh, Arjun Guha, Bhavya Kailkhura, Harshitha Menon
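Abstract summary: A common remedy is to use coding agents that interact with external tools, but tool calls can be costly and sometimes impractical. We propose Parallel-Code World Models (PCWMs), which aim to predict tool outcomes directly from parallel source code. The results suggest that reasoning models could serve as substitutes for external tool calls in parallel-coding agents.
Reference score (independently computed attention score): 26.139975890051925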
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models have shown remarkable ability in serial code generation, but they still struggle with parallel code for which training data is comparatively scarce. A common remedy is to use coding agents that interact with external tools, but tool calls can be costly and sometimes impractical, e.g., for partially written code. We propose Parallel-Code World Models (PCWMs), reasoning LLMs that aim to predict tool outcomes directly from parallel source code. To train PCWMs, we design a novel exploration and data generation pipeline that samples diverse parallel-coding problems and candidate implementations across multiple domains, then executes them via tools to record data races and performance profiles. From these, we synthesize hindsight reasoning traces that causally connect source code to observed tool outcomes. Fine-tuning on the resulting data yields noticeable gains, with a 7B-parameter world model improving from 64.3% to 72.8% accuracy in race-outcome prediction, while an 8B-parameter model improves in a performance profiling task from 49.3% to 58.6% accuracy. Furthermore, when open-weight models were tasked with fixing data races, world-model feedback improved their race-fixing rates relative to self-feedback by 2.7%-9.1% using our 7B-parameter world model and by 6.1%-11.1% using our 14B-parameter world model. Our results suggest that reasoning models have the potential to serve as practical substitutes for external tool calls in parallel-coding agents.
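Abstract (reference translation): Large language models have shown remarkable ability in serial code generation, but they struggle with parallel code, for which training data is comparatively scarce. A common remedy is to use coding agents that interact with external tools, but tool calls can be costly and sometimes impractical, for example for partially written code. We propose Parallel-Code World Models (PCWMs), which aim to predict tool outcomes directly from parallel source code. To train PCWMs, we design a new exploration and data generation pipeline that samples parallel-coding problems and candidate implementations across multiple domains and executes them via tools to record data races and performance profiles. From these, we synthesize hindsight reasoning traces that causally connect source code to observed tool outcomes. A 7B-parameter world model improves from 64.3% to 72.8% accuracy in race-outcome prediction, and an 8B-parameter model improves from 49.3% to 58.6% accuracy on a performance profiling task. Furthermore, when open-weight models are tasked with fixing data races, world-model feedback improves their race-fixing rates relative to self-feedback by 2.7%-9.1% with our 7B-parameter world model and by 6.1%-11.1% with our 14B-parameter world model. These results suggest that reasoning models could serve as substitutes for external tool calls in parallel-coding agents.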
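To make the "race outcome" that a PCWM is asked to predict more concrete, the following is a minimal illustrative sketch and is not taken from the paper. It replays two fixed interleavings of an unsynchronized counter increment to show the lost-update pattern that a race detector would typically flag. The function `replay`, the worker names, and the schedules are all hypothetical, and the paper's actual tools, languages, and benchmarks are not specified here.

```python
# Illustrative only: a deterministic replay of a lost-update data race.
# All names (replay, "A", "B", the schedules) are hypothetical and are
# not part of the paper's pipeline or tooling.

def replay(schedule):
    """Replay a fixed interleaving of workers incrementing a shared counter.

    Each worker's increment is non-atomic: a 'read' step copies the shared
    value into a private register, and a 'write' step stores register + 1.
    `schedule` is a list of (worker_id, step) pairs in execution order.
    """
    shared = 0
    registers = {}
    for worker, step in schedule:
        if step == "read":
            registers[worker] = shared
        elif step == "write":
            shared = registers[worker] + 1
        else:
            raise ValueError(f"unknown step: {step!r}")
    return shared


# Worker A finishes before worker B starts: both increments are kept.
serial_schedule = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]

# Both workers read before either writes: one increment is lost.
racy_schedule = [("A", "read"), ("B", "read"), ("A", "write"), ("B", "write")]

serial_result = replay(serial_schedule)
racy_result = replay(racy_schedule)

print("serial interleaving:", serial_result)
print("racy interleaving:  ", racy_result)
```

In real parallel code the interleaving is chosen by the scheduler rather than fixed in advance, which is why such bugs are usually found by running a dynamic analysis tool. The abstract's proposal is that a reasoning model could predict that kind of tool outcome from the source code alone.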

Related papers