Fugu-MT Paper Translation (Abstract): LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling

Paper abstract: LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling

arxiv url: http://arxiv.org/abs/2606.18023v1
Date: Tue, 16 Jun 2026 15:03:05 GMT
Status: Translation complete
Last updated in system: 2026-06-17 17:15:32.501433
Title: LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling
Title (reference translation): LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling
Authors: Jian Yang, Shawn Guo, Wei Zhang, Tianyu Zheng, Yaxin Du, Haau-Sing Li, Jiajun Wu, Yue Song, Yan Xing, Qingsong Cai, Zelong Huang, Chuan Hao, Ran Tao, Xianglong Liu, Wayne Xin Zhao, Mingjie Tang, Weifeng Lv, Ming Zhou, Bryan Dai
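Abstract summary: This work studies loop-count selection from a gain-cost perspective. It trains LoopCoder-v2 with different loop counts from scratch on 18T tokens, followed by matched instruction tuning and evaluation. Empirically, the two-loop variant delivers broad gains over the non-looped baseline across code generation, code reasoning, agentic software engineering, and tool-use benchmarks.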
Reference score (independently computed attention score): 72.71005779366162
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Looped Transformers scale latent computation by repeatedly applying shared blocks, but sequential looping increases latency and KV-cache memory with the loop count. Parallel loop Transformers (PLT) alleviate this cost through cross-loop position offsets (CLP) and shared-KV gated sliding-window attention, making loop count a practical design choice. We therefore study PLT loop-count selection through a gain-cost view: an extra loop may refine representations, but CLP also introduces a positional mismatch at each loop boundary. We instantiate this study by training LoopCoder-v2, a family of 7B PLT coders with different loop counts, from scratch on 18T tokens, followed by matched instruction tuning and evaluation. Empirically, the two-loop variant delivers broad gains over the non-looped baseline across code generation, code reasoning, agentic software engineering, and tool-use benchmarks, improving SWE-bench Verified from 43.0 to 64.4 points and Multi-SWE from 14.0 to 31.0 points. In contrast, variants with three or more loops regress, revealing a strongly non-monotonic loop-count effect. Our diagnostics show that loop 2 provides the main productive refinement, while later loops yield diminishing, oscillatory updates and reduced representational diversity. Because the CLP-induced mismatch remains roughly fixed as refinement gains shrink, the offset cost increasingly dominates. This gain-cost trade-off explains PLT's saturation at two loops and provides diagnostics for loop-count selection.
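Abstract (reference translation): Looped Transformers scale latent computation by repeatedly applying shared blocks, but sequential looping increases latency and KV-cache memory with the loop count. Parallel loop Transformers (PLT) reduce this cost through cross-loop position offsets (CLP) and shared-KV gated sliding-window attention, which makes loop count a practical design choice. We therefore examine PLT loop-count selection from a gain-cost perspective: an extra loop may refine representations, but CLP also introduces a positional mismatch at each loop boundary. For this study we train LoopCoder-v2, a family of 7B PLT coders with different loop counts, from scratch on 18T tokens, followed by matched instruction tuning and evaluation. The two-loop variant delivers broad gains over the non-looped baseline across code generation, code reasoning, agentic software engineering, and tool-use benchmarks, improving SWE-bench Verified from 43.0 to 64.4 points and Multi-SWE from 14.0 to 31.0 points. In contrast, variants with three or more loops regress, showing a strongly non-monotonic loop-count effect. Our diagnostics show that loop 2 provides the main productive refinement, while later loops yield diminishing, oscillatory updates and reduced representational diversity. Because the CLP-induced mismatch stays roughly fixed while refinement gains shrink, the offset cost increasingly dominates. This gain-cost trade-off explains why PLT saturates at two loops and provides diagnostics for loop-count selection.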
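
The abstract's opening claim (shared blocks applied repeatedly, with sequential looping growing the KV cache in proportion to loop count) can be illustrated with a toy sketch. This is not the paper's PLT or LoopCoder-v2 code: `shared_block`, `looped_forward`, and `sequential_kv_entries` are hypothetical names, the "block" is a stand-in residual update rather than attention, and the cache count assumes every loop pass stores its own keys and values for every token and layer.

```python
import math
from typing import Callable, List

Vector = List[float]


def shared_block(x: Vector, weight: float = 0.5) -> Vector:
    """Hypothetical stand-in for a Transformer block: a residual update
    that reuses the same parameter (`weight`) on every call."""
    return [xi + weight * math.tanh(xi) for xi in x]


def looped_forward(
    x: Vector, block: Callable[[Vector], Vector], num_loops: int
) -> List[Vector]:
    """Apply the same block `num_loops` times in sequence.

    Returns the input plus the hidden state after each pass, so the
    per-loop updates can be inspected afterwards.
    """
    if num_loops < 1:
        raise ValueError("num_loops must be at least 1")
    states = [list(x)]
    for _ in range(num_loops):
        states.append(block(states[-1]))
    return states


def sequential_kv_entries(num_tokens: int, num_layers: int, num_loops: int) -> int:
    """Count cached key/value slots under the assumption (ours, for
    illustration) that each loop pass keeps its own cache."""
    return num_tokens * num_layers * num_loops


if __name__ == "__main__":
    states = looped_forward([1.0, -2.0], shared_block, num_loops=2)
    print("hidden states per pass:", states)
    for loops in (1, 2, 3):
        print(loops, "loop(s):", sequential_kv_entries(1024, 4, loops), "KV slots")
```

Under these assumptions the cache count is linear in the number of passes, which is the cost the abstract says PLT is designed to avoid. The sketch says nothing about CLP or the gated sliding-window attention themselves.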

Related papers