Fugu-MT Paper Translation (Abstract): Composer 2 Technical Report

Paper summary: Composer 2 Technical Report

arxiv url: http://arxiv.org/abs/2603.24477v1
Date: Wed, 25 Mar 2026 16:18:37 GMT
Status: Translation complete
System update date: 2026-03-26 21:06:11.383429
Title: Composer 2 Technical Report
Title (reference translation): Composer 2 Technical Report
Authors: Cursor Research: Aaron Chan, Ahmed Shalaby, Alexander Wettig, Aman Sanger, Andrew Zhai, Anurag Ajay, Ashvin Nair, Charlie Snell, Chen Lu, Chen Shen, Emily Jia, Federico Cassano, Hanpeng Liu, Haoyu Chen, Henry Wildermuth, Jacob Jackson, Janet Li, Jediah Katz, Jiajun Yao, Joey Hejna, Josh Warner, Julius Vering, Kevin Frans, Lee Danilek, Less Wright, Lujing Cen, Luke Melas-Kyriazi, Michael Truell, Michiel de Jong, Naman Jain, Nate Schmidt, Nathan Wang, Niklas Muennighoff, Oleg Rybkin, Paul Loh, Phillip Kravtsov, Rishabh Yadav, Sahil Shah, Sam Kottler, Alexander M Rush, Shengtong Zhang, Shomil Jain, Sriram Sankar, Stefan Heule, Stuart H. Sul, Sualeh Asif, Victor Rong, Wanqi Zhu, William Lin, Yuchen Wu, Yuri Volkov, Yury Zemlyanskiy, Zack Holbrook, Zhiyuan Zhang
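Abstract summary: Composer 2 is a specialized model designed for agentic software engineering. The model is trained in two phases: first, continued pretraining to improve the model's knowledge and latent coding ability, followed by large-scale reinforcement learning to improve end-to-end coding performance. The authors develop infrastructure to support training in the same Cursor harness that is used by the deployed model.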
Reference score (independently computed attention score): 93.84516486051359
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Composer 2 is a specialized model designed for agentic software engineering. The model demonstrates strong long-term planning and coding intelligence while maintaining the ability to efficiently solve problems for interactive use. The model is trained in two phases: first, continued pretraining to improve the model's knowledge and latent coding ability, followed by large-scale reinforcement learning to improve end-to-end coding performance through stronger reasoning, accurate multi-step execution, and coherence on long-horizon realistic coding problems. We develop infrastructure to support training in the same Cursor harness that is used by the deployed model, with equivalent tools and structure, and use environments that match real problems closely. To measure the ability of the model on increasingly difficult tasks, we introduce a benchmark derived from real software engineering problems in large codebases including our own. Composer 2 is a frontier-level coding model and demonstrates a process for training strong domain-specialized models. On our CursorBench evaluations the model achieves a major improvement in accuracy compared to previous Composer models (61.3). On public benchmarks the model scores 61.7 on Terminal-Bench and 73.7 on SWE-bench Multilingual in our harness, comparable to state-of-the-art systems.
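Abstract (reference translation): Composer 2 is a specialized model designed for agentic software engineering. The model demonstrates strong long-term planning and coding intelligence while retaining the ability to solve problems efficiently for interactive use. It is trained in two phases: first, continued pretraining to improve the model's knowledge and latent coding ability, then large-scale reinforcement learning to improve end-to-end coding performance through stronger reasoning, accurate multi-step execution, and coherence on long-horizon, realistic coding problems. We develop infrastructure that supports training in the same Cursor harness used by the deployed model, with equivalent tools and structure, and we use environments that closely match real problems. To measure the model's ability on increasingly difficult tasks, we introduce a benchmark derived from real software engineering problems in large codebases, including our own. Composer 2 is a frontier-level coding model and demonstrates a process for training strong domain-specialized models. On our CursorBench evaluations, the model achieves a major improvement in accuracy over previous Composer models (61.3). On public benchmarks, it scores 61.7 on Terminal-Bench and 73.7 on SWE-bench Multilingual in our harness, comparable to state-of-the-art systems.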

Related papers