Fugu-MT Paper Translation (Summary): Orchard: An Open-Source Agentic Modeling Framework

Paper summary: Orchard: An Open-Source Agentic Modeling Framework

arxiv url: http://arxiv.org/abs/2605.15040v2
Date: Thu, 21 May 2026 16:25:26 GMT
Status: translation complete
Last updated in system: 2026-05-22 20:14:18.416157
Title: Orchard: An Open-Source Agentic Modeling Framework
Authors: Baolin Peng, Wenlin Yao, Qianhui Wu, Hao Cheng, Xiao Yu, Rui Yang, Tao Ge, Alessandro Sordoni, Xingdi Yuan, Yelong Shen, Pengcheng He, Tong Zhang, Zhou Yu, Jianfeng Gao
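Abstract summary: We present Orchard, an open-source framework for scalable agentic modeling. Orchard Env is a lightweight environment service providing reusable primitives for sandbox lifecycle management. On top of Orchard Env, we build three agentic modeling recipes.
Reference score (independently computed attention metric): 124.68499958175111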
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Agentic modeling aims to transform LLMs into autonomous agents capable of solving complex tasks through planning, reasoning, tool use, and multi-turn interaction with environments. Despite major investment, open research remains constrained by infrastructure and training gaps. Many high-performing systems rely on proprietary codebases, models, or services, while most open-source frameworks focus on orchestration and evaluation rather than scalable agent training. We present Orchard, an open-source framework for scalable agentic modeling. At its core is Orchard Env, a lightweight environment service providing reusable primitives for sandbox lifecycle management across task domains, agent harnesses, and pipeline stages. On top of Orchard Env, we build three agentic modeling recipes. Orchard-SWE targets coding agents. We distill 107K trajectories from MiniMax-M2.5 and Qwen3.5-397B, introduce credit-assignment SFT to learn from productive segments of unresolved trajectories, and apply Balanced Adaptive Rollout for RL. Starting from Qwen3-30B-A3B-Thinking, Orchard-SWE achieves 64.3% on SWE-bench Verified after SFT and 67.5% after SFT+RL, setting a new state of the art among open-source models of comparable size. Orchard-GUI trains a 4B vision-language computer-use agent using only 0.4K distilled trajectories and 2.2K open-ended tasks. It achieves 74.1%, 67.0%, and 64.0% success rates on WebVoyager, Online-Mind2Web, and DeepShop, respectively, making it the strongest open-source model while remaining competitive with proprietary systems. Orchard-Claw targets personal assistant agents. Trained with only 0.2K synthetic tasks, it achieves 59.6% pass@3 on Claw-Eval and 73.9% when paired with a stronger ZeroClaw harness. Collectively, these results show that a lightweight, open, harness-agnostic environment layer enables reusable agentic data, training recipes, and evaluations across domains.
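The abstract reports pass@3 on Claw-Eval but does not say how the metric is computed. For reference, here is a minimal sketch of the standard unbiased pass@k estimator used in code-generation evaluation. Whether Claw-Eval uses this estimator is an assumption, not something stated in the source, and the function names `pass_at_k` and `benchmark_pass_at_k` are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task from n sampled attempts, c of which succeeded.

    Assumption: this is the standard unbiased estimator, 1 - C(n-c, k) / C(n, k).
    The source does not state which estimator Claw-Eval uses.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # If fewer than k attempts failed, every size-k subset contains a success.
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average per-task pass@k over (n, c) pairs, one pair per task."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With exactly three attempts per task (n = k = 3), the estimator reduces to "solved if any of the three attempts succeeded", so the two readings of pass@3 coincide in that setting.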

Related papers