Fugu-MT Paper Translation (Abstract): Small Model as Master Orchestrator: Learning Unified Agent-Tool Orchestration with Parallel Subtask Decomposition

Paper summary: Small Model as Master Orchestrator: Learning Unified Agent-Tool Orchestration with Parallel Subtask Decomposition

arxiv url: http://arxiv.org/abs/2604.17009v1
Date: Sat, 18 Apr 2026 14:41:27 GMT
Status: Translation complete
Last updated in system: 2026-04-21 21:52:52.28598
Title: Small Model as Master Orchestrator: Learning Unified Agent-Tool Orchestration with Parallel Subtask Decomposition
Title (reference translation): A Small Model as Master Orchestrator: Learning Unified Agent-Tool Orchestration through Parallel Subtask Decomposition
Authors: Wenzhen Yuan, Wutao Xiong, Fanchen Yu, Shengji Tang, Ting Liu, Tao Chen, Peng Ye, Yuzhuo Fu, Wanli Ouyang, Lei Bai
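Abstract summary: Agent-as-Tool is a parallel orchestration paradigm that abstracts both agents and tools into a standardized, learnable action space. ParaManager decouples planning decisions from subtask solving, enabling state-aware parallel subtask decomposition, delegation, and asynchronous execution. Experiments show that ParaManager achieves strong performance across multiple benchmarks and generalizes robustly to unseen model pools.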
Reference score (independently computed attention score): 61.291733522717415
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Multi-agent systems (MAS) demonstrate clear advantages in tackling complex problems by coordinating diverse agents and external tools. However, most existing orchestration methods rely on static workflows or serial agent scheduling, and are further constrained by heterogeneous interface protocols between tools and agents. This leads to high system complexity and poor extensibility. To mitigate these issues, we propose Agent-as-Tool, a unified parallel orchestration paradigm that abstracts both agents and tools into a standardized, learnable action space with protocol normalization and explicit state feedback. Building on this paradigm, we train a lightweight orchestrator, ParaManager, which decouples planning decisions from subtask solving, enabling state-aware parallel subtask decomposition, delegation, and asynchronous execution. For training, we adopt a two-stage ParaManager training pipeline. It improves robustness by incorporating supervised fine-tuning (SFT) trajectories equipped with recovery mechanisms, and further applies reinforcement learning (RL) to achieve an optimal balance among task success, protocol compliance, diversity, and reasoning efficiency. Experiments show that ParaManager achieves strong performance across multiple benchmarks and exhibits robust generalization under unseen model pools.
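Abstract (reference translation): Multi-agent systems (MAS) show clear advantages in tackling complex problems by coordinating diverse agents and external tools. However, most existing orchestration methods depend on static workflows or serial agent scheduling, and they are further constrained by heterogeneous interface protocols between tools and agents. This results in high system complexity and poor extensibility. To mitigate these issues, we propose Agent-as-Tool, a unified parallel orchestration paradigm that abstracts both agents and tools into a standardized, learnable action space with protocol normalization and explicit state feedback. Building on this paradigm, we train a lightweight orchestrator, ParaManager, which decouples planning decisions from subtask solving and enables state-aware parallel subtask decomposition, delegation, and asynchronous execution. For training, we adopt a two-stage pipeline: supervised fine-tuning (SFT) on trajectories equipped with recovery mechanisms to improve robustness, followed by reinforcement learning (RL) to balance task success, protocol compliance, diversity, and reasoning efficiency. Experiments show that ParaManager achieves strong performance across multiple benchmarks and generalizes robustly to unseen model pools.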
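Illustrative sketch (not from the paper): the abstract describes wrapping agents and tools behind one standardized action interface with explicit state feedback, then executing delegated subtasks in parallel and asynchronously. The Python below is a minimal toy version of that idea under stated assumptions. The names `ActionSpace`, `ActionResult`, `calculator_tool`, and `summarizer_agent` are hypothetical, the "agent" and "tool" are trivial stand-ins, and the plan is hard-coded rather than produced by a trained orchestrator such as ParaManager.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, List, Tuple

# Hypothetical normalized result: every action, agent or tool,
# reports back in the same shape (explicit state feedback).
@dataclass
class ActionResult:
    name: str
    status: str  # "ok" or "error"
    output: str

Handler = Callable[[str], Awaitable[str]]

class ActionSpace:
    """Toy unified registry: agents and tools share one call signature."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, name: str, handler: Handler) -> None:
        self._handlers[name] = handler

    async def call(self, name: str, subtask: str) -> ActionResult:
        handler = self._handlers.get(name)
        if handler is None:
            # Failures become status feedback instead of crashing the plan.
            return ActionResult(name, "error", f"unknown action: {name}")
        try:
            return ActionResult(name, "ok", await handler(subtask))
        except Exception as exc:
            return ActionResult(name, "error", str(exc))

    async def run_parallel(self, plan: List[Tuple[str, str]]) -> List[ActionResult]:
        # Dispatch all subtasks concurrently; results keep plan order.
        return await asyncio.gather(*(self.call(n, s) for n, s in plan))

async def calculator_tool(expr: str) -> str:
    # Stand-in "tool": adds two integers written as "a+b".
    a, b = expr.split("+")
    return str(int(a) + int(b))

async def summarizer_agent(text: str) -> str:
    # Stand-in "agent": a real one would call a model; this truncates.
    await asyncio.sleep(0)
    return text[:10]

def demo() -> List[ActionResult]:
    space = ActionSpace()
    space.register("calculator", calculator_tool)
    space.register("summarizer", summarizer_agent)
    # Hard-coded plan; in the paper a learned orchestrator decides this.
    plan = [
        ("calculator", "2+3"),
        ("summarizer", "parallel orchestration"),
        ("search", "anything"),  # unregistered on purpose
    ]
    return asyncio.run(space.run_parallel(plan))

if __name__ == "__main__":
    for result in demo():
        print(result)
```

In this sketch the unregistered `search` action comes back as an `error` result rather than raising, which is one simple way an orchestrator could observe failures and plan a recovery step. How ParaManager actually represents state feedback and recovery is not specified in the abstract.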

Related papers