SkillOrchestra: Learning to Route Agents via Skill Transfer
- URL: http://arxiv.org/abs/2602.19672v1
- Date: Mon, 23 Feb 2026 10:17:25 GMT
- Title: SkillOrchestra: Learning to Route Agents via Skill Transfer
- Authors: Jiayu Wang, Yifei Ming, Zixuan Ke, Shafiq Joty, Aws Albarghouthi, Frederic Sala,
- Abstract summary: We introduce SkillOrchestra, a framework for skill-aware orchestration. SkillOrchestra learns fine-grained skills from execution experience and models agent-specific competence and cost under those skills. At deployment, the orchestrator infers the skill demands of the current interaction and selects agents that best satisfy them under an explicit performance-cost trade-off.
- Score: 65.50924963973286
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Compound AI systems promise capabilities beyond those of individual models, yet their success depends critically on effective orchestration. Existing routing approaches face two limitations: (1) input-level routers make coarse query-level decisions that ignore evolving task requirements; (2) RL-trained orchestrators are expensive to adapt and often suffer from routing collapse, repeatedly invoking one strong but costly option in multi-turn scenarios. We introduce SkillOrchestra, a framework for skill-aware orchestration. Instead of directly learning a routing policy end-to-end, SkillOrchestra learns fine-grained skills from execution experience and models agent-specific competence and cost under those skills. At deployment, the orchestrator infers the skill demands of the current interaction and selects agents that best satisfy them under an explicit performance-cost trade-off. Extensive experiments across ten benchmarks demonstrate that SkillOrchestra outperforms SoTA RL-based orchestrators by up to 22.5% with 700x and 300x learning cost reduction compared to Router-R1 and ToolOrchestra, respectively. These results show that explicit skill modeling enables scalable, interpretable, and sample-efficient orchestration, offering a principled alternative to data-intensive RL-based approaches. The code is available at: https://github.com/jiayuww/SkillOrchestra.
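The deployment-time selection step described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper or its repository: the `AgentProfile` class, the `route` function, the linear utility (demand-weighted competence minus `cost_weight` times cost), and all numbers.

```python
from dataclasses import dataclass


@dataclass
class AgentProfile:
    """Hypothetical per-agent profile: competence per skill and a relative call cost."""
    name: str
    competence: dict[str, float]  # skill -> estimated success rate in [0, 1]
    cost: float                   # relative cost of one call


def route(skill_demands: dict[str, float],
          agents: list[AgentProfile],
          cost_weight: float) -> AgentProfile:
    """Pick the agent with the highest demand-weighted competence minus weighted cost.

    The linear trade-off used here is an illustrative assumption, not the
    scoring rule from the paper.
    """
    total = sum(skill_demands.values())
    if total <= 0:
        raise ValueError("skill_demands must contain positive weights")

    def utility(agent: AgentProfile) -> float:
        # Unknown skills count as zero competence.
        expected = sum(weight * agent.competence.get(skill, 0.0)
                       for skill, weight in skill_demands.items()) / total
        return expected - cost_weight * agent.cost

    return max(agents, key=utility)


# Made-up profiles: a strong, expensive agent and a cheap, narrower one.
AGENTS = [
    AgentProfile("large", {"math": 0.9, "search": 0.8}, cost=1.0),
    AgentProfile("small", {"math": 0.5, "search": 0.75}, cost=0.1),
]

if __name__ == "__main__":
    print(route({"search": 1.0}, AGENTS, cost_weight=0.2).name)  # small
    print(route({"math": 1.0}, AGENTS, cost_weight=0.2).name)    # large
    print(route({"math": 1.0}, AGENTS, cost_weight=0.5).name)    # small
```

In this toy setup the choice changes from turn to turn as the inferred skill demands change, and a larger `cost_weight` pushes the selection toward the cheaper agent. That is one simple way to avoid always calling the strongest, most expensive option.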
Related papers
- Organizing, Orchestrating, and Benchmarking Agent Skills at Ecosystem Scale [28.43462779191672]
AgentSkillOS is a principled framework for skill selection, orchestration, and ecosystem-level management. AgentSkillOS comprises two stages: (i) Manage Skills, which organizes skills into a capability tree; (ii) Solve Tasks, which retrieves, orchestrates, and executes multiple skills through DAG-based pipelines.
arXiv Detail & Related papers (2026-03-02T18:46:47Z)
- K^2-Agent: Co-Evolving Know-What and Know-How for Hierarchical Mobile Device Control [73.50217471850658]
K2-Agent is a hierarchical framework that models human-like cognition by knowing and co-evolving declarative (what) and procedural (how) knowledge for planning and execution. On the challenging AndroidWorld benchmark, K2-Agent achieves a 76.1% success rate using only raw and open-source backbones.
arXiv Detail & Related papers (2026-02-28T14:33:14Z)
- MAS-Orchestra: Understanding and Improving Multi-Agent Reasoning Through Holistic Orchestration and Controlled Benchmarks [86.05918381895555]
We propose MAS-Orchestra as a training-time framework that formulates MAS orchestration as a function-calling reinforcement learning problem. In MAS-Orchestra, complex, goal-oriented subagents are abstracted as callable functions, enabling global reasoning over system structure. Our analysis reveals that MAS gains depend critically on task structure, verification protocols, and the capabilities of both orchestrator and subagents.
arXiv Detail & Related papers (2026-01-21T04:57:02Z)
- RollArt: Scaling Agentic RL Training via Disaggregated Infrastructure [49.88201789074532]
Agentic Reinforcement Learning (RL) enables Large Language Models (LLMs) to perform autonomous decision-making and long-term planning. We present RollArc, a distributed system designed to maximize throughput for multi-task agentic RL on disaggregated infrastructure.
arXiv Detail & Related papers (2025-12-27T11:14:23Z)
- ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration [110.24410841004777]
We show that small orchestrators managing other models and a variety of tools can both push the upper bound of intelligence. We introduce ToolOrchestra, a method for training small orchestrators that coordinate intelligent tools. Using ToolOrchestra, we produce Orchestrator, an 8B model that achieves higher accuracy at lower cost than previous tool-use agents.
arXiv Detail & Related papers (2025-11-26T18:59:46Z)
- xRouter: Training Cost-Aware LLMs Orchestration System via Reinforcement Learning [104.63494870852894]
We present xRouter, a tool-calling-based routing system in which a learned router can either answer directly or invoke one or more external models. Our implementation encompasses the full reinforcement learning framework, including reward and cost accounting. Across diverse benchmarks, xRouter achieves strong cost-performance trade-offs.
arXiv Detail & Related papers (2025-10-09T16:52:01Z)
- When Should We Orchestrate Multiple Agents? [74.27052374196269]
Strategies for orchestrating the interactions between multiple agents, both human and artificial, can wildly overestimate performance and underestimate the cost of orchestration. We design a framework to orchestrate agents under realistic conditions, such as inference costs or availability constraints. We show theoretically that orchestration is only effective if there are performance or cost differentials between agents.
arXiv Detail & Related papers (2025-03-17T14:26:07Z)
- Expert-Token Resonance MoE: Bidirectional Routing with Efficiency Affinity-Driven Active Selection [19.365009652356793]
Expert-Token Resonance (ETR) is a theoretically-grounded bidirectional routing mechanism that reimagines expert-token interactions. ETR achieves 5.4%-46.6% improvements in end-to-end training efficiency compared to baseline MoE implementations.
arXiv Detail & Related papers (2024-05-24T02:50:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.