MasRouter: Learning to Route LLMs for Multi-Agent Systems
- URL: http://arxiv.org/abs/2502.11133v1
- Date: Sun, 16 Feb 2025 14:00:59 GMT
- Title: MasRouter: Learning to Route LLMs for Multi-Agent Systems
- Authors: Yanwei Yue, Guibin Zhang, Boyang Liu, Guancheng Wan, Kun Wang, Dawei Cheng, Yiyan Qi
- Abstract summary: Multi-agent systems (MAS) powered by Large Language Models (LLMs) have been demonstrated to push the boundaries of LLM capabilities.
Current LLM routing methods effectively reduce overhead in single-agent scenarios by customizing LLM selection for each query, but they overlook collaboration modes and agent roles in MAS.
We first introduce the problem of Multi-Agent System Routing (MASR), which integrates all components of MAS into a unified routing framework.
MasRouter is (1) high-performing, achieving a $1.8\%\sim8.2\%$ improvement over the state-of-the-art method on MBPP; (2) economical, reducing overhead by up to $52.07\%$ compared to SOTA methods on HumanEval.
- Score: 14.029698552632107
- Abstract: Multi-agent systems (MAS) powered by Large Language Models (LLMs) have been demonstrated to push the boundaries of LLM capabilities, yet they often incur significant costs and face challenges in dynamic LLM selection. Current LLM routing methods effectively reduce overhead in single-agent scenarios by customizing LLM selection for each query, but they overlook the critical decisions regarding collaboration modes and agent roles in MAS. In response to this challenge, we first introduce the problem of Multi-Agent System Routing (MASR), which integrates all components of MAS into a unified routing framework. Toward this goal, we propose MasRouter, the first high-performing, cost-effective, and inductive MASR solution. MasRouter employs collaboration mode determination, role allocation, and LLM routing through a cascaded controller network, progressively constructing a MAS that balances effectiveness and efficiency. Extensive experiments demonstrate that MasRouter is (1) high-performing, achieving a $1.8\%\sim8.2\%$ improvement over the state-of-the-art method on MBPP; (2) economical, reducing overhead by up to $52.07\%$ compared to SOTA methods on HumanEval; and (3) plug-and-play, seamlessly integrating with mainstream MAS frameworks, reducing overhead by $17.21\%\sim28.17\%$ via customized routing. The code is available at https://github.com/yanweiyue/masrouter.
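To make the cascaded routing idea concrete, below is a minimal sketch of a controller that first picks a collaboration mode, then allocates agent roles conditioned on that mode, then assigns an LLM to each role. All module names, sizes, and the greedy argmax/top-k selections are illustrative assumptions, not the paper's actual controller network (see the linked repository for that).

```python
# Hypothetical cascaded routing controller: mode -> roles -> LLM per role.
import torch
import torch.nn as nn

class CascadedRouter(nn.Module):
    def __init__(self, d_query, n_modes, n_roles, n_llms):
        super().__init__()
        self.mode_head = nn.Linear(d_query, n_modes)
        self.mode_emb = nn.Embedding(n_modes, d_query)
        self.role_head = nn.Linear(d_query, n_roles)
        self.role_emb = nn.Embedding(n_roles, d_query)
        self.llm_head = nn.Linear(d_query, n_llms)

    def forward(self, q):                      # q: (d_query,) query embedding
        mode = self.mode_head(q).argmax()      # stage 1: collaboration mode
        h = q + self.mode_emb(mode)            # condition on the chosen mode
        roles = self.role_head(h).topk(k=2).indices  # stage 2: pick 2 roles
        llms = []
        for r in roles:                        # stage 3: route each role to an LLM
            llms.append(self.llm_head(h + self.role_emb(r)).argmax().item())
        return mode.item(), roles.tolist(), llms

router = CascadedRouter(d_query=32, n_modes=3, n_roles=5, n_llms=4)
print(router(torch.randn(32)))
```

A trained controller would replace the greedy argmax with a differentiable or sampled selection during learning; this sketch only shows the cascade's data flow.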
Related papers
- Scaling Autonomous Agents via Automatic Reward Modeling And Planning [52.39395405893965]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of tasks.
However, they still struggle with problems requiring multi-step decision-making and environmental feedback.
We propose a framework that can automatically learn a reward model from the environment without human annotations.
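A reward model learned without human annotations is typically fit on automatically labeled trajectory pairs. Below is a minimal sketch of that training step under a pairwise Bradley-Terry loss; the toy encoder, random data, and loss choice are assumptions for illustration, not this paper's exact recipe.

```python
# Hedged sketch: fit a reward model on (good, bad) trajectory pairs that an
# automated exploration loop would label; random embeddings stand in here.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, traj_emb):               # traj_emb: (batch, d)
        return self.score(traj_emb).squeeze(-1)

rm = RewardModel(d=32)
opt = torch.optim.Adam(rm.parameters(), lr=1e-3)
good, bad = torch.randn(8, 32), torch.randn(8, 32)
# Bradley-Terry pairwise loss: prefer the successful trajectory
loss = -torch.nn.functional.logsigmoid(rm(good) - rm(bad)).mean()
opt.zero_grad(); loss.backward(); opt.step()
```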
arXiv Detail & Related papers (2025-02-17T18:49:25Z)
- Universal Model Routing for Efficient LLM Inference [72.65083061619752]
We consider the problem of dynamic routing, where new, previously unobserved LLMs are available at test time.
We propose a new approach to this problem that relies on representing each LLM as a feature vector, derived based on predictions on a set of representative prompts.
We prove that these strategies are estimates of a theoretically optimal routing rule, and provide an excess risk bound to quantify their errors.
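A minimal sketch of the feature-vector idea follows: characterize each LLM by its error profile on representative prompts, then score candidates for a new query from the profiles of its nearest prompts. The kNN estimate and the cost penalty are illustrative assumptions, not the paper's routing rule.

```python
# Hedged sketch: route to the LLM whose error profile on similar
# representative prompts predicts the best quality/cost trade-off.
import numpy as np

rng = np.random.default_rng(0)
P = rng.normal(size=(100, 16))          # embeddings of representative prompts
errors = rng.uniform(size=(100, 3))     # per-prompt error of 3 candidate LLMs
costs = np.array([1.0, 3.0, 9.0])       # relative price of each LLM

def route(query_emb, lam=0.1, k=10):
    sims = P @ query_emb                # similarity to representative prompts
    nearest = np.argsort(-sims)[:k]     # k most similar prompts
    est_error = errors[nearest].mean(axis=0)   # estimated per-LLM error
    return int(np.argmin(est_error + lam * costs))

print(route(rng.normal(size=16)))
```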
arXiv Detail & Related papers (2025-02-12T20:30:28Z)
- Confident or Seek Stronger: Exploring Uncertainty-Based On-device LLM Routing From Benchmarking to Generalization [61.02719787737867]
Large language models (LLMs) are increasingly deployed and democratized on edge devices.
One promising solution is uncertainty-based SLM routing, which offloads high-stakes queries to stronger LLMs when the SLM produces low-confidence responses.
We conduct a comprehensive investigation into benchmarking and generalization of uncertainty-driven routing strategies from SLMs to LLMs over 1500+ settings.
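The offloading decision itself is simple to sketch: answer on-device first, and escalate only when a confidence proxy falls below a threshold. The function names and the proxy (mean token log-probability) below are hypothetical stand-ins for real model calls.

```python
# Hedged sketch of uncertainty-based SLM-to-LLM routing.
import math

def slm_generate(query):
    # stand-in: return a draft answer and its mean token log-probability
    return "draft answer", -1.7

def llm_generate(query):
    return "stronger answer"

def answer(query, conf_threshold=0.25):
    draft, mean_logprob = slm_generate(query)
    confidence = math.exp(mean_logprob)   # one simple uncertainty proxy
    if confidence >= conf_threshold:
        return draft                      # keep the cheap on-device answer
    return llm_generate(query)            # offload the low-confidence query

print(answer("Is this contract clause enforceable?"))
```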
arXiv Detail & Related papers (2025-02-06T18:59:11Z)
- Rerouting LLM Routers [27.16232746301828]
LLM routers balance quality and cost of generation by classifying queries and routing them to a cheaper or more expensive LLM depending on their complexity.
In this paper, we investigate routers' adversarial robustness.
arXiv Detail & Related papers (2025-01-03T14:03:14Z)
- $\gamma$-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models [87.43596173378913]
We propose an innovative strategy for existing MLLMs called $\gamma$-MoD.
In $\gamma$-MoD, a novel metric, ARank, is proposed to guide the deployment of MoDs in the MLLM.
Based on ARank, we propose two novel designs to maximize the computational sparsity of the MLLM.
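For intuition, here is a minimal mixture-of-depths layer: a learned router scores tokens, and only the top fraction pass through the expensive block while the rest ride the residual path. The capacity rule and per-example loop are simplifications, not $\gamma$-MoD's actual design.

```python
# Hedged sketch of a mixture-of-depths layer with a token-level router.
import torch
import torch.nn as nn

class MoDLayer(nn.Module):
    def __init__(self, d, capacity=0.5):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
        self.router = nn.Linear(d, 1)
        self.capacity = capacity

    def forward(self, x):                     # x: (batch, seq, d)
        scores = self.router(x).squeeze(-1)   # (batch, seq) routing scores
        k = max(1, int(self.capacity * x.size(1)))
        idx = scores.topk(k, dim=1).indices   # top-k tokens get computed
        out = x.clone()                       # others skip via the residual
        for b in range(x.size(0)):
            sel = x[b, idx[b]].unsqueeze(0)
            out[b, idx[b]] = self.block(sel).squeeze(0)
        return out

layer = MoDLayer(d=32)
print(layer(torch.randn(2, 10, 32)).shape)
```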
arXiv Detail & Related papers (2024-10-17T17:59:53Z)
- RouterDC: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models [24.113223576205932]
We show that a query-based Router trained by Dual Contrastive learning (RouterDC) is effective in assembling large language models (LLMs).
RouterDC largely outperforms individual top-performing LLMs as well as existing routing methods on both in-distribution and out-of-distribution tasks.
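A minimal sketch of one contrastive training signal follows: the query embedding is pulled toward the embedding of an LLM that answered it well and pushed from LLMs that failed, InfoNCE-style. The loss form, temperature, and toy encoder are assumptions; the "dual" in RouterDC implies a second contrastive term not shown here.

```python
# Hedged sketch: contrastive query-LLM alignment for a learned router.
import torch
import torch.nn.functional as F

n_llms, d = 4, 32
llm_emb = torch.randn(n_llms, d, requires_grad=True)   # learnable LLM embeddings
encoder = torch.nn.Linear(128, d)                      # toy query encoder
opt = torch.optim.Adam([llm_emb, *encoder.parameters()], lr=1e-3)

query = torch.randn(128)
pos, neg = 2, torch.tensor([0, 3])     # LLM 2 answered well; 0 and 3 failed
q = F.normalize(encoder(query), dim=0)
sims = F.normalize(llm_emb, dim=1) @ q # cosine similarity to each LLM
logits = torch.cat([sims[pos].view(1), sims[neg]]) / 0.07
loss = F.cross_entropy(logits.unsqueeze(0), torch.tensor([0]))
opt.zero_grad(); loss.backward(); opt.step()
print(int(sims.argmax()))              # at inference: route to the best match
```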
arXiv Detail & Related papers (2024-09-30T02:31:40Z)
- TensorOpera Router: A Multi-Model Router for Efficient LLM Inference [27.2803289964386]
TO-Router is a non-monolithic LLM querying system.
It seamlessly integrates various LLM experts into a single query interface.
It dynamically routes incoming queries to the most performant expert based on the query's requirements.
arXiv Detail & Related papers (2024-08-22T11:57:07Z)
- RouterBench: A Benchmark for Multi-LLM Routing System [25.515453832224804]
No single model can optimally address all tasks and applications, particularly when balancing performance with cost.
This limitation has led to the development of LLM routing systems, which combine the strengths of various models to overcome the constraints of individual LLMs.
We present RouterBench, a novel evaluation framework designed to systematically assess the efficacy of LLM routing systems.
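One core computation in such a benchmark is placing routers on the cost/quality plane against the Pareto frontier of the individual models. A small sketch with invented numbers:

```python
# Hedged sketch: compare a routing system against the (cost, quality)
# Pareto frontier of individual models. All figures here are made up.
def pareto(points):
    # keep points not dominated by a cheaper model with higher quality
    front, best_q = [], float("-inf")
    for cost, q in sorted(points):      # ascending cost
        if q > best_q:
            front.append((cost, q))
            best_q = q
    return front

models = [(1.0, 0.61), (3.0, 0.72), (9.0, 0.86)]
router = (2.2, 0.78)                    # a hypothetical routing system
print(pareto(models + [router]))        # router lands on the frontier
```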
arXiv Detail & Related papers (2024-03-18T17:59:04Z)
- ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL [80.10358123795946]
We develop a framework for building multi-turn RL algorithms for fine-tuning large language models.
Our framework adopts a hierarchical RL approach and runs two RL algorithms in parallel.
Empirically, we find that ArCHer significantly improves efficiency and performance on agent tasks.
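A toy sketch of the hierarchical split: an utterance-level critic is trained with off-policy TD on turn transitions, and its advantage drives a token-level policy-gradient update within the turn. All networks, shapes, and numbers below are illustrative, not ArCHer's implementation.

```python
# Hedged sketch of two RL levels running together: turn-level TD critic,
# token-level policy gradient using the critic's advantage.
import torch
import torch.nn as nn

d = 16
critic = nn.Linear(d, 1)               # utterance-level value function
policy_logits = nn.Linear(d, 50)       # token-level policy over 50 tokens
opt = torch.optim.Adam([*critic.parameters(), *policy_logits.parameters()], lr=1e-3)

s, s_next = torch.randn(d), torch.randn(d)   # states before/after one turn
reward, gamma = 1.0, 0.99
# high level: TD(0) update on the turn transition
td_target = reward + gamma * critic(s_next).detach()
critic_loss = (critic(s) - td_target).pow(2).mean()
# low level: policy gradient on this turn's tokens, weighted by advantage
tokens = torch.randint(0, 50, (5,))
logp = torch.log_softmax(policy_logits(s), dim=-1)[tokens].sum()
advantage = (td_target - critic(s)).detach()
actor_loss = -(advantage * logp)
opt.zero_grad(); (critic_loss + actor_loss).backward(); opt.step()
```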
arXiv Detail & Related papers (2024-02-29T18:45:56Z)
- A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration [55.35849138235116]
We propose automatically selecting a team of agents from candidates to collaborate in a dynamic communication structure toward different tasks and domains.
Specifically, we build a framework named Dynamic LLM-Powered Agent Network (DyLAN) for LLM-powered agent collaboration.
We demonstrate that DyLAN outperforms strong baselines in code generation, decision-making, general reasoning, and arithmetic reasoning tasks with moderate computational cost.
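The team-selection step can be sketched as scoring a pool of candidate agents for the task at hand and keeping the top contributors; the scores below are invented stand-ins for DyLAN's agent importance measure.

```python
# Hedged sketch: pick a task-specific team from candidate agents by score.
candidates = {
    "planner": 0.81, "coder": 0.92, "tester": 0.66,
    "critic": 0.74, "mathematician": 0.31,
}

def select_team(scores, k=3):
    # keep the k agents the importance measure ranks highest for this task
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(select_team(candidates))   # ['coder', 'planner', 'critic'] here
```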
arXiv Detail & Related papers (2023-10-03T16:05:48Z)