R2-Router: A New Paradigm for LLM Routing with Reasoning
- URL: http://arxiv.org/abs/2602.02823v1
- Date: Mon, 02 Feb 2026 21:23:51 GMT
- Title: R2-Router: A New Paradigm for LLM Routing with Reasoning
- Authors: Jiaqi Xue, Qian Lou, Jiarong Xing, Heng Huang
- Abstract summary: We show that R2-Router achieves state-of-the-art performance at 4-5x lower cost compared with existing routers. This work opens a new direction: routing as reasoning, where routers evolve from reactive selectors to deliberate reasoners.
- Score: 58.929817721828194
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As LLMs proliferate with diverse capabilities and costs, LLM routing has emerged as a way to learn to predict each LLM's quality and cost for a given query, then select the one with high quality and low cost. However, existing routers implicitly assume a single fixed quality and cost per LLM for each query, ignoring that the same LLM's quality varies with its output length. This causes routers to exclude powerful LLMs when their estimated cost exceeds the budget, missing the opportunity that these LLMs could still deliver high quality at reduced cost with shorter outputs. To address this, we introduce R2-Router, which treats output length budget as a controllable variable and jointly selects the best LLM and length budget, enforcing the budget via length-constrained instructions. This enables R2-Router to discover that a powerful LLM with constrained output can outperform a weaker LLM at comparable cost: efficient configurations invisible to prior methods. Together with the router framework, we construct R2-Bench, the first routing dataset capturing LLM behavior across diverse output length budgets. Experiments show that R2-Router achieves state-of-the-art performance at 4-5x lower cost compared with existing routers. This work opens a new direction: routing as reasoning, where routers evolve from reactive selectors to deliberate reasoners that explore which LLM to use and at what cost budget.
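The joint selection described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not state the selection rule, so the rule below (highest predicted quality among configurations that fit a cost cap) is an assumption, and the `Candidate` class, the `route` and `length_instruction` helpers, the model names, prices, and quality scores are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One (LLM, output length budget) configuration. All values are made up."""
    model: str
    length_budget: int       # maximum output tokens requested from the model
    quality: float           # hypothetical predicted quality in [0, 1]
    price_per_token: float   # hypothetical output price per token

    @property
    def cost(self) -> float:
        # Assumed cost model: the full length budget is spent.
        return self.length_budget * self.price_per_token


def route(candidates: Sequence[Candidate], max_cost: float) -> Optional[Candidate]:
    """Pick the highest-quality configuration that fits the cost cap.

    Ties on quality are broken in favor of the cheaper configuration.
    Returns None when no configuration is affordable.
    """
    feasible = [c for c in candidates if c.cost <= max_cost]
    if not feasible:
        return None
    return max(feasible, key=lambda c: (c.quality, -c.cost))


def length_instruction(c: Candidate) -> str:
    """Hypothetical length-constrained instruction appended to the query."""
    return f"Answer in at most {c.length_budget} tokens."


# Hypothetical predictions for a single query.
CANDIDATES = [
    Candidate("strong", 1000, 0.90, 0.01),   # too expensive under the cap below
    Candidate("strong", 200, 0.80, 0.01),    # strong model, short output
    Candidate("weak", 1000, 0.60, 0.002),    # weak model, long output
    Candidate("weak", 200, 0.50, 0.002),
]

choice = route(CANDIDATES, max_cost=2.5)
print(choice.model, choice.length_budget, length_instruction(choice))
```

With these made-up numbers, a router that only considers each model at its full length would have to fall back to the weak model, while the joint search finds that the strong model with a short output fits the same cap at higher predicted quality.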
Related papers
- Dr.LLM: Dynamic Layer Routing in LLMs [55.11953638340419]
Dr.LLM is a retrofittable framework that equips pretrained models with lightweight per-layer routers deciding to skip, execute, or repeat a block. On ARC (logic) and DART (math), Dr.LLM improves accuracy by up to +3.4%p while saving 5 layers per example on average.
arXiv Detail & Related papers (2025-10-14T17:51:26Z)
- xRouter: Training Cost-Aware LLMs Orchestration System via Reinforcement Learning [104.63494870852894]
We present xRouter, a tool-calling-based routing system in which a learned router can either answer directly or invoke one or more external models. Our implementation encompasses the full reinforcement learning framework, including reward and cost accounting. Across diverse benchmarks, xRouter achieves strong cost-performance trade-offs.
arXiv Detail & Related papers (2025-10-09T16:52:01Z)
- RadialRouter: Structured Representation for Efficient and Robust Large Language Models Routing [27.481573948464987]
RadialRouter is a novel framework for routing large language models. It uses a lightweight Transformer-based backbone with a radial structure, named RadialFormer, to articulate the query-LLMs relationship. It significantly outperforms existing routing methods by 9.2% and 5.8% in the Balance and Cost First scenarios.
arXiv Detail & Related papers (2025-06-04T12:16:41Z)
- RouterEval: A Comprehensive Benchmark for Routing LLMs to Explore Model-level Scaling Up in LLMs [45.93874913792025]
We show a novel model-level scaling up phenomenon in routing large language models (LLMs). This improvement can even surpass the performance of the best single model in the pool and many existing strong LLMs. We introduce RouterEval, a benchmark tailored for router research, which includes over 200,000,000 performance records for 12 popular LLM evaluations.
arXiv Detail & Related papers (2025-03-08T04:07:07Z)
- MasRouter: Learning to Route LLMs for Multi-Agent Systems [14.029698552632107]
Multi-agent systems powered by Large Language Models (LLMs) have been demonstrated to push the boundaries of LLM capabilities. Current routing methods effectively reduce overhead in single-agent scenarios by customizing LLM selection for each query. We first introduce the problem of Multi-Agent Routing System (MASR), which integrates all components of MAS into a unified routing framework. MasRouter is (1) high-performing, achieving a $1.8\%\sim8.2\%$ improvement over the state-of-the-art method on MBPP; (2) economical, reducing overhead by up to $52.07\%$ compared to S
arXiv Detail & Related papers (2025-02-16T14:00:59Z)
- Universal Model Routing for Efficient LLM Inference [69.86195589350264]
Model routing is a technique for reducing the inference cost of large language models (LLMs). We propose UniRoute, a new approach to the problem of dynamic routing, where new, previously unobserved LLMs are available at test time. We show that these are estimates of a theoretically optimal routing rule, and quantify their errors via an excess risk bound.
arXiv Detail & Related papers (2025-02-12T20:30:28Z)
- MixLLM: Dynamic Routing in Mixed Large Language Models [57.309520357563215]
Large Language Models (LLMs) have recently shown potential for artificial general intelligence; however, their usage is costly and comes with high response latency. We develop MixLLM, a dynamic contextual-bandit-based routing system for query-LLM assignment.
arXiv Detail & Related papers (2025-02-09T02:26:15Z)
- CARROT: A Cost Aware Rate Optimal Router [22.786863130994217]
We introduce CARROT, a Cost AwaRe Rate Optimal rouTer that selects a model based on estimates of the models' cost and performance. We empirically validate CARROT's performance against several alternative routers.
arXiv Detail & Related papers (2025-02-05T15:17:25Z)
- Rerouting LLM Routers [27.16232746301828]
LLM routers balance quality and cost of generation by classifying queries and routing them to a cheaper or more expensive LLM depending on their complexity. In this paper, we investigate routers' adversarial robustness.
arXiv Detail & Related papers (2025-01-03T14:03:14Z)