CONCUR: A Framework for Continual Constrained and Unconstrained Routing
- URL: http://arxiv.org/abs/2512.09386v1
- Date: Wed, 10 Dec 2025 07:30:13 GMT
- Authors: Peter Baile Chen, Weiyue Li, Dan Roth, Michael Cafarella, Samuel Madden, Jacob Andreas
- Abstract summary: AI tasks differ in complexity and are best addressed with different computation strategies. Most prior methods build the routing framework by training a single model across all strategies. We propose CONCUR, a continual routing framework that supports both constrained and unconstrained routing.
- Score: 79.85419373937765
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: AI tasks differ in complexity and are best addressed with different computation strategies (e.g., combinations of models and decoding methods). Hence, an effective routing system that maps tasks to the appropriate strategies is crucial. Most prior methods build the routing framework by training a single model across all strategies, which demands full retraining whenever new strategies appear and leads to high overhead. Attempts at such continual routing, however, often face difficulties with generalization. Prior models also typically use a single input representation, limiting their ability to capture the full complexity of the routing problem and leading to sub-optimal routing decisions. To address these gaps, we propose CONCUR, a continual routing framework that supports both constrained and unconstrained routing (i.e., routing with or without a budget). Our modular design trains a separate predictor model for each strategy, enabling seamless incorporation of new strategies with low additional training cost. Our predictors also leverage multiple representations of both tasks and computation strategies to better capture overall problem complexity. Experiments on both in-distribution and out-of-distribution, knowledge- and reasoning-intensive tasks show that our method outperforms the best single strategy and strong existing routing techniques with higher end-to-end accuracy and lower inference cost in both continual and non-continual settings, while also reducing training cost in the continual setting.
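The modular design described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Router` class, the `Predictor` signature, and the per-task budget are all assumptions made for illustration, and the paper's predictors and budget formulation may differ. Each strategy has its own predictor that returns a predicted accuracy and cost for a task, so a new strategy is added by registering one more predictor without retraining the others.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

# Hypothetical interface: a predictor maps a task description to
# (predicted accuracy, predicted cost) for one computation strategy.
Predictor = Callable[[str], Tuple[float, float]]


@dataclass
class Router:
    # One independent predictor per strategy (assumed structure).
    predictors: Dict[str, Predictor] = field(default_factory=dict)

    def add_strategy(self, name: str, predictor: Predictor) -> None:
        # Continual setting: register a new strategy without touching
        # the predictors that already exist.
        self.predictors[name] = predictor

    def route(self, task: str, budget: Optional[float] = None) -> Optional[str]:
        # Unconstrained routing (budget is None): pick the strategy with the
        # highest predicted accuracy, breaking ties by lower predicted cost.
        # Constrained routing: same rule, restricted to strategies whose
        # predicted cost fits the (assumed per-task) budget.
        best_name: Optional[str] = None
        best_acc = float("-inf")
        best_cost = float("inf")
        for name, predict in self.predictors.items():
            acc, cost = predict(task)
            if budget is not None and cost > budget:
                continue
            if acc > best_acc or (acc == best_acc and cost < best_cost):
                best_name, best_acc, best_cost = name, acc, cost
        return best_name  # None if no strategy fits the budget
```

With two registered strategies, for example a cheap one predicted at accuracy 0.6 and cost 1.0 and an expensive one at accuracy 0.9 and cost 5.0, `route(task)` would select the expensive strategy, while `route(task, budget=2.0)` would fall back to the cheap one.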
Related papers
- Trade-offs in Ensembling, Merging and Routing Among Parameter-Efficient Experts [56.02203242609604]
Large language models (LLMs) fine-tuned with lightweight adapters achieve strong performance across diverse tasks. Fusing independently trained models with different strengths has shown promise for multi-task learning through three main strategies. We empirically evaluate their trade-offs, addressing two key questions: What are the advantages of going beyond uniform ensembling or merging? And does the flexibility of routing justify its complexity?
arXiv Detail & Related papers (2026-03-03T21:44:11Z) - Budget-Aware Agentic Routing via Boundary-Guided Training [24.0709108941881]
Budget-Aware Agentic Routing selects between a cheap and an expensive model at each step to optimize the cost-success frontier. Boundary-Guided Training builds a difficulty taxonomy to anchor learning under sparse rewards. Experiment results show that our method improves the efficiency frontier, matching strong routing baselines at substantially lower cost.
arXiv Detail & Related papers (2026-02-04T07:39:27Z) - xRouter: Training Cost-Aware LLMs Orchestration System via Reinforcement Learning [104.63494870852894]
We present xRouter, a tool-calling-based routing system in which a learned router can either answer directly or invoke one or more external models. Our implementation encompasses the full reinforcement learning framework, including reward and cost accounting. Across diverse benchmarks, xRouter achieves strong cost-performance trade-offs.
arXiv Detail & Related papers (2025-10-09T16:52:01Z) - Plan before Solving: Problem-Aware Strategy Routing for Mathematical Reasoning with LLMs [49.995906301946]
Existing methods usually leverage a fixed strategy to guide Large Language Models (LLMs) to perform mathematical reasoning. Our analysis reveals that the single strategy cannot adapt to problem-specific requirements and thus overlooks the trade-off between effectiveness and efficiency. We propose Planning and Routing through Instance-Specific Modeling (PRISM), a novel framework that decouples mathematical reasoning into two stages: strategy planning and targeted execution.
arXiv Detail & Related papers (2025-09-29T07:22:41Z) - Route to Reason: Adaptive Routing for LLM and Reasoning Strategy Selection [7.045509749924679]
Route-To-Reason (RTR) is a novel unified routing framework that dynamically allocates both LMs and reasoning strategies according to task difficulty under budget constraints. RTR learns compressed representations of both expert models and reasoning strategies, enabling their joint and adaptive selection at inference time.
arXiv Detail & Related papers (2025-05-26T02:53:17Z) - Rethinking Predictive Modeling for LLM Routing: When Simple kNN Beats Complex Learned Routers [3.090041654375235]
We show that a well-tuned k-Nearest Neighbors (kNN) approach outperforms state-of-the-art learned routers across diverse tasks. Our findings reveal that the locality properties of model performance in embedding space enable simple non-parametric methods to achieve strong routing decisions.
arXiv Detail & Related papers (2025-05-19T01:33:41Z) - Accelerating Vehicle Routing via AI-Initialized Genetic Algorithms [53.75036695728983]
Vehicle Routing Problems (VRP) are a fundamental NP-hard challenge in Evolutionary optimization. We introduce an optimization framework where a reinforcement learning agent is trained on prior instances and quickly generates initial solutions. This framework consistently outperforms current state-of-the-art solvers across various time budgets.
arXiv Detail & Related papers (2025-04-08T15:21:01Z) - A Unified Approach to Routing and Cascading for LLMs [5.653106385738822]
Large language models (LLMs) embedded in various agentic systems have increased the potential of model selection strategies to improve the cost-performance tradeoff. Existing strategies involve either routing, where a single model is chosen per query, or cascading, which sequentially runs increasingly larger models until a satisfactory answer is found. We derive a novel optimal strategy for cascading and prove the optimality of an existing routing strategy. We propose cascade routing, a unified framework that integrates routing and cascading into a theoretically optimal strategy.
arXiv Detail & Related papers (2024-10-14T10:00:49Z) - An Efficient Learning-based Solver Comparable to Metaheuristics for the Capacitated Arc Routing Problem [67.92544792239086]
We introduce an NN-based solver to significantly narrow the gap with advanced metaheuristics.
First, we propose the direction-aware facilitating attention model (DaAM) to incorporate directionality into the embedding process.
Second, we design a supervised reinforcement learning scheme that involves supervised pre-training to establish a robust initial policy.
arXiv Detail & Related papers (2024-03-11T02:17:42Z) - Ranking Cost: Building An Efficient and Scalable Circuit Routing Planner with Evolution-Based Optimization [49.207538634692916]
We propose a new algorithm for circuit routing, named Ranking Cost, to form an efficient and trainable router.
In our method, we introduce a new set of variables called cost maps, which can help the A* router find proper paths.
Our algorithm is trained in an end-to-end manner and does not use any artificial data or human demonstration.
arXiv Detail & Related papers (2021-10-08T07:22:45Z) - Multi-Task Reinforcement Learning with Soft Modularization [25.724764855681137]
Multi-task learning is a very challenging problem in reinforcement learning.
We introduce an explicit modularization technique on policy representation to alleviate this optimization issue.
We show our method improves both sample efficiency and performance over strong baselines by a large margin.
arXiv Detail & Related papers (2020-03-30T17:47:04Z)