Federate the Router: Learning Language Model Routers with Sparse and Decentralized Evaluations
- URL: http://arxiv.org/abs/2601.22318v1
- Date: Thu, 29 Jan 2026 21:00:29 GMT
- Title: Federate the Router: Learning Language Model Routers with Sparse and Decentralized Evaluations
- Authors: Baris Askin, Shivam Patel, Anupam Nayak, Andrea Vigano, Jiin Woo, Gauri Joshi, Carlee Joe-Wong
- Abstract summary: Large language models (LLMs) are increasingly accessed as remotely hosted services by edge and enterprise clients. Existing router approaches assume access to centralized query-model evaluation data. We introduce the first federated framework for LLM routing, enabling clients to learn a shared routing policy from local offline query-model evaluation data.
- Score: 26.24858921328445
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) are increasingly accessed as remotely hosted services by edge and enterprise clients that cannot run frontier models locally. Since models vary widely in capability and price, routing queries to models that balance quality and inference cost is essential. Existing router approaches assume access to centralized query-model evaluation data. However, these data are often fragmented across clients, such as end users and organizations, and are privacy-sensitive, which makes centralizing data infeasible. Additionally, per-client router training is ineffective since local evaluation data is limited and covers only a restricted query distribution and a biased subset of model evaluations. We introduce the first federated framework for LLM routing, enabling clients to learn a shared routing policy from local offline query-model evaluation data. Our framework supports both a parametric multilayer perceptron (MLP) router and a nonparametric K-means router under heterogeneous client query distributions and non-uniform model coverage. Across two benchmarks, federated collaboration improves the accuracy-cost frontier over client-local routers, both via increased effective model coverage and better query generalization. Our theoretical results also validate that federated training reduces routing suboptimality.
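To make the setup concrete, below is a minimal sketch of how federated training of the parametric (MLP) router variant could look: each client fits the shared router on its local, sparsely observed (query, model, score) triples, and the server averages parameters FedAvg-style. All names, shapes, and hyperparameters are illustrative assumptions, not the authors' implementation; the K-means variant is omitted.

```python
# A minimal sketch (not the paper's code) of federated MLP-router training.
import copy
import torch
import torch.nn as nn

NUM_MODELS = 4   # size of the candidate LLM pool (assumed)
EMB_DIM = 64     # query-embedding dimension (assumed)

class MLPRouter(nn.Module):
    """Maps a query embedding to a predicted quality score per candidate LLM."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(EMB_DIM, 128), nn.ReLU(), nn.Linear(128, NUM_MODELS))

    def forward(self, x):
        return self.net(x)

def local_update(global_model, queries, model_ids, scores, steps=5, lr=1e-2):
    """One client's offline pass: only observed (query, model) pairs enter the
    loss, mirroring sparse, non-uniform model coverage on each client."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        pred = model(queries).gather(1, model_ids.unsqueeze(1)).squeeze(1)
        loss = nn.functional.mse_loss(pred, scores)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model.state_dict()

def fedavg(client_states):
    """Server-side parameter averaging; raw evaluation data never leaves clients."""
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = torch.stack([s[key] for s in client_states]).mean(dim=0)
    return avg

# One communication round over synthetic data for three clients.
router = MLPRouter()
states = []
for _ in range(3):
    q = torch.randn(32, EMB_DIM)                # query embeddings
    m = torch.randint(0, NUM_MODELS, (32,))     # which LLM was evaluated locally
    s = torch.rand(32)                          # observed quality scores
    states.append(local_update(router, q, m, s))
router.load_state_dict(fedavg(states))
```

At inference, a client would embed a query, score all candidate LLMs with the shared router, and pick the model with the best predicted quality-cost trade-off.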
Related papers
- Towards Fair and Comprehensive Evaluation of Routers in Collaborative LLM Systems [46.00150374727385]
Large language models (LLMs) have achieved success, but cost and privacy constraints necessitate deploying smaller models locally. We propose RouterXBench, a principled evaluation framework with three dimensions: router ability, scenario alignment, and cross-domain robustness. We also introduce ProbeDirichlet, a lightweight router that aggregates cross-layer hidden states via learnable Dirichlet weights with probabilistic training (a toy layer-weighting sketch appears after this list).
arXiv Detail & Related papers (2026-02-12T12:28:27Z)
- Learning to Route LLMs from Bandit Feedback: One Policy, Many Trade-offs [69.2486294522259]
BaRP (Bandit-feedback Routing with Preferences) trains under the same partial-feedback restriction it faces at deployment. Framed as a contextual bandit over prompt features and a user preference vector, the method simulates an online feedback setting during training and adapts its routing decisions to each new prompt (see the contextual-bandit sketch after this list).
arXiv Detail & Related papers (2025-10-08T18:24:59Z)
- Meta-Router: Bridging Gold-standard and Preference-based Evaluations in Large Language Model Routing [15.724480880994259]
A large language model (LLM) router selects the most appropriate model from a pool of candidates for each query. Preference-based data, collected via crowdsourcing or LLM-as-a-judge systems, are cheaper and more scalable, yet often biased in reflecting the true quality of responses. We develop an integrative causal router-training framework that corrects preference-data bias, addresses imbalances between the two data sources, and improves routing robustness and efficiency.
arXiv Detail & Related papers (2025-09-29T21:44:00Z)
- FEDEXCHANGE: Bridging the Domain Gap in Federated Object Detection for Free [58.34974215853841]
Federated Object Detection (FOD) enables clients to collaboratively train a global object detection model without sharing their local data from diverse domains. Existing FOD methods often overlook the hardware constraints of edge devices and introduce local training regularizations that incur high computational costs. We propose FEDEXCHANGE, a novel FOD framework that bridges domain gaps without introducing additional local computational overhead.
arXiv Detail & Related papers (2025-09-01T17:39:25Z)
- Don't Reach for the Stars: Rethinking Topology for Resilient Federated Learning [1.3270838622986498]
Federated learning (FL) enables collaborative model training across distributed clients while preserving privacy by keeping data local. Traditional FL approaches rely on a centralized, star-shaped topology in which a central server aggregates model updates from clients. We propose a decentralized, peer-to-peer (P2P) FL framework that enables each client to identify and aggregate a personalized set of trustworthy and beneficial updates (a toy trust-weighted aggregation sketch appears after this list).
arXiv Detail & Related papers (2025-08-07T10:10:37Z)
- How Robust Are Router-LLMs? Analysis of the Fragility of LLM Routing Capabilities [62.474732677086855]
Large language model (LLM) routing has emerged as a crucial strategy for balancing computational costs with performance. We propose the DSC benchmark (Diverse, Simple, and Categorized), an evaluation framework that categorizes router performance across a broad spectrum of query types.
arXiv Detail & Related papers (2025-03-20T19:52:30Z)
- FedRand: Enhancing Privacy in Federated Learning with Randomized LoRA Subparameter Updates [58.18162789618869]
Federated Learning (FL) is a widely used framework for training models in a decentralized manner. We propose the FedRand framework, which avoids disclosing the full set of client parameters. We empirically validate that FedRand improves robustness against membership inference attacks (MIAs) compared to relevant baselines (a minimal randomized-disclosure sketch appears after this list).
arXiv Detail & Related papers (2025-03-10T11:55:50Z)
- Mobilizing Personalized Federated Learning in Infrastructure-Less and Heterogeneous Environments via Random Walk Stochastic ADMM [0.14597673707346284]
This paper explores the challenges of implementing Federated Learning (FL) in practical scenarios featuring isolated nodes with data heterogeneity.
To overcome these challenges, we propose a novel mobilizing personalized FL approach, which aims to facilitate mobility and resilience.
We develop a novel optimization algorithm called Random Walk Stochastic Alternating Direction Method of Multipliers (RWSADMM); a simplified random-walk update sketch appears after this list.
arXiv Detail & Related papers (2023-04-25T03:00:18Z)
- Optimizing Server-side Aggregation For Robust Federated Learning via Subspace Training [80.03567604524268]
Non-IID data distribution across clients and poisoning attacks are two main challenges in real-world federated learning systems.
We propose SmartFL, a generic approach that optimizes the server-side aggregation process (a least-squares subspace-aggregation sketch appears after this list).
We provide theoretical analyses of the convergence and generalization capacity for SmartFL.
arXiv Detail & Related papers (2022-11-10T13:20:56Z)
- RingFed: Reducing Communication Costs in Federated Learning on Non-IID Data [3.7416826310878024]
Federated learning is used to protect the privacy of each client by exchanging model parameters rather than raw data.
This article proposes RingFed, a novel framework that reduces communication overhead during federated training (a toy ring-style pre-aggregation sketch appears after this list).
Experiments on two different public datasets show that RingFed has fast convergence, high model accuracy, and low communication cost.
arXiv Detail & Related papers (2021-07-19T13:43:10Z)
- A Bayesian Federated Learning Framework with Online Laplace Approximation [144.7345013348257]
Federated learning allows multiple clients to collaboratively learn a globally shared model.
We propose a novel FL framework that uses online Laplace approximation to approximate posteriors on both the client and server side (a generic Laplace-approximation formula is sketched after this list).
We achieve state-of-the-art results on several benchmarks, clearly demonstrating the advantages of the proposed method.
arXiv Detail & Related papers (2021-02-03T08:36:58Z)
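For the ProbeDirichlet router above, the summary only states that cross-layer hidden states are aggregated via learnable Dirichlet weights. Below is a rough sketch of the general pattern: learnable simplex weights over layer states feeding a routing head. The softmax parameterization, shapes, and names are our assumptions, not the paper's construction.

```python
# A toy layer-weighted probe (assumed shapes; not ProbeDirichlet itself).
import torch
import torch.nn as nn

class LayerWeightedProbe(nn.Module):
    """Pools per-layer hidden states with learnable simplex weights,
    then scores candidate models with a linear head (all sizes assumed)."""
    def __init__(self, num_layers=12, hidden=768, num_models=4):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_layers))
        self.head = nn.Linear(hidden, num_models)

    def forward(self, layer_states):               # (layers, batch, hidden)
        w = torch.softmax(self.logits, dim=0)      # nonnegative, sums to 1
        pooled = (w[:, None, None] * layer_states).sum(dim=0)
        return self.head(pooled)                   # (batch, num_models)

probe = LayerWeightedProbe()
scores = probe(torch.randn(12, 2, 768))            # two queries -> (2, 4)
```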
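The BaRP summary frames routing as a contextual bandit over prompt features and a preference vector, with only the chosen model's reward observed. A toy epsilon-greedy linear bandit in that spirit follows; the reward model, feature sizes, and update rule are all illustrative assumptions rather than BaRP's algorithm.

```python
# A toy contextual-bandit router under partial feedback (not BaRP's code).
import numpy as np

rng = np.random.default_rng(0)
NUM_MODELS, DIM = 3, 8            # illustrative pool and feature sizes
W = np.zeros((NUM_MODELS, DIM))   # one linear reward model per arm (LLM)

def route(context, eps=0.1):
    """Epsilon-greedy selection over predicted rewards."""
    if rng.random() < eps:
        return int(rng.integers(NUM_MODELS))
    return int(np.argmax(W @ context))

def update(arm, context, reward, lr=0.05):
    """SGD step on squared error, for the chosen arm only (partial feedback)."""
    err = W[arm] @ context - reward
    W[arm] -= lr * err * context

for _ in range(1000):
    prompt_feats = rng.normal(size=DIM - 2)
    preference = rng.random(2)               # e.g. (quality weight, cost weight)
    ctx = np.concatenate([prompt_feats, preference])
    arm = route(ctx)
    reward = rng.random()                    # stand-in for the observed reward
    update(arm, ctx, reward)
```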
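For the peer-to-peer FL paper, one toy version of "aggregate a personalized set of trustworthy updates" is to keep only neighbor updates whose direction agrees with one's own. The cosine-similarity trust rule below is our stand-in, not the paper's criterion.

```python
# A toy trust-weighted P2P aggregation rule (our assumption of the pattern).
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def personalized_aggregate(own, neighbor_updates, trust_threshold=0.0):
    """Average own update with neighbor updates whose direction agrees with it."""
    trusted = [u for u in neighbor_updates if cosine(own, u) > trust_threshold]
    return np.mean([own, *trusted], axis=0)

rng = np.random.default_rng(2)
own = rng.normal(size=16)
neighbor_updates = [rng.normal(size=16) for _ in range(4)]
new_update = personalized_aggregate(own, neighbor_updates)
```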
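For FedRand, the summary says clients avoid disclosing their full parameter set. A minimal sketch of randomized per-round disclosure is below; the split rule and parameter names are assumptions, and FedRand specifically operates on LoRA subparameters rather than generic dictionaries.

```python
# A minimal randomized-disclosure sketch (assumed mechanics, not FedRand).
import numpy as np

rng = np.random.default_rng(1)

def randomized_disclosure(params, keep_frac=0.5):
    """Randomly split parameter names into a shared and a private subset."""
    names = list(params)
    rng.shuffle(names)
    k = int(len(names) * keep_frac)
    return ({n: params[n] for n in names[:k]},   # uploaded this round
            {n: params[n] for n in names[k:]})   # kept local, resampled next round

client_params = {f"adapter_{i}": rng.normal(size=(4, 4)) for i in range(8)}
shared, private = randomized_disclosure(client_params)
# Only `shared` leaves the client, so the server never sees all parameters at once.
```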
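For RWSADMM, here is a heavily simplified picture of random-walk decentralized optimization: a single model hops between neighboring clients and is updated locally at each stop. The plain gradient step and toy objective below stand in for the paper's stochastic ADMM update.

```python
# A simplified random-walk training loop (not RWSADMM's ADMM updates).
import numpy as np

rng = np.random.default_rng(3)
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}  # assumed client graph

def local_gradient(client, theta):
    """Toy local objective (theta - client)^2 / 2, so the gradient is theta - client."""
    return theta - client

theta, node = 5.0, 0
for _ in range(200):
    theta -= 0.1 * local_gradient(node, theta)  # update at the visited client
    node = int(rng.choice(neighbors[node]))     # hop to a random neighbor
```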
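For SmartFL, one reading of "optimizing server-side aggregation via subspace training" is that the server chooses aggregation weights within the subspace spanned by client updates, fit against a small server-side proxy objective. The least-squares sketch below illustrates that idea under our assumptions; SmartFL's actual objective and data differ.

```python
# A subspace-aggregation sketch (our reading, not SmartFL's implementation).
import numpy as np

rng = np.random.default_rng(4)
d, k = 8, 5
updates = rng.normal(size=(k, d))                      # one flat update per client
X, y = rng.normal(size=(16, d)), rng.normal(size=16)   # tiny server proxy set

# Restrict the aggregate to the k-dimensional span of client updates and pick
# weights w minimizing the proxy loss || X (updates^T w) - y ||^2.
A = X @ updates.T                                      # (16, k) subspace design
w, *_ = np.linalg.lstsq(A, y, rcond=None)
global_update = updates.T @ w                          # back to parameter space
```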
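For RingFed, a toy illustration of ring-style pre-aggregation: each client adds its update to a running sum and forwards it to its ring neighbor, so the server receives one aggregated message per round instead of one per client. This is our reading of the general pattern, not RingFed's exact protocol.

```python
# Ring-style pre-aggregation of client updates (a generic sketch).
import numpy as np

client_updates = [np.random.default_rng(i).normal(size=10) for i in range(5)]

acc = np.zeros(10)
for update in client_updates:       # the token travels once around the ring
    acc += update                   # each client adds its update and forwards
global_update = acc / len(client_updates)  # server receives a single message
```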
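For the Bayesian FL paper, the standard Laplace construction it builds on can be written as follows; the paper's online variant and server-side use may differ in detail.

```latex
% Generic Laplace approximation of client k's local posterior around its
% MAP estimate \theta_k^* (a sketch; the paper's online variant may differ).
p(\theta \mid \mathcal{D}_k) \approx \mathcal{N}\big(\theta;\ \theta_k^*,\ H_k^{-1}\big),
\qquad
H_k = -\nabla_\theta^2 \log p(\theta \mid \mathcal{D}_k)\big|_{\theta = \theta_k^*}

% Precision-weighted product of K Gaussian client posteriors on the server:
H = \sum_{k=1}^{K} H_k,
\qquad
\theta^* = H^{-1} \sum_{k=1}^{K} H_k\, \theta_k^*
```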