Related papers: Learning payoffs while routing in skill-based queues

Learning payoffs while routing in skill-based queues

URL: http://arxiv.org/abs/2412.10168v1
Date: Fri, 13 Dec 2024 14:33:50 GMT
Title: Learning payoffs while routing in skill-based queues
Authors: Sanne van Kempen, Jaron Sanders, Fiona Sloothaak, Maarten G. Wolf,
Abstract summary: We construct a machine learning algorithm that adaptively learns the payoff parameters while maximizing the total payoff.<n>We show that the algorithm is optimal up to logarithmic terms by deriving a regret lower bound.
Score: 0.4077787659104315
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Motivated by applications in service systems, we consider queueing systems where each customer must be handled by a server with the right skill set. We focus on optimizing the routing of customers to servers in order to maximize the total payoff of customer--server matches. In addition, customer--server dependent payoff parameters are assumed to be unknown a priori. We construct a machine learning algorithm that adaptively learns the payoff parameters while maximizing the total payoff and prove that it achieves polylogarithmic regret. Moreover, we show that the algorithm is asymptotically optimal up to logarithmic terms by deriving a regret lower bound. The algorithm leverages the basic feasible solutions of a static linear program as the action space. The regret analysis overcomes the complex interplay between queueing and learning by analyzing the convergence of the queue length process to its stationary behavior. We also demonstrate the performance of the algorithm numerically, and have included an experiment with time-varying parameters highlighting the potential of the algorithm in non-static environments.

Related papers

Queueing-Aware Optimization of Reasoning Tokens for Accuracy-Latency Trade-offs in LLM Servers [4.3400407844814985]
We consider a single large language model (LLM) server that serves a heterogeneous stream of queries belonging to $N$ distinct task types.<n>For each task type, the server allocates a fixed number of internal thinking tokens, which determines the computational effort devoted to that query.<n>We formulate a constrained optimization problem that maximizes a weighted average accuracy objective penalized by the mean system time.
arXiv Detail & Related papers (2026-01-15T10:47:11Z)
Demonstration of effective UCB-based routing in skill-based queues on real-world data [0.4077787659104315]
This paper is about optimally controlling skill-based queueing systems such as data centers, cloud computing networks, and a service systems.<n>By means of a case study using a real-world set data, we investigate the practical implementation of a recently developed reinforcement learning for optimal customer routing.
arXiv Detail & Related papers (2025-06-25T15:36:43Z)
Efficient Methods for Non-stationary Online Learning [67.3300478545554]
We present efficient methods for optimizing dynamic regret and adaptive regret, which reduce the number of projections per round from $mathcalO(log T)$ to $1$. Our technique hinges on the reduction mechanism developed in parameter-free online learning and requires non-trivial twists on non-stationary online methods.
arXiv Detail & Related papers (2023-09-16T07:30:12Z)
Oracle-Efficient Smoothed Online Learning for Piecewise Continuous Decision Making [73.48977854003697]
This work introduces a new notion of complexity, the generalized bracketing numbers, which marries constraints on the adversary to the size of the space. We then instantiate our bounds in several problems of interest, including online prediction and planning of piecewise continuous functions.
arXiv Detail & Related papers (2023-02-10T18:45:52Z)
Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit between first-order algorithms for constrained optimization and non-smooth systems to design a new class of accelerated first-order algorithms. An important property of these algorithms is that constraints are expressed in terms of velocities instead of sparse variables.
arXiv Detail & Related papers (2023-02-01T08:50:48Z)
Non-Clairvoyant Scheduling with Predictions Revisited [77.86290991564829]
In non-clairvoyant scheduling, the task is to find an online strategy for scheduling jobs with a priori unknown processing requirements. We revisit this well-studied problem in a recently popular learning-augmented setting that integrates (untrusted) predictions in algorithm design. We show that these predictions have desired properties, admit a natural error measure as well as algorithms with strong performance guarantees.
arXiv Detail & Related papers (2022-02-21T13:18:11Z)
Scheduling Servers with Stochastic Bilinear Rewards [7.519872646378837]
A system optimization problem arises in multi-class, multi-server queueing system scheduling. We propose a scheduling algorithm based on weighted proportional fair allocation criteria augmented with marginal costs for reward. Our algorithm sub-linear regret and sublinear mean holding cost (and queue length bound) with respect to the time horizon, thus guaranteeing queueing system stability.
arXiv Detail & Related papers (2021-12-13T00:37:20Z)
Near-Linear Time Algorithm with Near-Logarithmic Regret Per Switch for Mixable/Exp-Concave Losses [0.0]
We study the online optimization of mixable loss functions with logarithmic static regret in a dynamic environment. For the first time in literature, we show that it is also possible to achieve near-logarithmic regret per switch with sub-polynomial complexity per time.
arXiv Detail & Related papers (2021-09-28T15:06:31Z)
Learning to Schedule [3.5408022972081685]
This paper proposes a learning and scheduling algorithm to minimize the expected cumulative holding cost incurred by jobs. In each time slot, the server can process a job while receiving the realized random holding costs of the jobs remaining in the system.
arXiv Detail & Related papers (2021-05-28T08:04:06Z)
Better than the Best: Gradient-based Improper Reinforcement Learning for Network Scheduling [60.48359567964899]
We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay. We use a policy gradient based reinforcement learning algorithm that produces a scheduler that performs better than the available atomic policies.
arXiv Detail & Related papers (2021-05-01T10:18:34Z)
Learning to Sparsify Travelling Salesman Problem Instances [0.5985204759362747]
We use a pruning machine learning as a pre-processing step followed by an exact Programming approach to sparsify the travelling salesman problem. Our learning approach requires very little training data and is amenable to mathematical analysis.
arXiv Detail & Related papers (2021-04-19T14:35:14Z)
An online learning approach to dynamic pricing and capacity sizing in service systems [26.720986177499338]
We study a dynamic pricing and capacity sizing problem in a $GI/GI/1$ queue. Our framework is dubbed Gradient-based Online Learning in Queue (GOLiQ)
arXiv Detail & Related papers (2020-09-07T07:17:20Z)
Joint Parameter-and-Bandwidth Allocation for Improving the Efficiency of Partitioned Edge Learning [73.82875010696849]
Machine learning algorithms are deployed at the network edge for training artificial intelligence (AI) models. This paper focuses on the novel joint design of parameter (computation load) allocation and bandwidth allocation.
arXiv Detail & Related papers (2020-03-10T05:52:15Z)
Learning with Differentiable Perturbed Optimizers [54.351317101356614]
We propose a systematic method to transform operations into operations that are differentiable and never locally constant. Our approach relies on perturbeds, and can be used readily together with existing solvers. We show how this framework can be connected to a family of losses developed in structured prediction, and give theoretical guarantees for their use in learning tasks.
arXiv Detail & Related papers (2020-02-20T11:11:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.