Learning While Scheduling in Multi-Server Systems with Unknown
Statistics: MaxWeight with Discounted UCB
- URL: http://arxiv.org/abs/2209.01126v3
- Date: Fri, 2 Jun 2023 06:57:06 GMT
- Title: Learning While Scheduling in Multi-Server Systems with Unknown
Statistics: MaxWeight with Discounted UCB
- Authors: Zixian Yang, R. Srikant, Lei Ying
- Abstract summary: This paper considers a multi-server system with multiple servers and multiple types of jobs, where different job types require different amounts of processing time at different servers.
The goal is to schedule jobs on servers without knowing the statistics of the processing times.
We propose a new algorithm, which combines the MaxWeight scheduling policy with discounted upper confidence bound (UCB) to simultaneously learn statistics and schedule jobs to servers.
- Score: 18.898514227870926
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-server queueing systems are widely used models for job scheduling in
machine learning, wireless networks, crowdsourcing, and healthcare systems.
This paper considers a multi-server system with multiple servers and multiple
types of jobs, where different job types require different amounts of
processing time at different servers. The goal is to schedule jobs on servers
without knowing the statistics of the processing times. To fully utilize the
processing power of the servers, it is known that one has to at least learn the
service rates of different job types on different servers. Prior works on this
topic decouple the learning and scheduling phases which leads to either
excessive exploration or extremely large job delays. We propose a new
algorithm, which combines the MaxWeight scheduling policy with discounted upper
confidence bound (UCB), to simultaneously learn the statistics and schedule
jobs to servers. We prove that under our algorithm the asymptotic average queue
length is bounded by one divided by the traffic slackness, which is order-wise
optimal. We also obtain an exponentially decaying probability tail bound for
any-time queue length. These results hold for both stationary and nonstationary
service rates. Simulations confirm that the delay performance of our algorithm
is several orders of magnitude better than previously proposed algorithms.
Related papers
- Is the GPU Half-Empty or Half-Full? Practical Scheduling Techniques for LLMs [3.7758841366694353]
We survey scheduling techniques from the literature and from practical serving systems.
We find that schedulers from the literature often achieve good performance but introduce significant complexity.
In contrast, schedulers in practical deployments often leave easy performance gains on the table but are easy to implement, deploy and configure.
arXiv Detail & Related papers (2024-10-23T13:05:46Z) - Queueing Matching Bandits with Preference Feedback [10.988222071035198]
We consider multi-class asymmetric queueing systems consisting of $N$ queues on one side and $K$ servers on the other side.
The service rate of each job-server assignment is unknown and modeled by a feature-based Multi-nomial Logit (MNL) function.
We propose algorithms based on UCB and Thompson Sampling, which achieve system stability with an average queue length bound to $O(minN,K/epsilon)$ for a large time horizon.
arXiv Detail & Related papers (2024-10-14T02:29:06Z) - Efficient Reinforcement Learning for Routing Jobs in Heterogeneous Queueing Systems [21.944723061337267]
We consider the problem of efficiently routing jobs that arrive into a central queue to a system of heterogeneous servers.
Unlike homogeneous systems, a threshold policy, that routes jobs to the slow server(s) when the queue length exceeds a certain threshold, is known to be optimal for the one-fast-one-slow two-server system.
We propose ACHQ, an efficient policy gradient based algorithm with a low dimensional soft threshold policy parameterization.
arXiv Detail & Related papers (2024-02-02T05:22:41Z) - Online Optimization for Network Resource Allocation and Comparison with
Reinforcement Learning Techniques [0.6466206145151128]
We tackle in this paper an online network resource allocation problem with job transfers.
We propose a randomized online algorithm based on the exponentially weighted method.
We prove that our algorithm enjoys a sub-linear in time regret, which indicates that the algorithm is adapting and learning from its experiences.
arXiv Detail & Related papers (2023-11-16T17:08:27Z) - Partitioning Distributed Compute Jobs with Reinforcement Learning and
Graph Neural Networks [58.720142291102135]
Large-scale machine learning models are bringing advances to a broad range of fields.
Many of these models are too large to be trained on a single machine, and must be distributed across multiple devices.
We show that maximum parallelisation is sub-optimal in relation to user-critical metrics such as throughput and blocking rate.
arXiv Detail & Related papers (2023-01-31T17:41:07Z) - Multi-Job Intelligent Scheduling with Cross-Device Federated Learning [65.69079337653994]
Federated Learning (FL) enables collaborative global machine learning model training without sharing sensitive raw data.
We propose a novel multi-job FL framework, which enables the training process of multiple jobs in parallel.
We propose a novel intelligent scheduling approach based on multiple scheduling methods, including an original reinforcement learning-based scheduling method and an original Bayesian optimization-based scheduling method.
arXiv Detail & Related papers (2022-11-24T06:17:40Z) - Efficient Device Scheduling with Multi-Job Federated Learning [64.21733164243781]
We propose a novel multi-job Federated Learning framework to enable the parallel training process of multiple jobs.
We propose a reinforcement learning-based method and a Bayesian optimization-based method to schedule devices for multiple jobs while minimizing the cost.
Our proposed approaches significantly outperform baseline approaches in terms of training time (up to 8.67 times faster) and accuracy (up to 44.6% higher)
arXiv Detail & Related papers (2021-12-11T08:05:11Z) - Better than the Best: Gradient-based Improper Reinforcement Learning for
Network Scheduling [60.48359567964899]
We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay.
We use a policy gradient based reinforcement learning algorithm that produces a scheduler that performs better than the available atomic policies.
arXiv Detail & Related papers (2021-05-01T10:18:34Z) - Rosella: A Self-Driving Distributed Scheduler for Heterogeneous Clusters [7.206919625027208]
We present Rosella, a new self-driving, distributed approach for task scheduling in heterogeneous clusters.
Rosella automatically learns the compute environment and adjusts its scheduling policy in real-time.
We evaluate Rosella with a variety of workloads on a 32-node AWS cluster.
arXiv Detail & Related papers (2020-10-28T20:12:29Z) - Temporally Correlated Task Scheduling for Sequence Learning [143.70523777803723]
In many applications, a sequence learning task is usually associated with multiple temporally correlated auxiliary tasks.
We introduce a learnable scheduler to sequence learning, which can adaptively select auxiliary tasks for training.
Our method significantly improves the performance of simultaneous machine translation and stock trend forecasting.
arXiv Detail & Related papers (2020-07-10T10:28:54Z) - Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal
Constraints [52.58352707495122]
We present a multi-robot allocation algorithm that decouples the key computational challenges of sequential decision-making under uncertainty and multi-agent coordination.
We validate our results over a wide range of simulations on two distinct domains: multi-arm conveyor belt pick-and-place and multi-drone delivery dispatch in a city.
arXiv Detail & Related papers (2020-05-27T01:10:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.