ReAssigner: A Plug-and-Play Virtual Machine Scheduling Intensifier for
Heterogeneous Requests
- URL: http://arxiv.org/abs/2211.16227v1
- Date: Tue, 29 Nov 2022 14:05:06 GMT
- Title: ReAssigner: A Plug-and-Play Virtual Machine Scheduling Intensifier for
Heterogeneous Requests
- Authors: Haochuan Cui, Junjie Sheng, Bo Jin, Yiqiu Hu, Li Su, Lei Zhu, Wenli
Zhou, Xiangfeng Wang
- Abstract summary: A virtual machine scheduling intensifier called Resource Assigner (ReAssigner) is proposed to enhance the scheduling efficiency of any given scheduler for heterogeneous requests.
ReAssigner achieves significant scheduling performance improvement compared with some state-of-the-art scheduling methods.
- Score: 14.521969014581728
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid development of cloud computing, virtual machine scheduling has
become one of the most important but challenging issues for the cloud computing
community, especially for practical heterogeneous request sequences. By
analyzing the impact of request heterogeneity on some popular heuristic
schedulers, it can be found that existing scheduling algorithms cannot handle
the request heterogeneity properly and efficiently. In this paper, a
plug-and-play virtual machine scheduling intensifier, called Resource Assigner
(ReAssigner), is proposed to enhance the scheduling efficiency of any given
scheduler for heterogeneous requests. The key idea of ReAssigner is to
pre-assign roles to physical resources and let resources of the same role form
a virtual cluster to handle homogeneous requests. ReAssigner can cooperate with
arbitrary schedulers by restricting their scheduling space to virtual clusters.
With evaluations on the real dataset from Huawei Cloud, the proposed ReAssigner
achieves significant scheduling performance improvement compared with some
state-of-the-art scheduling methods.
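The key idea from the abstract — pre-assign roles to physical machines, group same-role machines into virtual clusters, and restrict an arbitrary base scheduler to the cluster matching each request's type — can be sketched as follows. This is a minimal illustration of the described mechanism, not the paper's implementation; all names (`first_fit`, `role_of`, the machine and request fields) are hypothetical.

```python
# Hypothetical sketch of the ReAssigner idea: machines with the same
# pre-assigned role form a virtual cluster, and the wrapped scheduler
# only ever sees the cluster matching the request's type.
from collections import defaultdict


def first_fit(request, machines):
    """A simple base scheduler: place the request on the first machine
    with enough free capacity; return its id, or None if none fits."""
    for m in machines:
        if m["free"] >= request["size"]:
            m["free"] -= request["size"]
            return m["id"]
    return None


class ReAssigner:
    """Wraps an arbitrary scheduler, restricting its scheduling space
    to the virtual cluster whose role matches the request type."""

    def __init__(self, base_scheduler, machines, role_of):
        self.base = base_scheduler
        self.clusters = defaultdict(list)
        for m in machines:
            # Pre-assign a role to each physical machine; same-role
            # machines form one virtual cluster.
            self.clusters[role_of(m)].append(m)

    def schedule(self, request):
        # The base scheduler only sees the matching virtual cluster,
        # so homogeneous requests land on homogeneous resources.
        cluster = self.clusters.get(request["type"], [])
        return self.base(request, cluster)


machines = [
    {"id": 0, "free": 16, "kind": "cpu"},
    {"id": 1, "free": 8, "kind": "gpu"},
]
sched = ReAssigner(first_fit, machines, role_of=lambda m: m["kind"])
print(sched.schedule({"type": "gpu", "size": 4}))  # placed on machine 1
print(sched.schedule({"type": "cpu", "size": 4}))  # placed on machine 0
```

Because the wrapper only filters the candidate machine set, any scheduler with the same `(request, machines)` interface can be plugged in unchanged, which is what makes the intensifier "plug-and-play" in the abstract's sense.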
Related papers
- Efficient LLM Serving on Hybrid Real-time and Best-effort Requests [0.6291443816903801]
BROS is a hybrid Large Language Model (LLM) serving system that aims to collocate real-time (RT) and best-effort (BE) requests, meeting RT requests' latency requirements while maintaining BE requests' throughput.
It significantly reduces the latency of RT requests (up to 74.20%), improving their fine-grained service level objectives (SLOs) attainments (up to 36.38x), with negligible throughput reduction for BE requests.
arXiv Detail & Related papers (2025-04-13T14:16:57Z)
- ALISE: Accelerating Large Language Model Serving with Speculative Scheduling [7.367068885621016]
Large Language Models (LLMs) represent a revolutionary advancement in the contemporary landscape of artificial general intelligence (AGI).
In this paper, we propose a new efficient LLM inference serving framework, named ALISE.
We show that ALISE improves the throughput of inference serving by up to 1.8x and 2.1x under the same latency constraint on the Alpaca and ShareGPT datasets, respectively.
arXiv Detail & Related papers (2024-10-31T00:58:11Z)
- CoRaiS: Lightweight Real-Time Scheduler for Multi-Edge Cooperative Computing [32.99310493126955]
Multi-edge cooperative computing that combines constrained resources of multiple edges into a powerful resource pool has the potential to deliver great benefits.
However, the massive heterogeneous resource composition and the lack of scheduling strategies make modeling and coordinating a multi-edge computing system particularly complicated.
This paper first proposes a system-level state evaluation model to shield the complex hardware configurations and redefine the different service capabilities at heterogeneous edges.
arXiv Detail & Related papers (2024-02-04T07:21:45Z)
- Client Orchestration and Cost-Efficient Joint Optimization for NOMA-Enabled Hierarchical Federated Learning [55.49099125128281]
We propose a non-orthogonal multiple access (NOMA) enabled HFL system under semi-synchronous cloud model aggregation.
We show that the proposed scheme outperforms the considered benchmarks regarding HFL performance improvement and total cost reduction.
arXiv Detail & Related papers (2023-11-03T13:34:44Z)
- Dynamic Scheduling for Federated Edge Learning with Streaming Data [56.91063444859008]
We consider a Federated Edge Learning (FEEL) system where training data are randomly generated over time at a set of distributed edge devices with long-term energy constraints.
Due to limited communication resources and latency requirements, only a subset of devices is scheduled for participating in the local training process in every iteration.
arXiv Detail & Related papers (2023-05-02T07:41:16Z)
- Sparsity-Aware Intelligent Massive Random Access Control in Open RAN: A Reinforcement Learning Based Approach [61.74489383629319]
Massive random access of devices in the emerging Open Radio Access Network (O-RAN) brings great challenge to the access control and management.
A reinforcement-learning (RL)-assisted closed-loop access control scheme is proposed to preserve the sparsity of access requests.
Deep-RL-assisted SAUD is proposed to resolve highly complex environments with continuous and high-dimensional state and action spaces.
arXiv Detail & Related papers (2023-03-05T12:25:49Z)
- Deep Reinforcement Model Selection for Communications Resource Allocation in On-Site Medical Care [8.564319625930892]
This paper explores a resource allocation scenario where a scheduler must balance mixed performance metrics among connected users.
We present a scheduler that adaptively switches between different model-based scheduling algorithms.
The resulting ensemble scheduler is able to combine its constituent algorithms to maximize a sum-utility cost function.
arXiv Detail & Related papers (2021-11-12T12:04:53Z)
- Better than the Best: Gradient-based Improper Reinforcement Learning for Network Scheduling [60.48359567964899]
We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay.
We use a policy gradient based reinforcement learning algorithm that produces a scheduler that performs better than the available atomic policies.
arXiv Detail & Related papers (2021-05-01T10:18:34Z)
- Tailored Learning-Based Scheduling for Kubernetes-Oriented Edge-Cloud System [54.588242387136376]
We introduce KaiS, a learning-based scheduling framework for edge-cloud systems.
First, we design a coordinated multi-agent actor-critic algorithm to cater to decentralized request dispatch.
Second, for diverse system scales and structures, we use graph neural networks to embed system state information.
Third, we adopt a two-time-scale scheduling mechanism to harmonize request dispatch and service orchestration.
arXiv Detail & Related papers (2021-01-17T03:45:25Z)
- Rosella: A Self-Driving Distributed Scheduler for Heterogeneous Clusters [7.206919625027208]
We present Rosella, a new self-driving, distributed approach for task scheduling in heterogeneous clusters.
Rosella automatically learns the compute environment and adjusts its scheduling policy in real-time.
We evaluate Rosella with a variety of workloads on a 32-node AWS cluster.
arXiv Detail & Related papers (2020-10-28T20:12:29Z)
- A Machine Learning Approach for Task and Resource Allocation in Mobile Edge Computing Based Networks [108.57859531628264]
A joint task, spectrum, and transmit power allocation problem is investigated for a wireless network.
The proposed algorithm can reduce the number of iterations needed for convergence and the maximal delay among all users by up to 18% and 11.1% compared to the standard Q-learning algorithm.
arXiv Detail & Related papers (2020-07-20T13:46:42Z)
- AI-based Resource Allocation: Reinforcement Learning for Adaptive Auto-scaling in Serverless Environments [0.0]
Serverless computing has emerged as a compelling new paradigm of cloud computing models in recent years.
A common approach among both commercial and open source serverless computing platforms is workload-based auto-scaling.
In this paper we investigate the applicability of a reinforcement learning approach to request-based auto-scaling in a serverless framework.
arXiv Detail & Related papers (2020-05-29T06:18:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated summaries (including all information) and is not responsible for any consequences.