QoS-Aware Power Minimization of Distributed Many-Core Servers using
Transfer Q-Learning
- URL: http://arxiv.org/abs/2102.01348v1
- Date: Tue, 2 Feb 2021 06:47:58 GMT
- Title: QoS-Aware Power Minimization of Distributed Many-Core Servers using
Transfer Q-Learning
- Authors: Dainius Jenkus, Fei Xia, Rishad Shafik, Alex Yakovlev
- Abstract summary: This paper presents a QoS-aware runtime controller that combines horizontal scaling (node allocation) and vertical scaling (resource allocation within nodes).
Horizontal scaling determines the number of active nodes based on workload demands and the required QoS according to a set of rules.
It is then coupled with vertical scaling using transfer Q-learning, which tunes power/performance based on the workload profile using dynamic voltage/frequency scaling (DVFS).
When combined, these methods reduce the exploration time and QoS violations compared to model-free Q-learning.
- Score: 8.123268089072523
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Web servers scaled across distributed systems necessitate complex runtime
controls for providing quality of service (QoS) guarantees as well as
minimizing the energy costs under dynamic workloads. This paper presents a
QoS-aware runtime controller using horizontal scaling (node allocation) and
vertical scaling (resource allocation within nodes) methods synergistically to
provide adaptation to workloads while minimizing the power consumption under a
QoS constraint (i.e., response time). Horizontal scaling determines the
number of active nodes based on workload demands and the required QoS according
to a set of rules. Then, it is coupled with vertical scaling using transfer
Q-learning, which further tunes power/performance based on workload profile
using dynamic voltage/frequency scaling (DVFS). It transfers Q-values within
minimally explored states, reducing exploration requirements. In addition, the
approach exploits the scalable architecture of the many-core server, allowing
reuse of available knowledge from fully or partially explored nodes. When
combined, these methods reduce the exploration time and QoS violations
compared to model-free Q-learning. The technique balances design-time and
runtime costs to maximize the portability and operational optimality
demonstrated through persistent power reductions with minimal QoS violations
under different workload scenarios on heterogeneous multi-processing nodes of a
server cluster.
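The two mechanisms described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the capacity-based scaling rule, the frequency table `FREQS_MHZ`, the reward shape, the nearest-neighbour donor choice in `transfer_q`, and all function names and parameters are assumptions made for illustration only.

```python
import math

# Hypothetical DVFS levels; the paper's actual frequency set is not given here.
FREQS_MHZ = [600, 1000, 1400]


def nodes_needed(request_rate, per_node_capacity, max_nodes):
    """Assumed horizontal-scaling rule: smallest node count covering demand."""
    needed = math.ceil(request_rate / per_node_capacity)
    return max(1, min(max_nodes, needed))


def reward(freq_idx, response_ms, qos_ms):
    """Assumed reward: penalise QoS violations, otherwise favour low frequency."""
    if response_ms > qos_ms:
        return -1.0
    return 1.0 - freq_idx / len(FREQS_MHZ)


def q_update(q, state, action, r, next_state, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update (vertical scaling via DVFS actions)."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (r + gamma * best_next - q[state][action])


def transfer_q(q, visits, min_visits=3):
    """Assumed transfer step: seed each minimally explored state with the
    Q-values of the nearest well-explored state (states are workload bins)."""
    explored = [s for s in range(len(q)) if visits[s] >= min_visits]
    if not explored:
        return
    for s in range(len(q)):
        if visits[s] < min_visits:
            donor = min(explored, key=lambda e: abs(e - s))
            q[s] = list(q[donor])


# Usage: 4 workload bins (states) x 3 DVFS levels (actions).
q_table = [[0.0] * len(FREQS_MHZ) for _ in range(4)]
q_update(q_table, state=0, action=1, r=1.0, next_state=1)
q_table[3] = [0.1, 0.2, 0.3]
transfer_q(q_table, visits=[5, 0, 0, 4])
```

Seeding unexplored states from a neighbouring explored state is one simple way to cut exploration; the paper may select donors and reuse knowledge across nodes differently.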
Related papers
- Benchmarking Dynamic SLO Compliance in Distributed Computing Continuum Systems [9.820223170841219]
Ensuring Service Level Objectives (SLOs) in large-scale architectures is challenging due to their heterogeneous nature and varying service requirements.
We present a benchmark of Active Inference -- an emerging method from neuroscience -- against three established reinforcement learning algorithms.
We find that Active Inference is a promising approach for ensuring SLO compliance in distributed computing continuum systems (DCCS), offering lower memory usage, stable CPU utilization, and fast convergence.
arXiv Detail & Related papers (2025-03-05T08:56:26Z)
- AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer [54.713778961605115]
Vision Transformer (ViT) has become one of the most prevailing fundamental backbone networks in the computer vision community.
We propose a novel non-uniform quantizer, dubbed the Adaptive Logarithm (AdaLog) quantizer.
arXiv Detail & Related papers (2024-07-17T18:38:48Z)
- Elastic Entangled Pair and Qubit Resource Management in Quantum Cloud Computing [73.7522199491117]
Quantum cloud computing (QCC) offers a promising approach to efficiently provide quantum computing resources.
The fluctuations in user demand and quantum circuit requirements are challenging for efficient resource provisioning.
We propose a resource allocation model to provision quantum computing and networking resources.
arXiv Detail & Related papers (2023-07-25T00:38:46Z)
- Generalizable Resource Scaling of 5G Slices using Constrained Reinforcement Learning [2.0024258465343268]
Network slicing is a key enabler for 5G to support various applications.
It is imperative that the 5G infrastructure provider (InP) allocates the right amount of resources depending on the slice's traffic.
arXiv Detail & Related papers (2023-06-15T17:16:34Z)
- Matching Game for Optimized Association in Quantum Communication Networks [65.16483325184237]
This paper proposes a swap-stable request-QS association algorithm for quantum switches.
It achieves near-optimal performance (within 5%) in terms of the percentage of served requests.
It is shown to be scalable and to maintain its near-optimal performance even as the size of the quantum communication network (QCN) increases.
arXiv Detail & Related papers (2023-05-22T03:39:18Z)
- Adaptive Federated Pruning in Hierarchical Wireless Networks [69.6417645730093]
Federated Learning (FL) is a privacy-preserving distributed learning framework where a server aggregates models updated by multiple devices without accessing their private datasets.
In this paper, we introduce model pruning for hierarchical FL (HFL) in wireless networks to reduce the neural network scale.
We show that our proposed HFL with model pruning achieves learning accuracy similar to HFL without model pruning and reduces communication cost by about 50 percent.
arXiv Detail & Related papers (2023-05-15T22:04:49Z)
- Scaling Limits of Quantum Repeater Networks [62.75241407271626]
Quantum networks (QNs) are a promising platform for secure communications, enhanced sensing, and efficient distributed quantum computing.
Due to the fragile nature of quantum states, these networks face significant challenges in terms of scalability.
In this paper, the scaling limits of quantum repeater networks (QRNs) are analyzed.
arXiv Detail & Related papers (2023-05-15T14:57:01Z)
- Monitoring and Proactive Management of QoS Levels in Pervasive Applications [9.289846887298852]
Edge Computing (EC) provides multiple computation and analytics capabilities close to data sources.
The expectation of ensuring high levels of QoS during execution imposes strict requirements for innovative management approaches.
We elaborate a distributed and intelligent decision-making approach for task scheduling.
We propose that nodes continuously monitor QoS levels and systematically evaluate the probability of violating them, so that they can proactively decide which tasks to offload to peer nodes or the Cloud.
arXiv Detail & Related papers (2022-06-11T09:27:47Z)
- MCDS: AI Augmented Workflow Scheduling in Mobile Edge Cloud Computing Systems [12.215537834860699]
Recently proposed scheduling methods leverage the low response times of edge computing platforms to optimize application Quality of Service (QoS).
We propose MCDS: Monte Carlo Learning using Deep Surrogate Models to efficiently schedule workflow applications in mobile edge-cloud computing systems.
arXiv Detail & Related papers (2021-12-14T10:00:01Z)
- Accelerating variational quantum algorithms with multiple quantum processors [78.36566711543476]
Variational quantum algorithms (VQAs) have the potential of utilizing near-term quantum machines to gain certain computational advantages.
Modern VQAs suffer from cumbersome computational overhead, hampered by the tradition of employing a solitary quantum processor to handle large data.
Here we devise an efficient distributed optimization scheme, called QUDIO, to address this issue.
arXiv Detail & Related papers (2021-06-24T08:18:42Z)
- AI-based Resource Allocation: Reinforcement Learning for Adaptive Auto-scaling in Serverless Environments [0.0]
Serverless computing has emerged as a compelling new paradigm of cloud computing models in recent years.
A common approach among both commercial and open source serverless computing platforms is workload-based auto-scaling.
In this paper we investigate the applicability of a reinforcement learning approach to request-based auto-scaling in a serverless framework.
arXiv Detail & Related papers (2020-05-29T06:18:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.