Multi-Level ML Based Burst-Aware Autoscaling for SLO Assurance and Cost
Efficiency
- URL: http://arxiv.org/abs/2402.12962v1
- Date: Tue, 20 Feb 2024 12:28:25 GMT
- Authors: Chunyang Meng, Haogang Tong, Tianyang Wu, Maolin Pan, Yang Yu
- Abstract summary: This paper introduces BAScaler, a Burst-Aware Autoscaling framework for containerized cloud services or applications under complex workloads.
BAScaler incorporates a novel prediction-based burst detection mechanism that distinguishes between predictable periodic workload spikes and actual bursts.
- Score: 3.5624365288866007
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autoscaling is a technology that automatically scales the resources provided to
applications without human intervention to guarantee runtime Quality of
Service (QoS) while saving costs. However, user-facing cloud applications serve
dynamic workloads that are often highly variable and contain bursts, posing
challenges to autoscaling for maintaining QoS within Service-Level Objectives
(SLOs). Conservative strategies risk over-provisioning, while aggressive ones
may cause SLO violations, making it more challenging to design effective
autoscaling. This paper introduces BAScaler, a Burst-Aware Autoscaling
framework for containerized cloud services or applications under complex
workloads, combining multi-level machine learning (ML) techniques to mitigate
SLO violations while saving costs. BAScaler incorporates a novel
prediction-based burst detection mechanism that distinguishes between
predictable periodic workload spikes and actual bursts. When bursts are
detected, BAScaler appropriately overestimates them and allocates resources
accordingly to address the rapid growth in resource demand. On the other hand,
BAScaler employs reinforcement learning to rectify potential inaccuracies in
resource estimation, enabling more precise resource allocation during
non-bursts. Experiments across ten real-world workloads demonstrate BAScaler's
effectiveness, achieving a 57% average reduction in SLO violations and cutting
resource costs by 10% compared to other prominent methods.
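The distinction the abstract draws between a predictable periodic spike and an actual burst can be illustrated with a minimal sketch. This is not BAScaler's algorithm, which the abstract does not specify; it assumes a simple residual rule (flag a burst when the observed load exceeds the forecast by more than `k` standard deviations of past forecast errors) and a fixed over-provisioning factor. The names `detect_burst`, `replicas_needed`, `k`, `capacity_per_replica`, and `burst_margin` are all hypothetical, and the reinforcement learning correction step is omitted.

```python
import math
from statistics import mean, stdev


def detect_burst(observed, predicted, residual_history, k=3.0):
    """Return True when the observed load exceeds the forecast by more than
    k standard deviations of past forecast residuals (hypothetical rule)."""
    if len(residual_history) < 2:
        return False  # not enough history to estimate the residual spread
    threshold = mean(residual_history) + k * stdev(residual_history)
    return (observed - predicted) > threshold


def replicas_needed(observed, predicted, is_burst,
                    capacity_per_replica=100.0, burst_margin=1.5):
    """Size the deployment: overestimate the observed load during a burst,
    otherwise provision for the forecast (assumed policy, not the paper's)."""
    load = observed * burst_margin if is_burst else predicted
    return max(1, math.ceil(load / capacity_per_replica))


# Past forecast errors (observed minus predicted), in requests per second.
history = [-5.0, 3.0, 4.0, -2.0, 1.0, -1.0]

# A periodic spike: the load is high, but the forecast anticipated it.
spike = detect_burst(905.0, 900.0, history)
# An actual burst: the load is far above what the forecast expected.
burst = detect_burst(500.0, 300.0, history)

print(spike, replicas_needed(905.0, 900.0, spike))
print(burst, replicas_needed(500.0, 300.0, burst))
```

Under these assumptions, a high but well-forecast load is provisioned from the prediction alone, while an unexpected jump triggers deliberate overestimation, mirroring the trade-off between over-provisioning and SLO violations described above.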
Related papers
- Self-Regulation and Requesting Interventions [63.5863047447313]
We propose an offline framework that trains a "helper" policy to request interventions.
We score optimal intervention timing with PRMs and train the helper model on these labeled trajectories.
This offline approach significantly reduces costly intervention calls during training.
arXiv Detail & Related papers (2025-02-07T00:06:17Z)
- Secure Resource Allocation via Constrained Deep Reinforcement Learning [49.15061461220109]
We present SARMTO, a framework that balances resource allocation, task offloading, security, and performance.
SARMTO consistently outperforms five baseline approaches, achieving up to a 40% reduction in system costs.
These enhancements highlight SARMTO's potential to revolutionize resource management in intricate distributed computing environments.
arXiv Detail & Related papers (2025-01-20T15:52:43Z)
- Reconciling High Accuracy, Cost-Efficiency, and Low Latency of Inference Serving Systems [0.0]
InfAdapter proactively selects a set of ML model variants with their resource allocations to meet latency SLO.
It reduces SLO violations and costs by up to 65% and 33%, respectively, compared to a popular industry autoscaler.
arXiv Detail & Related papers (2023-04-21T11:19:49Z)
- Guaranteed Dynamic Scheduling of Ultra-Reliable Low-Latency Traffic via Conformal Prediction [72.59079526765487]
The dynamic scheduling of ultra-reliable and low-latency traffic (URLLC) in the uplink can significantly enhance the efficiency of coexisting services.
The main challenge is posed by the uncertainty in the process of URLLC packet generation.
We introduce a novel scheduler for URLLC packets that provides formal guarantees on reliability and latency irrespective of the quality of the URLLC traffic predictor.
arXiv Detail & Related papers (2023-02-15T14:09:55Z)
- TransPath: Learning Heuristics For Grid-Based Pathfinding via Transformers [64.88759709443819]
We suggest learning the instance-dependent proxies that are supposed to notably increase the efficiency of the search.
The first proxy we suggest to learn is the correction factor, i.e. the ratio between the instance-independent cost-to-go estimate and the perfect one.
The second proxy is the path probability, which indicates how likely a grid cell is to lie on the shortest path.
arXiv Detail & Related papers (2022-12-22T14:26:11Z)
- A Meta Reinforcement Learning Approach for Predictive Autoscaling in the Cloud [10.970391043991363]
We propose an end-to-end predictive meta model-based RL algorithm, aiming to optimally allocate resources to maintain a stable CPU utilization level.
Our algorithm not only ensures the predictability and accuracy of the scaling strategy, but also enables the scaling decisions to adapt to the changing workloads with high sample efficiency.
arXiv Detail & Related papers (2022-05-31T13:54:04Z)
- Learning Dynamic Mechanisms in Unknown Environments: A Reinforcement Learning Approach [123.55983746427572]
We propose novel learning algorithms to recover the dynamic Vickrey-Clarke-Groves (VCG) mechanism over multiple rounds of interaction.
A key contribution of our approach is incorporating reward-free online Reinforcement Learning (RL) to aid exploration over a rich policy space.
arXiv Detail & Related papers (2022-02-25T16:17:23Z)
- PROMPT: Learning Dynamic Resource Allocation Policies for Network Applications [16.812611987082082]
We propose PROMPT, a novel resource allocation framework using proactive prediction to guide a reinforcement learning controller.
We show that PROMPT incurs 4.2x fewer violations, reduces severity of policy violations by 12.7x, improves best-effort workload performance, and improves overall power efficiency over prior work.
arXiv Detail & Related papers (2022-01-19T23:34:34Z)
- Coordinated Online Learning for Multi-Agent Systems with Coupled Constraints and Perturbed Utility Observations [91.02019381927236]
We introduce a novel method to steer the agents toward a stable population state, fulfilling the given resource constraints.
The proposed method is a decentralized resource pricing method based on the resource loads resulting from the augmentation of the game's Lagrangian.
arXiv Detail & Related papers (2020-10-21T10:11:17Z)
- A Predictive Autoscaler for Elastic Batch Jobs [8.354712625979776]
Large batch jobs such as Deep Learning, HPC and Spark require far more computational resources and incur higher costs than conventional online services.
We propose a predictive autoscaler to provide an elastic interface for the customers and overprovision instances.
arXiv Detail & Related papers (2020-10-10T17:35:55Z)
- ReLeaSER: A Reinforcement Learning Strategy for Optimizing Utilization Of Ephemeral Cloud Resources [2.205500582481277]
We propose a Reinforcement Learning strategy for optimizing the ephemeral resources' utilization in the cloud.
Our solution significantly reduces SLA violation penalties, by 2.7x on average and up to 3.4x.
It also considerably improves the CPs' potential savings, by 27.6% on average and up to 43.6%.
arXiv Detail & Related papers (2020-09-23T15:19:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.