PBScaler: A Bottleneck-aware Autoscaling Framework for
Microservice-based Applications
- URL: http://arxiv.org/abs/2303.14620v3
- Date: Mon, 25 Dec 2023 11:58:34 GMT
- Title: PBScaler: A Bottleneck-aware Autoscaling Framework for
Microservice-based Applications
- Authors: Shuaiyu Xie, Jian Wang, Bing Li, Zekun Zhang, Duantengchuan Li,
Patrick C. K. H
- Abstract summary: We propose PBScaler, a bottleneck-aware autoscaling framework for microservice-based applications.
We show that PBScaler outperforms existing approaches while conserving resources efficiently.
- Score: 6.453782169615384
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autoscaling is critical for ensuring optimal performance and resource
utilization in cloud applications with dynamic workloads. However, traditional
autoscaling technologies are typically no longer applicable in
microservice-based applications due to the diverse workload patterns and
complex interactions between microservices. Specifically, the propagation of
performance anomalies through interactions leads to a high number of abnormal
microservices, making it difficult to identify the root performance bottlenecks
(PBs) and formulate appropriate scaling strategies. In addition, to balance
resource consumption and performance, the existing mainstream approaches based
on online optimization algorithms require multiple iterations, leading to
oscillation and elevating the likelihood of performance degradation. To tackle
these issues, we propose PBScaler, a bottleneck-aware autoscaling framework
designed to prevent performance degradation in a microservice-based
application. The key insight of PBScaler is to locate the PBs. Thus, we propose
TopoRank, a novel random walk algorithm based on the topological potential to
reduce unnecessary scaling. By integrating TopoRank with an offline
performance-aware optimization algorithm, PBScaler optimizes replica management
without disrupting the online application. Comprehensive experiments
demonstrate that PBScaler outperforms existing state-of-the-art approaches in
mitigating performance issues while conserving resources efficiently.
Related papers
- Neural Horizon Model Predictive Control -- Increasing Computational Efficiency with Neural Networks [0.0]
We propose a proposed machine-learning supported approach to model predictive control.
We propose approximating part of the problem horizon, while maintaining safety guarantees.
The proposed MPC scheme can be applied to a wide range of applications, including those requiring a rapid control response.
arXiv Detail & Related papers (2024-08-19T08:13:37Z) - Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers [58.5711048151424]
We introduce SPARSEK Attention, a novel sparse attention mechanism designed to overcome computational and memory obstacles.
Our approach integrates a scoring network and a differentiable top-k mask operator, SPARSEK, to select a constant number of KV pairs for each query.
Experimental results reveal that SPARSEK Attention outperforms previous sparse attention methods.
arXiv Detail & Related papers (2024-06-24T15:55:59Z) - ALPS: Improved Optimization for Highly Sparse One-Shot Pruning for Large Language Models [14.310720048047136]
ALPS is an optimization-based framework that tackles the pruning problem using the operator splitting technique and a preconditioned gradient conjugate-based post-processing step.
Our approach incorporates novel techniques to accelerate and theoretically guarantee convergence while leveraging vectorization and GPU parallelism for efficiency.
On the OPT-30B model with 70% sparsity, ALPS achieves a 13% reduction in test perplexity on the WikiText dataset and a 19% improvement in zero-shot benchmark performance compared to existing methods.
arXiv Detail & Related papers (2024-06-12T02:57:41Z) - Quantum Algorithm Exploration using Application-Oriented Performance
Benchmarks [0.0]
The QED-C suite of Application-Oriented Benchmarks provides the ability to gauge performance characteristics of quantum computers.
We investigate challenges in broadening the relevance of this benchmarking methodology to applications of greater complexity.
arXiv Detail & Related papers (2024-02-14T06:55:50Z) - Machine Learning Insides OptVerse AI Solver: Design Principles and
Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver.
We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror multifaceted structures of real-world problem.
We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z) - OptScaler: A Hybrid Proactive-Reactive Framework for Robust Autoscaling
in the Cloud [11.340252931723063]
Autoscaling is a vital mechanism in cloud computing that supports the autonomous adjustment of computing resources under dynamic workloads.
Existing proactive autoscaling methods anticipate the future workload and scale the resources in advance, whereas reactive methods rely on real-time system feedback.
This paper presents OptScaler, a hybrid autoscaling framework that integrates the power of both proactive and reactive methods for regulating CPU utilization.
arXiv Detail & Related papers (2023-10-26T04:38:48Z) - Adaptive Resource Allocation for Virtualized Base Stations in O-RAN with
Online Learning [60.17407932691429]
Open Radio Access Network systems, with their base stations (vBSs), offer operators the benefits of increased flexibility, reduced costs, vendor diversity, and interoperability.
We propose an online learning algorithm that balances the effective throughput and vBS energy consumption, even under unforeseeable and "challenging'' environments.
We prove the proposed solutions achieve sub-linear regret, providing zero average optimality gap even in challenging environments.
arXiv Detail & Related papers (2023-09-04T17:30:21Z) - A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical
Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs)
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z) - DeepScaler: Holistic Autoscaling for Microservices Based on
Spatiotemporal GNN with Adaptive Graph Learning [4.128665560397244]
This paper presents DeepScaler, a deep learning-based holistic autoscaling approach.
It focuses on coping with service dependencies to optimize service-level agreements (SLA) assurance and cost efficiency.
Experimental results demonstrate that our method implements a more effective autoscaling mechanism for microservice.
arXiv Detail & Related papers (2023-09-02T08:22:21Z) - Energy-efficient Task Adaptation for NLP Edge Inference Leveraging
Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization for maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z) - JUMBO: Scalable Multi-task Bayesian Optimization using Offline Data [86.8949732640035]
We propose JUMBO, an MBO algorithm that sidesteps limitations by querying additional data.
We show that it achieves no-regret under conditions analogous to GP-UCB.
Empirically, we demonstrate significant performance improvements over existing approaches on two real-world optimization problems.
arXiv Detail & Related papers (2021-06-02T05:03:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.