Related papers: Shabari: Delayed Decision-Making for Faster and Efficient Serverless Functions

URL: http://arxiv.org/abs/2401.08859v2
Date: Thu, 25 Jan 2024 16:34:22 GMT
Title: Shabari: Delayed Decision-Making for Faster and Efficient Serverless Functions
Authors: Prasoon Sinha and Kostis Kaffes and Neeraja J. Yadwadkar
Abstract summary: We introduce Shabari, a resource management framework for serverless systems. Shabari makes decisions as late as possible to right-size each invocation to meet functions' performance objectives. For a range of serverless functions and inputs, Shabari reduces SLO violations by 11-73%.
Score: 0.30693357740321775
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Serverless computing relieves developers from the burden of resource management, thus providing ease-of-use to the users and the opportunity to optimize resource utilization for the providers. However, today's serverless systems lack performance guarantees for function invocations, thus limiting support for performance-critical applications: we observed severe performance variability (up to 6x). Providers lack visibility into user functions and hence find it challenging to right-size them: we observed heavy resource underutilization (up to 80%). To understand the causes behind the performance variability and underutilization, we conducted a measurement study of commonly deployed serverless functions and learned that the function performance and resource utilization depend crucially on function semantics and inputs. Our key insight is to delay making resource allocation decisions until after the function inputs are available. We introduce Shabari, a resource management framework for serverless systems that makes decisions as late as possible to right-size each invocation to meet functions' performance objectives (SLOs) and improve resource utilization. Shabari uses an online learning agent to right-size each function invocation based on the features of the function input and makes cold-start-aware scheduling decisions. For a range of serverless functions and inputs, Shabari reduces SLO violations by 11-73% while not wasting any vCPUs and reducing wasted memory by 64-94% in the median case, compared to state-of-the-art systems, including Aquatope, Parrotfish, and Cypress.

Related papers

Network Resource Optimization for ML-Based UAV Condition Monitoring with Vibration Analysis [54.550658461477106]
Condition Monitoring (CM) uses Machine Learning (ML) models to identify abnormal and adverse conditions. This work explores the optimization of network resources for ML-based UAV CM frameworks. By leveraging dimensionality reduction techniques, there is a 99.9% reduction in network resource consumption.
arXiv Detail & Related papers (2025-02-21T14:36:12Z)
Input-Based Ensemble-Learning Method for Dynamic Memory Configuration of Serverless Computing Functions [18.36339203254509]
We present MemFigLess, a serverless solution that estimates the memory requirement of a serverless function with input-awareness. MemFigLess is able to capture the input-aware resource relationships, allocate up to 82% fewer resources, and save up to 87% in run-time costs.
arXiv Detail & Related papers (2024-11-12T00:03:11Z)
SeBS-Flow: Benchmarking Serverless Cloud Function Workflows [51.4200085836966]
We propose the first serverless workflow benchmarking suite SeBS-Flow. SeBS-Flow includes six real-world application benchmarks and four microbenchmarks representing different computational patterns. We conduct comprehensive evaluations on three major cloud platforms, assessing performance, cost, scalability, and runtime deviations.
arXiv Detail & Related papers (2024-10-04T14:52:18Z)
SPES: Towards Optimizing Performance-Resource Trade-Off for Serverless Functions [31.01399126339857]
Serverless computing is gaining traction due to its efficiency and ability to harness on-demand cloud resources. Existing solutions tend to use over-simplistic strategies for function pre-loading/unloading without full invocation pattern exploitation. We propose SPES, the first differentiated scheduler for runtime cold start mitigation by optimizing serverless function provision.
arXiv Detail & Related papers (2024-03-26T10:28:41Z)
Adaptive Resource Allocation for Virtualized Base Stations in O-RAN with Online Learning [60.17407932691429]
Open Radio Access Network systems, with their virtualized base stations (vBSs), offer operators the benefits of increased flexibility, reduced costs, vendor diversity, and interoperability. We propose an online learning algorithm that balances the effective throughput and vBS energy consumption, even under unforeseeable and "challenging" environments. We prove the proposed solutions achieve sub-linear regret, providing zero average optimality gap even in challenging environments.
arXiv Detail & Related papers (2023-09-04T17:30:21Z)
On-demand Cold Start Frequency Reduction with Off-Policy Reinforcement Learning in Serverless Computing [18.36339203254509]
The presented work focuses on reducing the frequent, on-demand cold starts on the platform by using Reinforcement Learning (RL). The proposed approach uses model-free Q-learning that considers function metrics such as CPU utilization, existing function instances, and response failure rate to proactively initialize functions in advance. The evaluation results demonstrate a favourable performance of the RL-based agent when compared to Kubeless' default policy and a function keep-alive policy.
arXiv Detail & Related papers (2023-08-15T03:01:41Z)
Non-Convex Bilevel Optimization with Time-Varying Objective Functions [57.299128109226025]
We propose an online bilevel optimization where the functions can be time-varying and the agent continuously updates the decisions with online data. Compared to existing algorithms, SOBOW is computationally efficient and does not need to know previous functions. We show that SOBOW can achieve a sublinear bilevel local regret under mild conditions.
arXiv Detail & Related papers (2023-08-07T06:27:57Z)
SCOPE: Performance Testing for Serverless Computing [16.9571718076286]
We propose SCOPE, the first serverless computing-oriented performance testing approach. SCOPE provides testing results with 97.25% accuracy, 33.83 percentage points higher than the best currently available technique.
arXiv Detail & Related papers (2023-06-02T15:29:28Z)
Learning to Optimize Permutation Flow Shop Scheduling via Graph-based Imitation Learning [70.65666982566655]
Permutation flow shop scheduling (PFSS) is widely used in manufacturing systems. We propose to train the model via expert-driven imitation learning, which accelerates convergence more stably and accurately. Our model's network parameters are reduced to only 37% of theirs, and the solution gap of our model towards the expert solutions decreases from 6.8% to 1.3% on average.
arXiv Detail & Related papers (2022-10-31T09:46:26Z)
Gradient and Projection Free Distributed Online Min-Max Resource Optimization [26.681658600897688]
We consider distributed online min-max resource allocation with a set of parallel agents. We propose a novel online strategy called Distributed Online resource Re-Allocation (DORA). DORA does not require gradient calculation or projection operation, unlike most existing online strategies.
arXiv Detail & Related papers (2021-12-07T18:42:07Z)
Harvesting Idle Resources in Serverless Computing via Reinforcement Learning [7.346628578439277]
FRM maximizes resource efficiency by dynamically harvesting idle resources from over-supplied functions and giving them to under-supplied functions. FRM monitors each function's resource utilization in real-time, detects over-provisioning and under-provisioning, and applies deep reinforcement learning to harvest idle resources safely. We have implemented and deployed an FRM prototype in a 13-node Apache OpenWhisk cluster.
arXiv Detail & Related papers (2021-08-28T23:02:56Z)
Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning [61.29990368322931]
Pollux improves scheduling performance in deep learning (DL) clusters by adaptively co-optimizing inter-dependent factors. Pollux reduces average job completion times by 37-50% relative to state-of-the-art DL schedulers.
arXiv Detail & Related papers (2020-08-27T16:56:48Z)
Dynamic Parameter Allocation in Parameter Servers [74.250687861348]
We propose to integrate dynamic parameter allocation into parameter servers and describe an efficient implementation of such a parameter server called Lapse. We found that Lapse provides near-linear scaling and can be orders of magnitude faster than existing parameter servers.
arXiv Detail & Related papers (2020-02-03T11:37:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.