SPES: Towards Optimizing Performance-Resource Trade-Off for Serverless Functions
- URL: http://arxiv.org/abs/2403.17574v1
- Date: Tue, 26 Mar 2024 10:28:41 GMT
- Title: SPES: Towards Optimizing Performance-Resource Trade-Off for Serverless Functions
- Authors: Cheryl Lee, Zhouruixin Zhu, Tianyi Yang, Yintong Huo, Yuxin Su, Pinjia He, Michael R. Lyu
- Abstract summary: Serverless computing is gaining traction due to its efficiency and ability to harness on-demand cloud resources.
Existing solutions tend to use over-simplistic strategies for function pre-loading/unloading without full invocation pattern exploitation.
We propose SPES, the first differentiated scheduler for runtime cold start mitigation by optimizing serverless function provision.
- Score: 31.01399126339857
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: As an emerging cloud computing deployment paradigm, serverless computing is gaining traction due to its efficiency and ability to harness on-demand cloud resources. However, a significant hurdle remains in the form of the cold start problem, causing latency when launching new function instances from scratch. Existing solutions tend to use over-simplistic strategies for function pre-loading/unloading without full invocation pattern exploitation, rendering unsatisfactory optimization of the trade-off between cold start latency and resource waste. To bridge this gap, we propose SPES, the first differentiated scheduler for runtime cold start mitigation by optimizing serverless function provision. Our insight is that the common architecture of serverless systems prompts the concentration of certain invocation patterns, leading to predictable invocation behaviors. This allows us to categorize functions and pre-load/unload proper function instances with finer-grained strategies based on accurate invocation prediction. Experiments demonstrate the success of SPES in optimizing serverless function provision on both sides: reducing the 75th-percentile cold start rates by 49.77% and the wasted memory time by 56.43%, compared to the state-of-the-art. By mitigating the cold start issue, SPES is a promising advancement in facilitating cloud services deployed on serverless architectures.
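The trade-off the abstract refers to, cold start rate versus wasted memory time, can be illustrated with a toy simulation. The sketch below is not SPES itself: it models the simplest baseline, a fixed keep-alive window, and the function name `simulate_keep_alive`, the zero-execution-time simplification, and the definition of wasted time as "loaded but idle" are all assumptions made for illustration.

```python
from typing import List, Tuple


def simulate_keep_alive(invocations: List[float], keep_alive: float) -> Tuple[float, float]:
    """Return (cold_start_rate, wasted_time) for one function.

    Hypothetical model: `invocations` are sorted arrival times, execution
    time is treated as zero, and an instance stays loaded for `keep_alive`
    time units after each invocation before being unloaded.
    """
    if not invocations:
        return 0.0, 0.0

    cold_starts = 1  # the first invocation never finds a loaded instance
    wasted = 0.0     # time an instance was loaded but idle

    for prev, cur in zip(invocations, invocations[1:]):
        gap = cur - prev
        if gap > keep_alive:
            # The instance idled for the whole window, was unloaded,
            # and the next invocation had to start cold.
            cold_starts += 1
            wasted += keep_alive
        else:
            # The instance idled until it was reused (warm start).
            wasted += gap

    wasted += keep_alive  # idle window after the final invocation
    return cold_starts / len(invocations), wasted


if __name__ == "__main__":
    trace = [0, 5, 30, 32]  # made-up arrival times
    for window in (10, 30):
        rate, idle = simulate_keep_alive(trace, window)
        print(f"keep_alive={window}: cold start rate={rate:.2f}, idle time={idle}")
```

On this made-up trace, lengthening the window from 10 to 30 lowers the cold start rate but increases idle time, which is the tension a prediction-based scheduler tries to ease by loading instances shortly before an expected invocation rather than keeping them resident.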
Related papers
- SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, utilizing minimal late pre-trained layers could alleviate the peak demand on memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z)
- DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models [55.608981341747246]
We introduce Data Adaptive Self-Supervised Early Exit (DAISY), an approach that decides when to exit based on the self-supervised loss.
Our analysis of the adaptivity of DAISY shows that the model exits early (using fewer layers) on clean data and exits late (using more layers) on noisy data.
arXiv Detail & Related papers (2024-06-08T12:58:13Z)
- Shabari: Delayed Decision-Making for Faster and Efficient Serverless Functions [0.30693357740321775]
We introduce Shabari, a resource management framework for serverless systems.
Shabari makes decisions as late as possible to right-size each invocation to meet functions' performance objectives.
For a range of serverless functions and inputs, Shabari reduces SLO violations by 11-73%.
arXiv Detail & Related papers (2024-01-16T22:20:36Z)
- Federated Learning of Large Language Models with Parameter-Efficient Prompt Tuning and Adaptive Optimization [71.87335804334616]
Federated learning (FL) is a promising paradigm to enable collaborative model training with decentralized data.
The training process of Large Language Models (LLMs) generally requires updating a significant number of parameters.
This paper proposes an efficient partial prompt tuning approach to improve performance and efficiency simultaneously.
arXiv Detail & Related papers (2023-10-23T16:37:59Z)
- Latency-aware Unified Dynamic Networks for Efficient Image Recognition [72.8951331472913]
LAUDNet is a framework to bridge the theoretical and practical efficiency gap in dynamic networks.
It integrates three primary dynamic paradigms: spatially adaptive computation, dynamic layer skipping, and dynamic channel skipping.
It can notably reduce the latency of models like ResNet by over 50% on platforms such as V100, 3090, and TX2 GPUs.
arXiv Detail & Related papers (2023-08-30T10:57:41Z)
- Reinforcement Learning (RL) Augmented Cold Start Frequency Reduction in Serverless Computing [20.638612359627952]
This work focuses on reducing the frequency of cold starts on the Function-as-a-Service platform by using Reinforcement Learning.
The proposed solution was implemented on Kubeless and was evaluated using a normalised real-world function demand trace with matrix multiplication as the workload.
arXiv Detail & Related papers (2023-08-15T03:01:41Z)
- Instance-aware Dynamic Prompt Tuning for Pre-trained Point Cloud Models [64.49254199311137]
We propose a novel Instance-aware Dynamic Prompt Tuning (IDPT) strategy for pre-trained point cloud models.
The essence of IDPT is to develop a dynamic prompt generation module to perceive semantic prior features of each point cloud instance.
In experiments, IDPT outperforms full fine-tuning in most tasks with a mere 7% of the trainable parameters.
arXiv Detail & Related papers (2023-04-14T16:03:09Z)
- DeF-DReL: Systematic Deployment of Serverless Functions in Fog and Cloud environments using Deep Reinforcement Learning [8.204696165200577]
Fog environments make their limited resources available to a large number of users for deploying serverless applications.
Recent research mainly focuses on assigning maximum resources to such applications from the fog node rather than taking full advantage of the cloud environment.
We propose DeF-DReL, a Systematic Deployment of Serverless Functions in Fog and Cloud environments using Deep Reinforcement Learning.
arXiv Detail & Related papers (2021-10-29T12:10:54Z)
- Harvesting Idle Resources in Serverless Computing via Reinforcement Learning [7.346628578439277]
FRM maximizes resource efficiency by dynamically harvesting idle resources from functions over-supplied to functions under-supplied.
FRM monitors each function's resource utilization in real-time, detects over-provisioning and under-provisioning, and applies deep reinforcement learning to harvest idle resources safely.
We have implemented and deployed an FRM prototype in a 13-node Apache OpenWhisk cluster.
arXiv Detail & Related papers (2021-08-28T23:02:56Z)
- Optimal Resource Allocation for Serverless Queries [8.59568779761598]
Prior work focused on predicting peak allocation while ignoring aggressive trade-offs between resource allocation and run-time.
We introduce a system for optimal resource allocation that can predict performance with aggressive trade-offs, for both new and past observed queries.
arXiv Detail & Related papers (2021-07-19T02:55:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.