Related papers: SPES: Towards Optimizing Performance-Resource Trade-Off for Serverless Functions

URL: http://arxiv.org/abs/2403.17574v2
Date: Wed, 21 Aug 2024 09:06:36 GMT
Title: SPES: Towards Optimizing Performance-Resource Trade-Off for Serverless Functions
Authors: Cheryl Lee, Zhouruixing Zhu, Tianyi Yang, Yintong Huo, Yuxin Su, Pinjia He, Michael R. Lyu
Abstract summary: Serverless computing is gaining traction due to its efficiency and ability to harness on-demand cloud resources. Existing solutions tend to use over-simplistic strategies for function pre-loading/unloading without full invocation pattern exploitation. We propose SPES, the first differentiated scheduler for runtime cold start mitigation by optimizing serverless function provision.
Score: 31.01399126339857
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: As an emerging cloud computing deployment paradigm, serverless computing is gaining traction due to its efficiency and ability to harness on-demand cloud resources. However, a significant hurdle remains in the form of the cold start problem, causing latency when launching new function instances from scratch. Existing solutions tend to use over-simplistic strategies for function pre-loading/unloading without full invocation pattern exploitation, rendering unsatisfactory optimization of the trade-off between cold start latency and resource waste. To bridge this gap, we propose SPES, the first differentiated scheduler for runtime cold start mitigation by optimizing serverless function provision. Our insight is that the common architecture of serverless systems prompts the concentration of certain invocation patterns, leading to predictable invocation behaviors. This allows us to categorize functions and pre-load/unload proper function instances with finer-grained strategies based on accurate invocation prediction. Experiments demonstrate the success of SPES in optimizing serverless function provision on both sides: reducing the 75th-percentile cold start rates by 49.77% and the wasted memory time by 56.43%, compared to the state-of-the-art. By mitigating the cold start issue, SPES is a promising advancement in facilitating cloud services deployed on serverless architectures.
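The trade-off named in the abstract, cold start rate versus wasted memory time, can be illustrated with a minimal sketch. This is not the SPES scheduler: it models a plain fixed keep-alive baseline for a single function, and the function name, the parameters, and the definition of "wasted time" as loaded-but-idle minutes are all assumptions made for illustration.

```python
from typing import List, Tuple


def simulate_keep_alive(
    arrivals: List[float], keep_alive: float, horizon: float
) -> Tuple[float, float]:
    """Return (cold_start_rate, wasted_time) for one function.

    Hypothetical model, not taken from the paper:
    - arrivals: sorted invocation timestamps in minutes
    - executions are treated as instantaneous
    - after each invocation the instance stays loaded for keep_alive minutes
    - wasted_time counts minutes the instance is loaded but idle
    """
    cold_starts = 0
    wasted = 0.0
    last = None
    for t in arrivals:
        if last is None:
            # First invocation: nothing is loaded yet.
            cold_starts += 1
        elif t - last > keep_alive:
            # Instance was unloaded before this call arrived:
            # it idled for the whole window, then a cold start follows.
            cold_starts += 1
            wasted += keep_alive
        else:
            # Warm start: instance idled from the previous call until now.
            wasted += t - last
        last = t
    if last is not None:
        # Idle tail after the final invocation, capped by the trace end.
        wasted += min(keep_alive, horizon - last)
    rate = cold_starts / len(arrivals) if arrivals else 0.0
    return rate, wasted


trace = [0, 5, 30, 32]  # made-up invocation times in minutes
print(simulate_keep_alive(trace, keep_alive=10, horizon=60))  # (0.5, 27.0)
print(simulate_keep_alive(trace, keep_alive=30, horizon=60))  # (0.25, 60.0)
```

On this made-up trace, tripling the keep-alive window halves the cold start rate but more than doubles the idle time, which is the tension a fixed policy cannot resolve on its own.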

Related papers

Transformer-Based Model for Cold Start Mitigation in FaaS Architecture [44.99833362998488]
Cold start occurs when an idle FaaS function is invoked, requiring a full initialization process, which increases latency and degrades user experience. Existing solutions for cold start mitigation are limited in terms of invocation pattern generalization and implementation complexity. We propose an innovative approach leveraging Transformer models to mitigate the impact of cold starts in FaaS architectures.
arXiv Detail & Related papers (2025-04-15T16:12:07Z)
Task-Oriented Feature Compression for Multimodal Understanding via Device-Edge Co-Inference [49.77734021302196]
We propose a task-oriented feature compression (TOFC) method for multimodal understanding in a device-edge co-inference framework. To enhance compression efficiency, multiple entropy models are adaptively selected based on the characteristics of the visual features. Results show that TOFC achieves up to 60% reduction in data transmission overhead and 50% reduction in system latency.
arXiv Detail & Related papers (2025-03-17T08:37:22Z)
SeBS-Flow: Benchmarking Serverless Cloud Function Workflows [51.4200085836966]
We propose the first serverless workflow benchmarking suite SeBS-Flow. SeBS-Flow includes six real-world application benchmarks and four microbenchmarks representing different computational patterns. We conduct comprehensive evaluations on three major cloud platforms, assessing performance, cost, scalability, and runtime deviations.
arXiv Detail & Related papers (2024-10-04T14:52:18Z)
Forecast-PEFT: Parameter-Efficient Fine-Tuning for Pre-trained Motion Forecasting Models [68.23649978697027]
Forecast-PEFT is a fine-tuning strategy that freezes the majority of the model's parameters, focusing adjustments on newly introduced prompts and adapters. Our experiments show that Forecast-PEFT outperforms traditional full fine-tuning methods in motion prediction tasks. Forecast-FT further improves prediction performance, evidencing up to a 9.6% enhancement over conventional baseline methods.
arXiv Detail & Related papers (2024-07-28T19:18:59Z)
SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios. In the early route, intermediate outputs are consolidated via an anti-redundancy operation. In the late route, utilizing minimal late pre-trained layers could alleviate the peak demand on memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z)
DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models [55.608981341747246]
We introduce Data Adaptive Self-Supervised Early Exit (DAISY), an approach that decides when to exit based on the self-supervised loss. Our analysis of the adaptivity of DAISY shows that the model exits early (using fewer layers) on clean data and exits late (using more layers) on noisy data.
arXiv Detail & Related papers (2024-06-08T12:58:13Z)
Shabari: Delayed Decision-Making for Faster and Efficient Serverless Functions [0.30693357740321775]
We introduce Shabari, a resource management framework for serverless systems. Shabari makes decisions as late as possible to right-size each invocation to meet functions' performance objectives. For a range of serverless functions and inputs, Shabari reduces SLO violations by 11-73%.
arXiv Detail & Related papers (2024-01-16T22:20:36Z)
On-demand Cold Start Frequency Reduction with Off-Policy Reinforcement Learning in Serverless Computing [18.36339203254509]
The presented work focuses on reducing the frequent, on-demand cold starts on the platform by using Reinforcement Learning (RL). The proposed approach uses model-free Q-learning that considers function metrics such as CPU utilization, existing function instances, and response failure rate to proactively initialize functions in advance. The evaluation results demonstrate a favourable performance of the RL-based agent when compared to Kubeless' default policy and a function keep-alive policy.
arXiv Detail & Related papers (2023-08-15T03:01:41Z)
SCOPE: Performance Testing for Serverless Computing [16.9571718076286]
We propose SCOPE, the first serverless computing-oriented performance testing approach. SCOPE provides testing results with 97.25% accuracy, 33.83 percentage points higher than the best currently available technique.
arXiv Detail & Related papers (2023-06-02T15:29:28Z)
Instance-aware Dynamic Prompt Tuning for Pre-trained Point Cloud Models [64.49254199311137]
We propose a novel Instance-aware Dynamic Prompt Tuning (IDPT) strategy for pre-trained point cloud models. The essence of IDPT is to develop a dynamic prompt generation module to perceive semantic prior features of each point cloud instance. In experiments, IDPT outperforms full fine-tuning in most tasks with a mere 7% of the trainable parameters.
arXiv Detail & Related papers (2023-04-14T16:03:09Z)
DeF-DReL: Systematic Deployment of Serverless Functions in Fog and Cloud environments using Deep Reinforcement Learning [8.204696165200577]
Fog environments make their limited resources available to a large number of users to deploy their serverless applications. Recent research mainly focuses on assigning maximum resources to such applications from the fog node, without taking full advantage of the cloud environment. We propose DeF-DReL, a Systematic Deployment of Serverless Functions in Fog and Cloud environments using Deep Reinforcement Learning.
arXiv Detail & Related papers (2021-10-29T12:10:54Z)
Harvesting Idle Resources in Serverless Computing via Reinforcement Learning [7.346628578439277]
FRM maximizes resource efficiency by dynamically harvesting idle resources from over-supplied functions and giving them to under-supplied functions. FRM monitors each function's resource utilization in real time, detects over-provisioning and under-provisioning, and applies deep reinforcement learning to harvest idle resources safely. We have implemented and deployed an FRM prototype in a 13-node Apache OpenWhisk cluster.
arXiv Detail & Related papers (2021-08-28T23:02:56Z)
Optimal Resource Allocation for Serverless Queries [8.59568779761598]
Prior work focused on predicting peak allocation while ignoring aggressive trade-offs between resource allocation and run-time. We introduce a system for optimal resource allocation that can predict performance with aggressive trade-offs, for both new and past observed queries.
arXiv Detail & Related papers (2021-07-19T02:55:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.