Performance Modeling of Metric-Based Serverless Computing Platforms
- URL: http://arxiv.org/abs/2202.11247v1
- Date: Wed, 23 Feb 2022 00:39:01 GMT
- Authors: Nima Mahmoudi, Hamzeh Khazaei
- Abstract summary: The proposed performance model can help developers and providers predict the performance and cost of deployments with different configurations.
We validate the applicability and accuracy of the proposed performance model by extensive real-world experimentation on Knative.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Analytical performance models are very effective in ensuring the quality of
service and cost of service deployment remain desirable under different
conditions and workloads. While various analytical performance models have been
proposed for previous paradigms in cloud computing, serverless computing lacks
such models that can provide developers with performance guarantees. Moreover,
most serverless computing platforms still require developers to specify a
deployment configuration that affects both the performance and the cost of
their deployment, without giving them any direct and immediate feedback. In
previous studies, we built such performance models for
steady-state and transient analysis of scale-per-request serverless computing
platforms (e.g., AWS Lambda, Azure Functions, Google Cloud Functions) that
could give developers immediate feedback about the quality of service and cost
of their deployments. In this work, we aim to develop analytical performance
models for the latest trend in serverless computing platforms that use
concurrency value and the rate of requests per second for autoscaling
decisions. Examples of such serverless computing platforms are Knative and
Google Cloud Run (a managed Knative service by Google). The proposed
performance model can help developers and providers predict the performance and
cost of deployments with different configurations, which could help them tune
the configuration toward the best outcome. We validate the applicability and
accuracy of the proposed performance model by extensive real-world
experimentation on Knative and show that our performance model is able to
accurately predict the steady-state characteristics of a given workload with
a minimal amount of data collection.
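The concurrency-based scaling rule used by platforms such as Knative can be sketched with Little's law: the average number of in-flight requests equals the arrival rate times the average service time, and the autoscaler provisions enough instances to keep per-instance concurrency at or below a target. The function below is an illustrative back-of-the-envelope estimate under these assumptions, not the paper's analytical model; the names and parameters are hypothetical.

```python
import math

def steady_state_instances(arrival_rate, service_time, target_concurrency):
    """Rough steady-state instance count for a concurrency-based autoscaler.

    arrival_rate: average requests per second
    service_time: average time to serve one request, in seconds
    target_concurrency: per-instance concurrency target (e.g. Knative's
    containerConcurrency / target annotation)
    """
    # Little's law: average in-flight requests L = lambda * W
    total_concurrency = arrival_rate * service_time
    # Provision enough instances so per-instance concurrency <= target
    return max(1, math.ceil(total_concurrency / target_concurrency))

# Example: 100 req/s at 0.5 s per request is 50 in-flight requests;
# with a target concurrency of 10, that calls for 5 instances.
print(steady_state_instances(100, 0.5, 10))
```

This ignores cold starts, queueing delay, and transient behavior, which is precisely where an analytical performance model like the one proposed here goes further than a simple capacity estimate.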
Related papers
- Observational Scaling Laws and the Predictability of Language Model Performance [51.2336010244645]
We propose an observational approach that bypasses model training and instead builds scaling laws from 80 publicly available models.
We show that several emergent phenomena follow a smooth, sigmoidal behavior and are predictable from small models.
We show how to predict the impact of post-training interventions like Chain-of-Thought and Self-Consistency as language model capabilities continue to improve.
arXiv Detail & Related papers (2024-05-17T17:49:44Z) - QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights improves the absolute performance of the Llama 2 model by up to 15 percentage points, for example.
arXiv Detail & Related papers (2023-11-06T00:21:44Z) - Predicting Resource Consumption of Kubernetes Container Systems using Resource Models [3.138731415322007]
This paper considers how to derive resource models for cloud systems empirically.
We do so based on models of deployed services in a formal language with explicit adherence to CPU and memory resources.
We report on leveraging data collected empirically from small deployments to simulate the execution of higher intensity scenarios on larger deployments.
arXiv Detail & Related papers (2023-05-12T17:59:01Z) - Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs [66.30706841821123]
Large language models (LLMs) power many state-of-the-art systems in natural language processing.
LLMs are extremely computationally expensive, even at inference time.
We propose a new metric for comparing inference efficiency across models.
arXiv Detail & Related papers (2023-05-03T21:51:42Z) - A Control-Centric Benchmark for Video Prediction [69.22614362800692]
We propose a benchmark for action-conditioned video prediction in the form of a control benchmark.
Our benchmark includes simulated environments with 11 task categories and 310 task instance definitions.
We then leverage our benchmark to study the effects of scaling model size, quantity of training data, and model ensembling.
arXiv Detail & Related papers (2023-04-26T17:59:45Z) - Measuring the Driving Forces of Predictive Performance: Application to Credit Scoring [0.0]
In credit scoring, machine learning models are known to outperform standard parametric models.
We introduce the XPER methodology to decompose a performance metric into contributions associated with a model.
We show that a small number of features can explain a surprisingly large part of the model performance.
arXiv Detail & Related papers (2022-12-12T13:09:46Z) - Plex: Towards Reliability using Pretrained Large Model Extensions [69.13326436826227]
We develop ViT-Plex and T5-Plex, pretrained large model extensions for vision and language modalities, respectively.
Plex greatly improves the state-of-the-art across reliability tasks, and simplifies the traditional protocol.
We demonstrate scaling effects over model sizes up to 1B parameters and pretraining dataset sizes up to 4B examples.
arXiv Detail & Related papers (2022-07-15T11:39:37Z) - Serving and Optimizing Machine Learning Workflows on Heterogeneous Infrastructures [9.178035808110124]
JellyBean is a framework for serving and optimizing machine learning inference on heterogeneous infrastructures.
We show that JellyBean reduces the total serving cost of visual question answering by up to 58%, and vehicle tracking from the NVIDIA AI City Challenge by up to 36%.
arXiv Detail & Related papers (2022-05-10T07:32:32Z) - SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
Current approaches, however, do not supply the procedures and pipelines needed for the actual deployment of machine learning capabilities in real production-grade systems.
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
arXiv Detail & Related papers (2021-12-22T14:45:37Z) - Serverless Model Serving for Data Science [23.05534539170047]
We study the viability of serverless as a mainstream model serving platform for data science applications.
We find that serverless outperforms many cloud-based alternatives with respect to cost and performance.
We present several practical recommendations for data scientists on how to use serverless for scalable and cost-effective model serving.
arXiv Detail & Related papers (2021-03-04T11:23:01Z) - Benchmarking and Performance Modelling of MapReduce Communication Pattern [0.0]
Models can be used to infer the performance of unseen applications and approximate their performance when an arbitrary dataset is used as input.
Our approach is validated by running empirical experiments in two setups.
arXiv Detail & Related papers (2020-05-23T21:52:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.