Performance Modeling of Metric-Based Serverless Computing Platforms
- URL: http://arxiv.org/abs/2202.11247v1
- Date: Wed, 23 Feb 2022 00:39:01 GMT
- Title: Performance Modeling of Metric-Based Serverless Computing Platforms
- Authors: Nima Mahmoudi, Hamzeh Khazaei
- Abstract summary: The proposed performance model can help developers and providers predict the performance and cost of deployments with different configurations.
We validate the applicability and accuracy of the proposed performance model by extensive real-world experimentation on Knative.
- Score: 5.089110111757978
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Analytical performance models are very effective in ensuring the quality of
service and cost of service deployment remain desirable under different
conditions and workloads. While various analytical performance models have been
proposed for previous paradigms in cloud computing, serverless computing lacks
such models that can provide developers with performance guarantees. Moreover,
most serverless computing platforms still require developers to specify a
deployment configuration that affects both the performance and the cost of the
deployment, without giving them any direct and immediate feedback. In previous
studies, we built such performance models for
steady-state and transient analysis of scale-per-request serverless computing
platforms (e.g., AWS Lambda, Azure Functions, Google Cloud Functions) that
could give developers immediate feedback about the quality of service and cost
of their deployments. In this work, we aim to develop analytical performance
models for the latest trend in serverless computing platforms, which use the
concurrency value and the rate of requests per second for autoscaling
decisions. Examples of such serverless computing platforms are Knative and
Google Cloud Run (a managed Knative service by Google). The proposed
performance model can help developers and providers predict the performance and
cost of deployments with different configurations, which could help them tune
the configuration toward the best outcome. We validate the applicability and
accuracy of the proposed performance model by extensive real-world
experimentation on Knative and show that our performance model is able to
accurately predict the steady-state characteristics of a given workload with
a minimal amount of data collection.
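To make the autoscaling policy concrete, here is a minimal Python sketch of the
concurrency-based scaling rule applied by autoscalers such as Knative's KPA:
the desired instance count is the observed average concurrency divided by the
per-instance target, rounded up. This is an illustration of the mechanism the
paper models, not code from the paper; the function and parameter names are my
own.

```python
import math

def desired_instances(observed_concurrency: float,
                      target_concurrency: float,
                      max_scale: int = 100) -> int:
    """Concurrency-based scaling rule, as in Knative's KPA autoscaler:
    run enough instances that each one sees roughly `target_concurrency`
    in-flight requests on average."""
    if observed_concurrency <= 0:
        return 0  # scale to zero when the service is idle
    wanted = math.ceil(observed_concurrency / target_concurrency)
    return min(wanted, max_scale)

# Example: 42 concurrent requests observed, target of 10 per instance
print(desired_instances(42, 10))  # -> 5 instances
```

A steady-state performance model for such a platform must then predict, for a
given workload, the observed concurrency and hence the instance count, response
time, and cost, which is what the paper sets out to do analytically.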
Related papers
- Tracing Optimization for Performance Modeling and Regression Detection [15.99435412859094]
A performance model analytically describes the relationship between the performance of a system and its runtime activities.
We propose statistical approaches to reduce tracing overhead by identifying and excluding performance-insensitive code regions.
Our approach is fully automated, making it ready to be used in production environments with minimal human effort.
arXiv Detail & Related papers (2024-11-26T16:11:55Z)
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- SeBS-Flow: Benchmarking Serverless Cloud Function Workflows [51.4200085836966]
We propose SeBS-Flow, the first serverless workflow benchmarking suite.
SeBS-Flow includes six real-world application benchmarks and four microbenchmarks representing different computational patterns.
We conduct comprehensive evaluations on three major cloud platforms, assessing performance, cost, scalability, and runtime deviations.
arXiv Detail & Related papers (2024-10-04T14:52:18Z)
- QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights improves the absolute performance of the Llama 2 model by up to 15 percentage points.
arXiv Detail & Related papers (2023-11-06T00:21:44Z)
- Predicting Resource Consumption of Kubernetes Container Systems using Resource Models [3.138731415322007]
This paper considers how to derive resource models for cloud systems empirically.
We do so based on models of deployed services in a formal language with explicit adherence to CPU and memory resources.
We report on leveraging data collected empirically from small deployments to simulate the execution of higher-intensity scenarios on larger deployments; a loose sketch of this idea follows this entry.
arXiv Detail & Related papers (2023-05-12T17:59:01Z)
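As promised above, a loose sketch of that extrapolation idea: fit a simple
per-replica resource profile from small-deployment measurements, then query it
at higher intensities. This conveys only the underlying intuition, not the
paper's formal modeling language; all names and numbers are hypothetical.

```python
# Hypothetical illustration: extrapolate per-replica CPU demand measured
# on a small deployment to a larger, higher-intensity scenario.
# This is NOT the paper's formal resource model, just the basic intuition.

def fit_linear(xs, ys):
    """Least-squares fit y = a*x + b from (load, usage) samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    return a, mean_y - a * mean_x

# Measurements from a small deployment:
# (requests/s per replica, CPU usage in millicores)
load = [10, 20, 40]
cpu = [120, 210, 400]

a, b = fit_linear(load, cpu)
# Predict per-replica CPU demand at 3x the highest measured intensity
print(f"predicted CPU at 120 req/s: {a * 120 + b:.0f} millicores")
```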
- Cheaply Evaluating Inference Efficiency Metrics for Autoregressive Transformer APIs [66.30706841821123]
Large language models (LLMs) power many state-of-the-art systems in natural language processing.
LLMs are extremely computationally expensive, even at inference time.
We propose a new metric for comparing inference efficiency across models.
arXiv Detail & Related papers (2023-05-03T21:51:42Z)
- Measuring the Driving Forces of Predictive Performance: Application to Credit Scoring [0.0]
In credit scoring, machine learning models are known to outperform standard parametric models.
We introduce the XPER methodology to decompose a performance metric into contributions associated with the features of a model.
We show that a small number of features can explain a surprisingly large part of the model performance; a toy sketch of this decomposition idea follows this entry.
arXiv Detail & Related papers (2022-12-12T13:09:46Z)
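The sketch below illustrates the decomposition idea in the spirit of XPER:
attribute a performance metric to individual features via Shapley-style
marginal contributions over feature subsets. The toy metric, data, and helper
functions are invented for illustration and are not the paper's implementation.

```python
import itertools
import math

def metric(feature_subset, data):
    """Toy 'performance': accuracy of a fixed linear rule that uses only
    the features in `feature_subset` (all other features are zeroed)."""
    correct = 0
    for x, y in data:
        score = sum(x[i] for i in feature_subset)
        correct += int((score > 0) == y)
    return correct / len(data)

def shapley_contributions(features, data):
    """Exact Shapley values of the metric over all feature subsets."""
    n = len(features)
    phi = {f: 0.0 for f in features}
    for f in features:
        others = [g for g in features if g != f]
        for r in range(n):
            for subset in itertools.combinations(others, r):
                w = math.factorial(r) * math.factorial(n - r - 1) / math.factorial(n)
                gain = metric(set(subset) | {f}, data) - metric(set(subset), data)
                phi[f] += w * gain
    return phi

# Toy dataset: x is a feature vector, y a boolean label
data = [((1.0, -0.2, 0.1), True), ((-0.8, 0.1, 0.0), False),
        ((0.6, 0.5, -0.1), True), ((-0.4, -0.3, 0.2), False)]
print(shapley_contributions([0, 1, 2], data))
```

By construction, the contributions sum to the gap between the metric with all
features and the metric with none, so they answer "which features drive the
performance" in the sense the entry above describes.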
- Plex: Towards Reliability using Pretrained Large Model Extensions [69.13326436826227]
We develop ViT-Plex and T5-Plex, pretrained large model extensions for vision and language modalities, respectively.
Plex greatly improves the state-of-the-art across reliability tasks, and simplifies the traditional protocol.
We demonstrate scaling effects over model sizes up to 1B parameters and pretraining dataset sizes up to 4B examples.
arXiv Detail & Related papers (2022-07-15T11:39:37Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
Current approaches, however, do not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production-grade systems.
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor frameworks and script-language engines.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- Serverless Model Serving for Data Science [23.05534539170047]
We study the viability of serverless as a mainstream model serving platform for data science applications.
We find that serverless outperforms many cloud-based alternatives with respect to cost and performance.
We present several practical recommendations for data scientists on how to use serverless for scalable and cost-effective model serving.
arXiv Detail & Related papers (2021-03-04T11:23:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.