Predicting Resource Consumption of Kubernetes Container Systems using
Resource Models
- URL: http://arxiv.org/abs/2305.07651v1
- Date: Fri, 12 May 2023 17:59:01 GMT
- Title: Predicting Resource Consumption of Kubernetes Container Systems using
Resource Models
- Authors: Gianluca Turin, Andrea Borgarelli, Simone Donetti, Ferruccio Damiani,
Einar Broch Johnsen, Silvia Lizeth Tapia Tarifa
- Abstract summary: This paper considers how to derive resource models for cloud systems empirically.
We do so based on models of deployed services in a formal modeling language with explicit CPU and memory resources.
We report on leveraging data collected empirically from small deployments to simulate the execution of higher intensity scenarios on larger deployments.
- Score: 3.138731415322007
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cloud computing has radically changed the way organisations operate their
software by allowing them to achieve high availability of services at
affordable cost. Containerized microservices are an enabling technology for this
change, and advanced container orchestration platforms such as Kubernetes are
used for service management. Despite the flourishing ecosystem of monitoring
tools for such orchestration platforms, service management is still mainly a
manual effort.
The modeling of cloud computing systems is an essential step towards
automatic management, but the modeling of cloud systems of such complexity
remains challenging and, as yet, unaddressed. In fact, modeling resource
consumption will be key to comparing the outcomes of possible deployment
scenarios. This paper considers how to derive resource models for cloud systems
empirically. We do so based on models of deployed services in a formal modeling
language with explicit CPU and memory resources; once the model's adherence to the real
system is good enough, formal properties can be verified in the model.
Targeting a representative microservices application, we present a model of
Kubernetes developed in Real-Time ABS. We report on leveraging data collected
empirically from small deployments to simulate the execution of higher
intensity scenarios on larger deployments. We discuss the challenges and
limitations that arise from this approach, and identify constraints under which
we obtain satisfactory accuracy.
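As a rough illustration of the empirical calibration idea in the abstract, the sketch below fits per-request CPU and memory costs from measurements of a small deployment and extrapolates them to a higher-intensity scenario on a larger deployment. This is not the paper's Real-Time ABS model; the measurement values, the replica count, and the assumption of linear scaling are all illustrative.

```python
# Minimal sketch: derive a linear resource model from small-deployment
# measurements and extrapolate to a higher-intensity scenario.
# All figures and the linearity assumption are illustrative, not from the paper.
import numpy as np

# Hypothetical measurements from a small deployment: request rate (req/s)
# vs. observed CPU (millicores) and memory (MiB) per service replica.
rates = np.array([10, 20, 40, 80])
cpu_millicores = np.array([120, 210, 400, 790])
mem_mib = np.array([260, 270, 300, 350])

def fit_linear(x, y):
    """Least-squares fit of y ~ slope * x + intercept."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept

cpu_per_req, cpu_base = fit_linear(rates, cpu_millicores)
mem_per_req, mem_base = fit_linear(rates, mem_mib)

def predict(rate, replicas):
    """Estimate total CPU/memory for a larger deployment at a higher rate,
    assuming the load is spread evenly over the replicas."""
    per_replica_rate = rate / replicas
    cpu = replicas * (cpu_base + cpu_per_req * per_replica_rate)
    mem = replicas * (mem_base + mem_per_req * per_replica_rate)
    return cpu, mem

# Simulate a higher-intensity scenario: 500 req/s spread over 8 replicas.
cpu_total, mem_total = predict(rate=500, replicas=8)
print(f"Estimated CPU: {cpu_total:.0f} millicores, memory: {mem_total:.0f} MiB")
```

In the paper itself, empirically derived parameters of this kind instead feed a Real-Time ABS model of Kubernetes with explicit CPU and memory resources, so that larger deployments can be simulated and formal properties checked once the model tracks the real system closely enough.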
Related papers
- Adaptable Embeddings Network (AEN) [49.1574468325115]
We introduce Adaptable Embeddings Networks (AEN), a novel dual-encoder architecture using Kernel Density Estimation (KDE).
AEN allows for runtime adaptation of classification criteria without retraining and is non-autoregressive.
The architecture's ability to preprocess and cache condition embeddings makes it ideal for edge computing applications and real-time monitoring systems.
arXiv Detail & Related papers (2024-11-21T02:15:52Z)
- Reusable MLOps: Reusable Deployment, Reusable Infrastructure and
Hot-Swappable Machine Learning models and services [0.0]
We introduce a new sustainable concept in the field of AI/ML operations - called Reusable MLOps.
We reuse the existing deployment and infrastructure to serve new models by hot-swapping them without tearing down the infrastructure or the microservice.
arXiv Detail & Related papers (2024-02-19T23:40:46Z)
- QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights improves, for example, the performance of the Llama 2 model by up to 15% points (relative).
arXiv Detail & Related papers (2023-11-06T00:21:44Z)
- Continual-Learning-as-a-Service (CLaaS): On-Demand Efficient Adaptation
of Predictive Models [17.83007940710455]
Two main future trends for companies that want to build machine learning-based applications are real-time inference and continual updating.
This paper defines a novel software service and model delivery infrastructure termed Continual Learning-as-a-Service (CLaaS) to address these issues.
It provides support for model updating and validation tools for data scientists without an on-premise solution and in an efficient, stateful and easy-to-use manner.
arXiv Detail & Related papers (2022-06-14T16:22:54Z)
- Real-time Neural-MPC: Deep Learning Model Predictive Control for
Quadrotors and Agile Robotic Platforms [59.03426963238452]
We present Real-time Neural MPC, a framework to efficiently integrate large, complex neural network architectures as dynamics models within a model-predictive control pipeline.
We show the feasibility of our framework on real-world problems by reducing the positional tracking error by up to 82% when compared to state-of-the-art MPC approaches without neural network dynamics.
arXiv Detail & Related papers (2022-03-15T09:38:15Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
Existing approaches, however, do not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production-grade systems.
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor frameworks and script language engines.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- Power Modeling for Effective Datacenter Planning and Compute Management [53.41102502425513]
We discuss two classes of statistical power models designed and validated to be accurate, simple, interpretable and applicable to all hardware configurations and workloads.
We demonstrate that the proposed statistical modeling techniques, while simple and scalable, predict power with less than 5% Mean Absolute Percent Error (MAPE) for more than 95% of over 2000 diverse Power Distribution Units, using only 4 features.
arXiv Detail & Related papers (2021-03-22T21:22:51Z)
- Scalable Deployment of AI Time-series Models for IoT [0.7169734491710924]
IBM Research Castor is a cloud-native system for managing and deploying time-series models in IoT applications.
Model templates can be deployed against specific instances of semantic concepts.
Results from deployments in real-world smartgrid live forecasting applications are reported.
arXiv Detail & Related papers (2020-03-24T14:27:25Z)
- FlexServe: Deployment of PyTorch Models as Flexible REST Endpoints [6.730473762151365]
The integration of artificial intelligence capabilities into modern software systems is increasingly being simplified through the use of cloud-based services and representational state transfer architecture.
Insufficient information regarding underlying model provenance and the lack of control over model evolution serve as an impediment to the more widespread adoption of these services in many operational environments which have strict security requirements.
arXiv Detail & Related papers (2020-02-29T18:51:09Z)
- Model Reuse with Reduced Kernel Mean Embedding Specification [70.044322798187]
We present a two-phase framework for finding helpful models for a current application.
In the upload phase, when a model is uploaded into the pool, we construct a reduced kernel mean embedding (RKME) as a specification for the model.
Then in the deployment phase, the relatedness of the current task and pre-trained models will be measured based on the value of the RKME specification.
arXiv Detail & Related papers (2020-01-20T15:15:07Z)