Deployment of ML Models using Kubeflow on Different Cloud Providers
- URL: http://arxiv.org/abs/2206.13655v1
- Date: Mon, 27 Jun 2022 22:46:11 GMT
- Title: Deployment of ML Models using Kubeflow on Different Cloud Providers
- Authors: Aditya Pandey, Maitreya Sonawane, Sumit Mamtani
- Abstract summary: We create end-to-end Machine Learning models on Kubeflow in the form of pipelines.
We analyze various points including the ease of setup, deployment models, performance, limitations and features of the tool.
- Score: 0.17205106391379021
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This project explores the process of deploying machine learning
models on Kubernetes using an open-source tool called Kubeflow [1] - an
end-to-end ML stack orchestration toolkit. We create end-to-end machine
learning models on Kubeflow in the form of pipelines and analyze various
aspects including the ease of setup, deployment models, performance,
limitations, and features of the tool. We hope that our project serves as a
seminar-style introductory report that can help vanilla cloud/Kubernetes users
with no prior knowledge of Kubeflow use it to deploy ML models. From setup on
different clouds to serving our trained model over the internet, we give
details and metrics on the performance of Kubeflow.
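The serving step described in the abstract - exposing a trained model over the internet from a Kubernetes cluster - is typically done in Kubeflow via its model-serving component, KServe (formerly KFServing). A minimal sketch of such a manifest is shown below; the resource name and storage URI are hypothetical placeholders, not values from the paper:

```yaml
# Hedged sketch: a minimal KServe InferenceService that serves a
# scikit-learn model from object storage. Names and URIs are illustrative.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-demo              # hypothetical service name
spec:
  predictor:
    sklearn:
      storageUri: "gs://example-bucket/model"   # hypothetical model location
```

Applying such a manifest (e.g. with `kubectl apply -f`) asks the cluster to pull the model artifact, wrap it in a serving runtime, and expose a prediction endpoint - the general pattern the paper benchmarks across cloud providers.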
Related papers
- SeBS-Flow: Benchmarking Serverless Cloud Function Workflows [51.4200085836966]
We propose the first serverless workflow benchmarking suite SeBS-Flow.
SeBS-Flow includes six real-world application benchmarks and four microbenchmarks representing different computational patterns.
We conduct comprehensive evaluations on three major cloud platforms, assessing performance, cost, scalability, and runtime deviations.
arXiv Detail & Related papers (2024-10-04T14:52:18Z)
- VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models [89.63342806812413]
We present an open-source toolkit for evaluating large multi-modality models based on PyTorch.
VLMEvalKit implements over 70 different large multi-modality models, including both proprietary APIs and open-source models.
We host OpenVLM Leaderboard to track the progress of multi-modality learning research.
arXiv Detail & Related papers (2024-07-16T13:06:15Z)
- Chain of Tools: Large Language Model is an Automatic Multi-tool Learner [54.992464510992605]
Automatic Tool Chain (ATC) is a framework that enables the large language models (LLMs) to act as a multi-tool user.
To scale up the scope of the tools, we next propose a black-box probing method.
For a comprehensive evaluation, we build a challenging benchmark named ToolFlow.
arXiv Detail & Related papers (2024-05-26T11:40:58Z)
- Couler: Unified Machine Learning Workflow Optimization in Cloud [6.769259207650922]
Couler is a system designed for unified ML workflow optimization in the cloud.
We integrate Large Language Models (LLMs) into workflow generation, and provide a unified programming interface for various workflow engines.
Couler has improved CPU/memory utilization by more than 15% and the workflow completion rate by around 17%.
arXiv Detail & Related papers (2024-03-12T12:47:32Z)
- ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models [51.35570730554632]
ESPnet-SPK is a toolkit for training speaker embedding extractors.
We provide several models, ranging from x-vector to recent SKA-TDNN.
We also aspire to bridge developed models with other domains.
arXiv Detail & Related papers (2024-01-30T18:18:27Z)
- In Situ Framework for Coupling Simulation and Machine Learning with Application to CFD [51.04126395480625]
Recent years have seen many successful applications of machine learning (ML) to facilitate fluid dynamic computations.
As simulations grow, generating new training datasets for traditional offline learning creates I/O and storage bottlenecks.
This work offers a solution by simplifying this coupling and enabling in situ training and inference on heterogeneous clusters.
arXiv Detail & Related papers (2023-06-22T14:07:54Z)
- MLOps: A Step Forward to Enterprise Machine Learning [0.0]
This research presents a detailed review of MLOps, its benefits, difficulties, evolutions, and important underlying technologies.
The MLOps workflow is explained in detail along with the various tools necessary for both model and data exploration and deployment.
This article also puts light on the end-to-end production of ML projects using various maturity levels of automated pipelines.
arXiv Detail & Related papers (2023-05-27T20:44:14Z)
- Predicting Resource Consumption of Kubernetes Container Systems using Resource Models [3.138731415322007]
This paper considers how to derive resource models for cloud systems empirically.
We do so based on models of deployed services in a formal language with explicit adherence to CPU and memory resources.
We report on leveraging data collected empirically from small deployments to simulate the execution of higher intensity scenarios on larger deployments.
arXiv Detail & Related papers (2023-05-12T17:59:01Z)
- Interpretations Steered Network Pruning via Amortized Inferred Saliency Maps [85.49020931411825]
Convolutional Neural Networks (CNNs) compression is crucial to deploying these models in edge devices with limited resources.
We propose to address the channel pruning problem from a novel perspective by leveraging the interpretations of a model to steer the pruning process.
We tackle this challenge by introducing a selector model that predicts real-time smooth saliency masks for pruned models.
arXiv Detail & Related papers (2022-09-07T01:12:11Z)
- MLModelCI: An Automatic Cloud Platform for Efficient MLaaS [15.029094196394862]
We release the platform as an open-source project on GitHub under Apache 2.0 license.
Our system bridges the gap between current ML training and serving systems, freeing developers from the manual and tedious work often associated with service deployment.
arXiv Detail & Related papers (2020-06-09T07:48:20Z)
- ContainerStress: Autonomous Cloud-Node Scoping Framework for Big-Data ML Use Cases [0.2752817022620644]
OracleLabs has developed an automated framework that uses nested-loop Monte Carlo simulation to autonomously scale customer ML use cases of any size.
OracleLabs and NVIDIA authors have collaborated on a ML benchmark study which analyzes the compute cost and GPU acceleration of any ML prognostic algorithm.
arXiv Detail & Related papers (2020-03-18T01:51:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.