Reusable MLOps: Reusable Deployment, Reusable Infrastructure and
Hot-Swappable Machine Learning models and services
- URL: http://arxiv.org/abs/2403.00787v1
- Date: Mon, 19 Feb 2024 23:40:46 GMT
- Title: Reusable MLOps: Reusable Deployment, Reusable Infrastructure and
Hot-Swappable Machine Learning models and services
- Authors: D Panchal, P Verma, I Baran, D Musgrove, D Lu
- Abstract summary: We introduce a new sustainable concept in the field of AI/ML operations - called Reusable MLOps.
We reuse the existing deployment and infrastructure to serve new models by hot-swapping them without tearing down the infrastructure or the microservice.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although Machine Learning model building has become increasingly accessible
due to a plethora of tools, libraries and algorithms being available freely,
easy operationalization of these models is still a problem. It requires
considerable expertise in data engineering, software development, cloud and
DevOps. It also requires planning, agreement, and vision of how the model is
going to be used by the business applications once it is in production, how it
is going to be continuously trained on fresh incoming data, and how and when a
newer model would replace an existing model. This leads to developers and data
scientists working in silos and making suboptimal decisions. It also leads to
wasted time and effort. We introduce the Acumos AI platform that we developed
and demonstrate some unique, novel capabilities of the Acumos model runner that
can help solve the above problems. We introduce a new
sustainable concept in the field of AI/ML operations - called Reusable MLOps -
where we reuse the existing deployment and infrastructure to serve new models
by hot-swapping them without tearing down the infrastructure or the
microservice, thus achieving reusable deployment and operations for AI/ML
models while still having continuously trained models in production.
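To make the hot-swapping idea concrete, below is a minimal, illustrative Python sketch of a model-serving microservice whose model object is replaced in place while the process, container, and REST endpoint stay up. This is not the Acumos model runner's actual interface; Flask, pickled model artifacts, and the /swap route are assumptions made purely for illustration.

```python
# Minimal, illustrative hot-swap model server (NOT the Acumos model runner API).
# Assumes Flask is installed and that model artifacts are pickled objects
# exposing .predict(); both are assumptions for illustration only.
import pickle
import threading

from flask import Flask, jsonify, request

app = Flask(__name__)
_lock = threading.Lock()
_model = None  # the currently served model


def _load(path):
    """Deserialize a model artifact from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


@app.route("/predict", methods=["POST"])
def predict():
    """Serve predictions with whatever model is currently loaded."""
    features = request.get_json()["features"]
    with _lock:
        model = _model
    return jsonify({"prediction": model.predict([features]).tolist()})


@app.route("/swap", methods=["POST"])
def swap():
    """Hot-swap the served model: load the new artifact, then atomically
    replace the reference. The service and its infrastructure stay up."""
    global _model
    new_model = _load(request.get_json()["model_path"])
    with _lock:
        _model = new_model
    return jsonify({"status": "swapped"})


if __name__ == "__main__":
    _model = _load("model_v1.pkl")  # initial model (hypothetical path)
    app.run(host="0.0.0.0", port=8080)
```

In this toy setup, POSTing {"model_path": "model_v2.pkl"} to /swap would switch the model served by /predict between two requests, without redeploying the container or restarting the microservice.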
Related papers
- Towards Synthetic Trace Generation of Modeling Operations using In-Context Learning Approach [1.8874331450711404]
We propose a conceptual framework that combines modeling event logs, intelligent modeling assistants, and the generation of modeling operations.
In particular, the architecture comprises modeling components that help the designer specify the system, record its operation within a graphical modeling environment, and automatically recommend relevant operations.
arXiv Detail & Related papers (2024-08-26T13:26:44Z) - Model Callers for Transforming Predictive and Generative AI Applications [2.7195102129095003]
We introduce a novel software abstraction termed "model caller".
Model callers act as an intermediary for AI and ML model calling.
We have released a prototype Python library for model callers, accessible for installation via pip or for download from GitHub.
arXiv Detail & Related papers (2024-04-17T12:21:06Z) - Model-as-a-Service (MaaS): A Survey [42.70857461774014]
Foundation models are a form of generative artificial intelligence (GenAI).
Model-as-a-Service (MaaS) has emerged as a groundbreaking paradigm that revolutionizes the deployment and utilization of GenAI models.
arXiv Detail & Related papers (2023-11-10T00:35:00Z) - Model Share AI: An Integrated Toolkit for Collaborative Machine Learning
Model Development, Provenance Tracking, and Deployment in Python [0.0]
We introduce Model Share AI (AIMS), an easy-to-use MLOps platform designed to streamline collaborative model development, model provenance tracking, and model deployment.
AIMS features collaborative project spaces and a standardized model evaluation process that ranks model submissions based on their performance on unseen evaluation data.
AIMS allows users to deploy ML models built in Scikit-Learn, Keras, PyTorch, and ONNX into live REST APIs and automatically generated web apps.
arXiv Detail & Related papers (2023-09-27T15:24:39Z) - Predicting Resource Consumption of Kubernetes Container Systems using
Resource Models [3.138731415322007]
This paper considers how to derive resource models for cloud systems empirically.
We do so based on models of deployed services in a formal language with explicit adherence to CPU and memory resources.
We report on leveraging data collected empirically from small deployments to simulate the execution of higher intensity scenarios on larger deployments.
arXiv Detail & Related papers (2023-05-12T17:59:01Z) - OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge
Collaborative AutoML System [85.8338446357469]
We introduce OmniForce, a human-centered AutoML system that yields both human-assisted ML and ML-assisted human techniques.
We show how OmniForce can put an AutoML system into practice and build adaptive AI in open-environment scenarios.
arXiv Detail & Related papers (2023-03-01T13:35:22Z) - Model Ratatouille: Recycling Diverse Models for Out-of-Distribution
Generalization [99.6826401545377]
Foundation models are redefining how AI systems are built. Practitioners now follow a standard procedure to build their machine learning solutions.
We propose model ratatouille, a new strategy to recycle the multiple fine-tunings of the same foundation model on diverse auxiliary tasks.
arXiv Detail & Related papers (2022-12-20T17:21:46Z) - Enabling Automated Machine Learning for Model-Driven AI Engineering [60.09869520679979]
We propose a novel approach to enable Model-Driven Software Engineering and Model-Driven AI Engineering.
In particular, we support Automated ML, thus assisting software engineers without deep AI knowledge in developing AI-intensive systems.
arXiv Detail & Related papers (2022-03-06T10:12:56Z) - Model Reprogramming: Resource-Efficient Cross-Domain Machine Learning [65.268245109828]
In data-rich domains such as vision, language, and speech, deep learning prevails to deliver high-performance task-specific models.
Deep learning in resource-limited domains still faces multiple challenges including (i) limited data, (ii) constrained model development cost, and (iii) lack of adequate pre-trained models for effective finetuning.
Model reprogramming enables resource-efficient cross-domain machine learning by repurposing a well-developed pre-trained model from a source domain to solve tasks in a target domain without model finetuning.
arXiv Detail & Related papers (2022-02-22T02:33:54Z) - INTERN: A New Learning Paradigm Towards General Vision [117.3343347061931]
We develop a new learning paradigm named INTERN.
By learning with supervisory signals from multiple sources in multiple stages, the model being trained will develop strong generalizability.
In most cases, our models, adapted with only 10% of the training data in the target domain, outperform the counterparts trained with the full set of data.
arXiv Detail & Related papers (2021-11-16T18:42:50Z) - Model Reuse with Reduced Kernel Mean Embedding Specification [70.044322798187]
We present a two-phase framework for finding helpful models for a current application.
In the upload phase, when a model is uploaded into the pool, we construct a reduced kernel mean embedding (RKME) as a specification for the model.
Then, in the deployment phase, the relatedness of the current task and the pre-trained models will be measured based on the value of the RKME specification (an illustrative sketch of this two-phase idea appears after this list).
arXiv Detail & Related papers (2020-01-20T15:15:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.