Towards MLOps: A DevOps Tools Recommender System for Machine Learning System
- URL: http://arxiv.org/abs/2402.12867v1
- Date: Tue, 20 Feb 2024 09:57:49 GMT
- Title: Towards MLOps: A DevOps Tools Recommender System for Machine Learning System
- Authors: Pir Sami Ullah Shah, Naveed Ahmad, Mirza Omer Beg
- Abstract summary: Machine learning systems evolve with new data, unlike traditional systems, which evolve with requirements.
In this paper, we present a framework for a recommender system that processes the contextual information of a machine learning project.
Four different approaches, i.e., rule-based, random forest, decision trees, and k-nearest neighbors, were investigated.
- Score: 1.065497990128313
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Applying DevOps practices to machine learning systems is termed MLOps.
Machine learning systems evolve with new data, unlike traditional systems, which
evolve with requirements. The objective of MLOps is to connect different
open-source tools into a pipeline that automatically constructs a dataset,
trains the machine learning model, deploys the model to production, and stores
different versions of the model and dataset. The benefit of MLOps is the fast
delivery of newly trained models to production so that results stay accurate.
Furthermore, MLOps practice affects the overall quality of software products
and depends entirely on open-source tools. Selecting relevant open-source tools
is considered a challenge, and a generalized method for selecting an
appropriate open-source tool is desirable. In this paper, we present a
framework for a recommender system that processes the contextual information
(e.g., nature of data, type of data) of a machine learning project and
recommends a relevant toolchain (tech stack) for the operationalization of
machine learning systems. To check the applicability of the proposed framework,
four approaches, i.e., rule-based, random forest, decision trees, and k-nearest
neighbors, were investigated, with precision, recall, and F-score measured.
Random forest outperformed the other approaches, with the highest F-score of
0.66.
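The rule-based approach and the evaluation metrics mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the context fields (`data_nature`, `data_type`), the rules, and the tool names attached to each rule are hypothetical placeholders chosen for illustration, and the "relevant" toolchain in the usage example is invented. The sketch only shows how a recommended toolchain could be scored against a reference toolchain with set-based precision, recall, and F-score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectContext:
    """Contextual information about an ML project (hypothetical fields)."""
    data_nature: str  # assumed values: "batch" or "streaming"
    data_type: str    # assumed values: "tabular", "image", "text"


# Hypothetical rules: each pairs a condition on the context with the tools
# it adds to the recommendation. These mappings are illustrative only and
# are not taken from the paper.
RULES = [
    (lambda c: c.data_nature == "streaming", {"Apache Kafka"}),
    (lambda c: c.data_nature == "batch", {"Apache Airflow"}),
    (lambda c: c.data_type == "image", {"DVC"}),
    (lambda c: True, {"MLflow"}),  # applies to every project
]


def recommend(context: ProjectContext) -> set[str]:
    """Return the union of tools from every rule whose condition holds."""
    tools: set[str] = set()
    for condition, rule_tools in RULES:
        if condition(context):
            tools |= rule_tools
    return tools


def precision_recall_f1(recommended: set[str], relevant: set[str]):
    """Set-based precision, recall, and F-score for one recommendation."""
    true_positives = len(recommended & relevant)
    precision = true_positives / len(recommended) if recommended else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    context = ProjectContext(data_nature="streaming", data_type="image")
    recommended = recommend(context)
    # Invented reference toolchain, used only to exercise the metrics.
    relevant = {"Apache Kafka", "MLflow", "Kubeflow"}
    print(sorted(recommended))
    print(precision_recall_f1(recommended, relevant))
```

The learned approaches named in the abstract (random forest, decision trees, k-nearest neighbors) would replace `recommend` with a trained classifier while the scoring function stays the same, which is what makes the four approaches comparable on one metric.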
Related papers
- Extraction of Research Objectives, Machine Learning Model Names, and Dataset Names from Academic Papers and Analysis of Their Interrelationships Using LLM and Network Analysis [0.0]
This study proposes a methodology extracting tasks, machine learning methods, and dataset names from scientific papers.
The proposed method's expression extraction performance, when using Llama3, achieves an F-score exceeding 0.8 across various categories.
Benchmarking results on financial domain papers have demonstrated the effectiveness of this method.
arXiv Detail & Related papers (2024-08-22T03:10:52Z)
- Automating the Training and Deployment of Models in MLOps by Integrating Systems with Machine Learning [5.565764053895849]
The article introduces the importance of machine learning in real-world applications and explores the rise of MLOps (Machine Learning Operations).
By reviewing the evolution of MLOps and its relationship to traditional software development methods, the paper proposes ways to integrate the system into machine learning to solve the problems faced by existing MLOps and improve productivity.
arXiv Detail & Related papers (2024-05-16T05:36:28Z)
- AutoML-GPT: Large Language Model for AutoML [5.9145212342776805]
We have established a framework called AutoML-GPT that integrates a comprehensive set of tools and libraries.
Through a conversational interface, users can specify their requirements, constraints, and evaluation metrics.
We have demonstrated that AutoML-GPT significantly reduces the time and effort required for machine learning tasks.
arXiv Detail & Related papers (2023-09-03T09:39:49Z)
- TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series [61.436361263605114]
Time series data are often scarce or highly sensitive, which precludes the sharing of data between researchers and industrial organizations.
We introduce Time Series Generative Modeling (TSGM), an open-source framework for the generative modeling of synthetic time series.
arXiv Detail & Related papers (2023-05-19T10:11:21Z)
- Benchmarking Automated Machine Learning Methods for Price Forecasting Applications [58.720142291102135]
We show the possibility of substituting manually created ML pipelines with automated machine learning (AutoML) solutions.
Based on the CRISP-DM process, we split the manual ML pipeline into a machine learning and non-machine learning part.
We show in a case study for the industrial use case of price forecasting, that domain knowledge combined with AutoML can weaken the dependence on ML experts.
arXiv Detail & Related papers (2023-04-28T10:27:38Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines.
This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- Enabling Un-/Semi-Supervised Machine Learning for MDSE of the Real-World CPS/IoT Applications [0.5156484100374059]
We propose a novel approach to support domain-specific Model-Driven Software Engineering (MDSE) for real-world use-case scenarios of smart Cyber-Physical Systems (CPS) and the Internet of Things (IoT).
We argue that the majority of the data available in the wild for Artificial Intelligence (AI) is unlabeled. Hence, unsupervised and/or semi-supervised ML approaches are the practical choices.
Our proposed approach is fully implemented and integrated with an existing state-of-the-art MDSE tool to serve the CPS/IoT domain.
arXiv Detail & Related papers (2021-07-06T15:51:39Z)
- Automated Machine Learning Techniques for Data Streams [91.3755431537592]
This paper surveys the state-of-the-art open-source AutoML tools, applies them to data collected from streams, and measures how their performance changes over time.
The results show that off-the-shelf AutoML tools can provide satisfactory results but in the presence of concept drift, detection or adaptation techniques have to be applied to maintain the predictive accuracy over time.
arXiv Detail & Related papers (2021-06-14T11:42:46Z)
- A Survey on Large-scale Machine Learning [67.6997613600942]
Machine learning can provide deep insights into data, allowing machines to make high-quality predictions.
Most sophisticated machine learning approaches suffer from huge time costs when operating on large-scale data.
Large-scale Machine Learning aims to learn patterns from big data with comparable performance efficiently.
arXiv Detail & Related papers (2020-08-10T06:07:52Z)
- Towards CRISP-ML(Q): A Machine Learning Process Model with Quality Assurance Methodology [53.063411515511056]
We propose a process model for the development of machine learning applications.
The first phase combines business and data understanding as data availability oftentimes affects the feasibility of the project.
The sixth phase covers state-of-the-art approaches for monitoring and maintenance of machine learning applications.
arXiv Detail & Related papers (2020-03-11T08:25:49Z)