Related papers: Towards MLOps: A DevOps Tools Recommender System for Machine Learning System

URL: http://arxiv.org/abs/2402.12867v1
Date: Tue, 20 Feb 2024 09:57:49 GMT
Title: Towards MLOps: A DevOps Tools Recommender System for Machine Learning System
Authors: Pir Sami Ullah Shah, Naveed Ahmad, Mirza Omer Beg
Abstract summary: Unlike traditional systems, which evolve with requirements, machine learning systems evolve with new data. In this paper, we present a framework for a recommendation system that processes the contextual information. Four different approaches, i.e., rule-based, random forest, decision trees and k-nearest neighbors, were investigated.
Score: 1.065497990128313
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: Applying DevOps practices to machine learning systems is termed MLOps; unlike traditional systems, which evolve with changing requirements, machine learning systems evolve with new data. The objective of MLOps is to connect different open-source tools into a pipeline that automatically constructs a dataset, trains the machine learning model, deploys the model to production, and stores different versions of the model and dataset. The benefit of MLOps is the fast delivery of newly trained models to production so that results stay accurate. Furthermore, MLOps practice affects the overall quality of software products and depends entirely on open-source tools. Selecting relevant open-source tools is considered a challenge, so a generalized method for selecting appropriate tools is desirable. In this paper, we present a framework for a recommendation system that processes the contextual information (e.g., nature of data, type of data) of a machine learning project and recommends a relevant toolchain (tech stack) for the operationalization of machine learning systems. To check the applicability of the proposed framework, four approaches (rule-based, random forest, decision trees, and k-nearest neighbors) were investigated, and precision, recall, and F-score were measured. Random forest outclassed the other approaches, with the highest F-score of 0.66.
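The abstract describes a rule-based recommender that maps project context to a toolchain and is scored with precision, recall, and F-score. The following is a minimal sketch of that idea using only the Python standard library. The rules, the context keys (`data_nature`, `data_type`), and the tool assignments are hypothetical illustrations, not taken from the paper, and the paper's actual feature set and evaluation protocol may differ.

```python
from typing import Dict, Set, Tuple


def recommend(context: Dict[str, str]) -> Set[str]:
    """Return a toolchain for a project context (hypothetical rules)."""
    tools: Set[str] = {"MLflow"}  # assumed default: experiment tracking
    if context.get("data_nature") == "streaming":
        tools.add("Kafka")  # assumed rule: streaming data needs a broker
    if context.get("data_type") == "images":
        tools.add("DVC")  # assumed rule: large binary data needs versioning
    return tools


def precision_recall_f1(predicted: Set[str], actual: Set[str]) -> Tuple[float, float, float]:
    """Set-based precision, recall, and F1 for one recommendation."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical project context and ground-truth toolchain.
context = {"data_nature": "streaming", "data_type": "images"}
predicted = recommend(context)
actual = {"DVC", "MLflow", "Airflow"}
precision, recall, f1 = precision_recall_f1(predicted, actual)
print(sorted(predicted), round(precision, 2), round(recall, 2), round(f1, 2))
```

A learned model such as a random forest would replace `recommend` with a classifier trained on labeled project contexts, while the scoring function could stay the same.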

Related papers

Data Requirement Goal Modeling for Machine Learning Systems [0.8854624631197942]
This work proposes an approach to guide non-experts in identifying data requirements for Machine Learning systems. We first develop the Data Requirement Goal Model (DRGM) by surveying the white literature. We then validate the approach through two illustrative examples based on real-world projects.
arXiv Detail & Related papers (2025-04-10T11:30:25Z)
Automating the Training and Deployment of Models in MLOps by Integrating Systems with Machine Learning [5.565764053895849]
The article introduces the importance of machine learning in real-world applications and explores the rise of MLOps (Machine Learning Operations). By reviewing the evolution of MLOps and its relationship to traditional software development methods, the paper proposes ways to integrate the system into machine learning to solve the problems faced by existing MLOps and improve productivity.
arXiv Detail & Related papers (2024-05-16T05:36:28Z)
OAEI Machine Learning Dataset for Online Model Generation [0.6472397166280683]
Ontology and knowledge graph matching systems are evaluated annually by the Ontology Alignment Evaluation Initiative (OAEI). We introduce a dataset that contains training, validation, and test sets for most of the OAEI tracks.
arXiv Detail & Related papers (2024-04-29T09:33:53Z)
AutoML-GPT: Large Language Model for AutoML [5.9145212342776805]
We have established a framework called AutoML-GPT that integrates a comprehensive set of tools and libraries. Through a conversational interface, users can specify their requirements, constraints, and evaluation metrics. We have demonstrated that AutoML-GPT significantly reduces the time and effort required for machine learning tasks.
arXiv Detail & Related papers (2023-09-03T09:39:49Z)
TSGM: A Flexible Framework for Generative Modeling of Synthetic Time Series [61.436361263605114]
Time series data are often scarce or highly sensitive, which precludes the sharing of data between researchers and industrial organizations. We introduce Time Series Generative Modeling (TSGM), an open-source framework for the generative modeling of synthetic time series.
arXiv Detail & Related papers (2023-05-19T10:11:21Z)
Benchmarking Automated Machine Learning Methods for Price Forecasting Applications [58.720142291102135]
We show the possibility of substituting manually created ML pipelines with automated machine learning (AutoML) solutions. Based on the CRISP-DM process, we split the manual ML pipeline into a machine learning and non-machine learning part. We show in a case study for the industrial use case of price forecasting, that domain knowledge combined with AutoML can weaken the dependence on ML experts.
arXiv Detail & Related papers (2023-04-28T10:27:38Z)
SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines. This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
Enabling Un-/Semi-Supervised Machine Learning for MDSE of the Real-World CPS/IoT Applications [0.5156484100374059]
We propose a novel approach to support domain-specific Model-Driven Software Engineering (MDSE) for the real-world use-case scenarios of smart Cyber-Physical Systems (CPS) and the Internet of Things (IoT). We argue that the majority of the data available for Artificial Intelligence (AI) is unlabeled. Hence, unsupervised and/or semi-supervised ML approaches are the practical choices. Our proposed approach is fully implemented and integrated with an existing state-of-the-art MDSE tool to serve the CPS/IoT domain.
arXiv Detail & Related papers (2021-07-06T15:51:39Z)
Automated Machine Learning Techniques for Data Streams [91.3755431537592]
This paper surveys the state-of-the-art open-source AutoML tools, applies them to data collected from streams, and measures how their performance changes over time. The results show that off-the-shelf AutoML tools can provide satisfactory results but in the presence of concept drift, detection or adaptation techniques have to be applied to maintain the predictive accuracy over time.
arXiv Detail & Related papers (2021-06-14T11:42:46Z)
A Survey on Large-scale Machine Learning [67.6997613600942]
Machine learning can provide deep insights into data, allowing machines to make high-quality predictions. Most sophisticated machine learning approaches suffer from huge time costs when operating on large-scale data. Large-scale Machine Learning aims to learn patterns from big data with comparable performance efficiently.
arXiv Detail & Related papers (2020-08-10T06:07:52Z)
Towards CRISP-ML(Q): A Machine Learning Process Model with Quality Assurance Methodology [53.063411515511056]
We propose a process model for the development of machine learning applications. The first phase combines business and data understanding, as data availability oftentimes affects the feasibility of the project. The sixth phase covers state-of-the-art approaches for monitoring and maintenance of machine learning applications.
arXiv Detail & Related papers (2020-03-11T08:25:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.