A Preliminary Investigation of MLOps Practices in GitHub
- URL: http://arxiv.org/abs/2209.11453v1
- Date: Fri, 23 Sep 2022 07:29:56 GMT
- Title: A Preliminary Investigation of MLOps Practices in GitHub
- Authors: Fabio Calefato, Filippo Lanubile, Luigi Quaranta
- Abstract summary: Machine learning applications have led to an increasing interest in MLOps.
We present an initial investigation of the MLOps practices implemented in a set of ML-enabled systems retrieved from GitHub.
- Score: 10.190501703364234
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Background. The rapid and growing popularity of machine learning (ML) applications has led to an increasing interest in MLOps, that is, the practice of continuous integration and deployment (CI/CD) of ML-enabled systems.
Aims. Since changes may affect not only the code but also the ML model parameters and the data themselves, the automation of traditional CI/CD needs to be extended to manage model retraining in production.
Method. In this paper, we present an initial investigation of the MLOps practices implemented in a set of ML-enabled systems retrieved from GitHub, focusing on GitHub Actions and CML, two solutions to automate the development workflow.
Results. Our preliminary results suggest that the adoption of MLOps workflows in open-source GitHub projects is currently rather limited.
Conclusions. Issues are also identified, which can guide future research work.
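To make the object of study concrete, the sketch below shows one way a repository could be checked for the kind of MLOps automation the paper examines: GitHub Actions workflow files under .github/workflows that invoke CML. This is an illustrative assumption, not the authors' mining script; the helper name find_cml_workflows and the CML markers are made up for the example, while the GitHub REST API endpoint (GET /repos/{owner}/{repo}/contents/{path}) is real.

    # Sketch: flag GitHub Actions workflows that appear to use CML.
    # Heuristic only; not the paper's actual data-collection pipeline.
    import requests

    def find_cml_workflows(owner, repo, token=None):
        headers = {"Accept": "application/vnd.github+json"}
        if token:
            headers["Authorization"] = f"Bearer {token}"
        url = f"https://api.github.com/repos/{owner}/{repo}/contents/.github/workflows"
        resp = requests.get(url, headers=headers)
        if resp.status_code == 404:  # repository defines no workflows at all
            return []
        resp.raise_for_status()
        hits = []
        for entry in resp.json():
            if not entry["name"].endswith((".yml", ".yaml")):
                continue
            body = requests.get(entry["download_url"], headers=headers).text
            # Markers: the official setup action or a cml CLI invocation.
            if "iterative/setup-cml" in body or "cml " in body:
                hits.append(entry["path"])
        return hits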
Related papers
- Position: A Call to Action for a Human-Centered AutoML Paradigm [83.78883610871867]
Automated machine learning (AutoML) was formed around the fundamental objectives of automatically and efficiently configuring machine learning (ML) workflows.
We argue that a key to unlocking AutoML's full potential lies in addressing the currently underexplored aspect of user interaction with AutoML systems.
arXiv Detail & Related papers (2024-06-05T15:05:24Z)
- Are you still on track!? Catching LLM Task Drift with Activations [55.75645403965326]
Task drift allows attackers to exfiltrate data or influence the LLM's output for other users.
We show that a simple linear classifier can detect drift with near-perfect ROC AUC on an out-of-distribution test set.
We observe that this approach generalizes surprisingly well to unseen task domains, such as prompt injections, jailbreaks, and malicious instructions.
arXiv Detail & Related papers (2024-06-02T16:53:21Z)
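The detector above amounts to a linear probe over LLM activations. Below is a minimal sketch of that idea, with random arrays standing in for real activations; how clean versus drifted activations are actually collected is the paper's contribution and is not reproduced here.

    # Linear probe separating "clean" from "drifted" activations.
    # X_clean / X_drift are synthetic stand-ins of shape (n, hidden_dim).
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X_clean = rng.normal(0.0, 1.0, size=(500, 1024))
    X_drift = rng.normal(0.3, 1.0, size=(500, 1024))

    X = np.vstack([X_clean, X_drift])
    y = np.concatenate([np.zeros(500), np.ones(500)])
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print("ROC AUC:", roc_auc_score(y_te, probe.predict_proba(X_te)[:, 1]))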
- On the effectiveness of Large Language Models for GitHub Workflows [9.82254417875841]
Large Language Models (LLMs) have demonstrated their effectiveness in various software development tasks.
We perform the first comprehensive study to understand the effectiveness of LLMs on five workflow-related tasks with different levels of prompts.
Our evaluation of three state-of-the-art LLMs and their fine-tuned variants revealed various interesting findings on the current effectiveness and drawbacks of LLMs.
arXiv Detail & Related papers (2024-03-19T05:14:12Z)
- YAMLE: Yet Another Machine Learning Environment [4.985768723667417]
YAMLE is an open-source framework that facilitates rapid prototyping and experimentation with machine learning (ML) models and methods.
YAMLE includes a command-line interface and integrations with popular and well-maintained PyTorch-based libraries.
The ambition for YAMLE is to grow into a shared ecosystem where researchers and practitioners can quickly build on and compare existing implementations.
arXiv Detail & Related papers (2024-02-09T09:34:36Z)
- ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code [76.84199699772903]
ML-Bench is a benchmark rooted in real-world programming applications that leverage existing code repositories to perform tasks.
To evaluate both Large Language Models (LLMs) and AI agents, two setups are employed: ML-LLM-Bench for assessing LLMs' text-to-code conversion within a predefined deployment environment, and ML-Agent-Bench for testing autonomous agents in end-to-end task execution within a Linux sandbox environment.
arXiv Detail & Related papers (2023-11-16T12:03:21Z)
- Octavius: Mitigating Task Interference in MLLMs via LoRA-MoE [85.76186554492543]
Large Language Models (LLMs) can extend their zero-shot capabilities to multimodal learning through instruction tuning.
As more modalities and downstream tasks are introduced, negative conflicts and interference may have a worse impact on performance.
We propose a novel framework, called Octavius, for comprehensive studies and experimentation on multimodal learning with MLLMs.
arXiv Detail & Related papers (2023-11-05T15:48:29Z)
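As the title suggests, the LoRA-MoE design pairs low-rank (LoRA) adapters with a mixture-of-experts gate so that different tasks or modalities can route to different adapters. The PyTorch fragment below is a generic sketch of that combination under assumed dimensions and a soft gate; it is not the Octavius implementation.

    # Generic LoRA-MoE layer: a frozen base linear layer plus several
    # low-rank experts, mixed by a learned soft gate. Illustrative only.
    import torch
    import torch.nn as nn

    class LoRAMoELinear(nn.Module):
        def __init__(self, d_in, d_out, n_experts=4, rank=8):
            super().__init__()
            self.base = nn.Linear(d_in, d_out)
            self.base.weight.requires_grad_(False)  # base weights stay frozen
            self.base.bias.requires_grad_(False)
            self.A = nn.Parameter(torch.randn(n_experts, d_in, rank) * 0.01)
            self.B = nn.Parameter(torch.zeros(n_experts, rank, d_out))
            self.gate = nn.Linear(d_in, n_experts)  # soft routing over experts

        def forward(self, x):  # x: (batch, d_in)
            w = torch.softmax(self.gate(x), dim=-1)            # (batch, E)
            delta = torch.einsum("bi,eir,ero->beo", x, self.A, self.B)
            return self.base(x) + (w.unsqueeze(-1) * delta).sum(dim=1)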
- MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation [96.71370747681078]
We introduce MLAgentBench, a suite of 13 tasks ranging from improving model performance on CIFAR-10 to recent research problems like BabyLM.
For each task, an agent can perform actions like reading/writing files, executing code, and inspecting outputs.
We benchmark agents based on Claude v1.0, Claude v2.1, Claude v3 Opus, GPT-4, GPT-4-turbo, Gemini-Pro, and Mixtral and find that a Claude v3 Opus agent is the best in terms of success rate.
arXiv Detail & Related papers (2023-10-05T04:06:12Z)
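The action space described above (reading/writing files, executing code, inspecting outputs) can be pictured as a small dispatch loop. The sketch below is a schematic stand-in; the action names and helper are placeholders, not MLAgentBench's actual interface.

    # Schematic environment for the kinds of agent actions listed above.
    import subprocess
    from pathlib import Path

    def execute_action(action, arg):
        if action == "read_file":
            return Path(arg).read_text()
        if action == "write_file":
            path, content = arg
            Path(path).write_text(content)
            return f"wrote {path}"
        if action == "run_script":
            result = subprocess.run(["python", arg], capture_output=True, text=True)
            return result.stdout + result.stderr  # let the agent inspect outputs
        raise ValueError(f"unknown action: {action}")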
- MLOps: A Step Forward to Enterprise Machine Learning [0.0]
This research presents a detailed review of MLOps, its benefits, difficulties, evolutions, and important underlying technologies.
The MLOps workflow is explained in detail along with the various tools necessary for both model and data exploration and deployment.
This article also sheds light on the end-to-end production of ML projects using various maturity levels of automated pipelines.
arXiv Detail & Related papers (2023-05-27T20:44:14Z)
- OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge Collaborative AutoML System [85.8338446357469]
We introduce OmniForce, a human-centered AutoML system that yields both human-assisted ML and ML-assisted human techniques.
We show how OmniForce can put an AutoML system into practice and build adaptive AI in open-environment scenarios.
arXiv Detail & Related papers (2023-03-01T13:35:22Z)
- Enabling Un-/Semi-Supervised Machine Learning for MDSE of the Real-World CPS/IoT Applications [0.5156484100374059]
We propose a novel approach to support domain-specific Model-Driven Software Engineering (MDSE) for the real-world use-case scenarios of smart Cyber-Physical Systems (CPS) and the Internet of Things (IoT).
We argue that the majority of the data available in nature for Artificial Intelligence (AI) is unlabeled; hence, unsupervised and/or semi-supervised ML approaches are the practical choices.
Our proposed approach is fully implemented and integrated with an existing state-of-the-art MDSE tool to serve the CPS/IoT domain.
arXiv Detail & Related papers (2021-07-06T15:51:39Z)
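As a concrete instance of the semi-supervised setting argued for above, scikit-learn's SelfTrainingClassifier fits on a mix of labeled and unlabeled samples, with unlabeled ones marked -1. The dataset below is a synthetic placeholder, not the CPS/IoT data from the paper.

    # Semi-supervised learning when most labels are missing.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.semi_supervised import SelfTrainingClassifier

    X, y = make_classification(n_samples=1000, random_state=0)
    y_partial = y.copy()
    hide = np.random.default_rng(0).random(1000) < 0.9  # hide 90% of the labels
    y_partial[hide] = -1

    model = SelfTrainingClassifier(LogisticRegression(max_iter=1000))
    model.fit(X, y_partial)
    print("accuracy against true labels:", model.score(X, y))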
- MLModelCI: An Automatic Cloud Platform for Efficient MLaaS [15.029094196394862]
We release the platform as an open-source project on GitHub under the Apache 2.0 license.
Our system bridges the gap between current ML training and serving systems and thus frees developers from the manual and tedious work often associated with service deployment.
arXiv Detail & Related papers (2020-06-09T07:48:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.