Related papers: Reasonable Scale Machine Learning with Open-Source Metaflow

URL: http://arxiv.org/abs/2303.11761v1
Date: Tue, 21 Mar 2023 11:28:09 GMT
Title: Reasonable Scale Machine Learning with Open-Source Metaflow
Authors: Jacopo Tagliabue, Hugo Bowne-Anderson, Ville Tuulos, Savin Goyal, Romain Cledat, David Berg
Abstract summary: We argue that re-purposing existing tools won't solve the current productivity issues. We introduce Metaflow, an open-source framework for ML projects explicitly designed to boost the productivity of data practitioners.
Score: 2.637746074346334
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: As Machine Learning (ML) gains adoption across industries and new use cases, practitioners increasingly realize the challenges around effectively developing and iterating on ML systems: reproducibility, debugging, scalability, and documentation are elusive goals for real-world pipelines outside tech-first companies. In this paper, we review the nature of ML-oriented workloads and argue that re-purposing existing tools won't solve the current productivity issues, as ML peculiarities warrant specialized development tooling. We then introduce Metaflow, an open-source framework for ML projects explicitly designed to boost the productivity of data practitioners by abstracting away the execution of ML code from the definition of the business logic. We show how our design addresses the main challenges in ML operations (MLOps), and document through examples, interviews and use cases its practical impact on the field.

Related papers

Define-ML: An Approach to Ideate Machine Learning-Enabled Systems [1.3541839896498067]
Machine learning (ML) in software systems demands specialized ideation approaches. Traditional ideation methods like Lean Inception lack structured support for ML considerations. This paper presents Define-ML, a framework that extends Lean Inception with tailored activities.
arXiv Detail & Related papers (2025-06-25T17:11:26Z)
Large Language Models for Constructing and Optimizing Machine Learning Workflows: A Survey [3.340984908213717]
Building effective machine learning (ML) to address complex tasks is a primary focus of the Automatic ML (AutoML) community. Recently, the integration of Large Language Models (LLMs) into ML has shown great potential for automating and enhancing various stages of the ML pipeline.
arXiv Detail & Related papers (2024-11-11T21:54:26Z)
Chain of Tools: Large Language Model is an Automatic Multi-tool Learner [54.992464510992605]
Automatic Tool Chain (ATC) is a framework that enables the large language models (LLMs) to act as a multi-tool user. To scale up the scope of the tools, we next propose a black-box probing method. For a comprehensive evaluation, we build a challenging benchmark named ToolFlow.
arXiv Detail & Related papers (2024-05-26T11:40:58Z)
From Summary to Action: Enhancing Large Language Models for Complex Tasks with Open World APIs [62.496139001509114]
We introduce a novel tool invocation pipeline designed to control massive real-world APIs. This pipeline mirrors the human task-solving process, addressing complicated real-life user queries. Empirical evaluations of our Sum2Act pipeline on the ToolBench benchmark show significant performance improvements.
arXiv Detail & Related papers (2024-02-28T08:42:23Z)
Towards an MLOps Architecture for XAI in Industrial Applications [2.0457031151514977]
Machine learning (ML) has become a popular tool in the industrial sector as it helps to improve operations, increase efficiency, and reduce costs. One of the remaining Machine Learning Operations (MLOps) challenges is the need for explanations. We developed a novel MLOps software architecture to address the challenge of integrating explanations and feedback capabilities into the ML development and deployment processes.
arXiv Detail & Related papers (2023-09-22T09:56:25Z)
CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models [74.22729793816451]
Large Language Models (LLMs) have made significant progress in utilizing tools, but their ability is limited by API availability. We propose CREATOR, a novel framework that enables LLMs to create their own tools using documentation and code realization. We evaluate CREATOR on MATH and TabMWP benchmarks, respectively consisting of challenging math competition problems.
arXiv Detail & Related papers (2023-05-23T17:51:52Z)
MLCopilot: Unleashing the Power of Large Language Models in Solving Machine Learning Tasks [31.733088105662876]
We aim to bridge the gap between machine intelligence and human knowledge by introducing a novel framework. We showcase the possibility of extending the capability of LLMs to comprehend structured inputs and perform thorough reasoning for solving novel ML tasks.
arXiv Detail & Related papers (2023-04-28T17:03:57Z)
Benchmarking Automated Machine Learning Methods for Price Forecasting Applications [58.720142291102135]
We show the possibility of substituting manually created ML pipelines with automated machine learning (AutoML) solutions. Based on the CRISP-DM process, we split the manual ML pipeline into a machine learning and non-machine learning part. We show in a case study for the industrial use case of price forecasting, that domain knowledge combined with AutoML can weaken the dependence on ML experts.
arXiv Detail & Related papers (2023-04-28T10:27:38Z)
Machine Learning Operations (MLOps): Overview, Definition, and Architecture [0.0]
The paradigm of Machine Learning Operations (MLOps) addresses this issue. MLOps is still a vague term and its consequences for researchers and professionals are ambiguous. We provide an aggregated overview of the necessary components, and roles, as well as the associated architecture and principles.
arXiv Detail & Related papers (2022-05-04T19:38:48Z)
Exploring the potential of flow-based programming for machine learning deployment in comparison with service-oriented architectures [8.677012233188968]
We argue that part of the reason is infrastructure that was not designed for activities around data collection and analysis. We propose to consider flow-based programming with data streams as an alternative to commonly used service-oriented architectures for building software applications.
arXiv Detail & Related papers (2021-08-09T15:06:02Z)
Technology Readiness Levels for Machine Learning Systems [107.56979560568232]
Development and deployment of machine learning systems can be executed easily with modern tools, but the process is typically rushed and means-to-an-end. We have developed a proven systems engineering approach for machine learning development and deployment. Our "Machine Learning Technology Readiness Levels" framework defines a principled process to ensure robust, reliable, and responsible systems.
arXiv Detail & Related papers (2021-01-11T15:54:48Z)
Technology Readiness Levels for AI & ML [79.22051549519989]
Development of machine learning systems can be executed easily with modern tools, but the process is typically rushed and means-to-an-end. Engineering systems follow well-defined processes and testing standards to streamline development for high-quality, reliable results. We propose a proven systems engineering approach for machine learning development and deployment.
arXiv Detail & Related papers (2020-06-21T17:14:34Z)

This list is automatically generated from the titles and abstracts of the papers on this site.