Reasonable Scale Machine Learning with Open-Source Metaflow
- URL: http://arxiv.org/abs/2303.11761v1
- Date: Tue, 21 Mar 2023 11:28:09 GMT
- Title: Reasonable Scale Machine Learning with Open-Source Metaflow
- Authors: Jacopo Tagliabue, Hugo Bowne-Anderson, Ville Tuulos, Savin Goyal,
Romain Cledat, David Berg
- Abstract summary: We argue that re-purposing existing tools won't solve the current productivity issues.
We introduce Metaflow, an open-source framework for ML projects explicitly designed to boost the productivity of data practitioners.
- Score: 2.637746074346334
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As Machine Learning (ML) gains adoption across industries and new use cases,
practitioners increasingly realize the challenges around effectively developing
and iterating on ML systems: reproducibility, debugging, scalability, and
documentation are elusive goals for real-world pipelines outside tech-first
companies. In this paper, we review the nature of ML-oriented workloads and
argue that re-purposing existing tools won't solve the current productivity
issues, as ML peculiarities warrant specialized development tooling. We then
introduce Metaflow, an open-source framework for ML projects explicitly
designed to boost the productivity of data practitioners by abstracting away
the execution of ML code from the definition of the business logic. We show how
our design addresses the main challenges in ML operations (MLOps), and document
through examples, interviews and use cases its practical impact on the field.
Related papers
- Large Language Models for Constructing and Optimizing Machine Learning Workflows: A Survey [3.340984908213717]
Building effective machine learning (ML) to address complex tasks is a primary focus of the Automatic ML (AutoML) community.
Recently, the integration of Large Language Models (LLMs) into ML has shown great potential for automating and enhancing various stages of the ML pipeline.
arXiv Detail & Related papers (2024-11-11T21:54:26Z)
- Chain of Tools: Large Language Model is an Automatic Multi-tool Learner [54.992464510992605]
Automatic Tool Chain (ATC) is a framework that enables large language models (LLMs) to act as multi-tool users.
To scale up the scope of the tools, we next propose a black-box probing method.
For a comprehensive evaluation, we build a challenging benchmark named ToolFlow.
arXiv Detail & Related papers (2024-05-26T11:40:58Z)
- From Summary to Action: Enhancing Large Language Models for Complex Tasks with Open World APIs [62.496139001509114]
We introduce a novel tool invocation pipeline designed to control massive real-world APIs.
This pipeline mirrors the human task-solving process, addressing complicated real-life user queries.
Empirical evaluations of our Sum2Act pipeline on the ToolBench benchmark show significant performance improvements.
arXiv Detail & Related papers (2024-02-28T08:42:23Z)
- Towards an MLOps Architecture for XAI in Industrial Applications [2.0457031151514977]
Machine learning (ML) has become a popular tool in the industrial sector as it helps to improve operations, increase efficiency, and reduce costs.
One of the remaining Machine Learning Operations (MLOps) challenges is the need for explanations.
We developed a novel MLOps software architecture to address the challenge of integrating explanations and feedback capabilities into the ML development and deployment processes.
arXiv Detail & Related papers (2023-09-22T09:56:25Z)
- CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models [74.22729793816451]
Large Language Models (LLMs) have made significant progress in utilizing tools, but their ability is limited by API availability.
We propose CREATOR, a novel framework that enables LLMs to create their own tools using documentation and code realization.
We evaluate CREATOR on MATH and TabMWP benchmarks, respectively consisting of challenging math competition problems.
arXiv Detail & Related papers (2023-05-23T17:51:52Z)
- MLCopilot: Unleashing the Power of Large Language Models in Solving Machine Learning Tasks [31.733088105662876]
We aim to bridge the gap between machine intelligence and human knowledge by introducing a novel framework.
We showcase the possibility of extending the capability of LLMs to comprehend structured inputs and perform thorough reasoning for solving novel ML tasks.
arXiv Detail & Related papers (2023-04-28T17:03:57Z)
- Benchmarking Automated Machine Learning Methods for Price Forecasting Applications [58.720142291102135]
We show the possibility of substituting manually created ML pipelines with automated machine learning (AutoML) solutions.
Based on the CRISP-DM process, we split the manual ML pipeline into a machine learning and non-machine learning part.
We show, in a case study on the industrial use case of price forecasting, that domain knowledge combined with AutoML can weaken the dependence on ML experts.
arXiv Detail & Related papers (2023-04-28T10:27:38Z)
- Machine Learning Operations (MLOps): Overview, Definition, and Architecture [0.0]
The paradigm of Machine Learning Operations (MLOps) addresses this issue.
MLOps is still a vague term and its consequences for researchers and professionals are ambiguous.
We provide an aggregated overview of the necessary components, and roles, as well as the associated architecture and principles.
arXiv Detail & Related papers (2022-05-04T19:38:48Z)
- Exploring the potential of flow-based programming for machine learning deployment in comparison with service-oriented architectures [8.677012233188968]
We argue that part of the reason is infrastructure that was not designed for activities around data collection and analysis.
We propose to consider flow-based programming with data streams as an alternative to commonly used service-oriented architectures for building software applications.
arXiv Detail & Related papers (2021-08-09T15:06:02Z)
- Technology Readiness Levels for Machine Learning Systems [107.56979560568232]
Development and deployment of machine learning systems can be executed easily with modern tools, but the process is typically rushed and means-to-an-end.
We have developed a proven systems engineering approach for machine learning development and deployment.
Our "Machine Learning Technology Readiness Levels" framework defines a principled process to ensure robust, reliable, and responsible systems.
arXiv Detail & Related papers (2021-01-11T15:54:48Z)
- Technology Readiness Levels for AI & ML [79.22051549519989]
Development of machine learning systems can be executed easily with modern tools, but the process is typically rushed and means-to-an-end.
Engineering systems follow well-defined processes and testing standards to streamline development for high-quality, reliable results.
We propose a proven systems engineering approach for machine learning development and deployment.
arXiv Detail & Related papers (2020-06-21T17:14:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.