Related papers: Real-world Machine Learning Systems: A survey from a Data-Oriented Architecture Perspective

Real-world Machine Learning Systems: A survey from a Data-Oriented Architecture Perspective

URL: http://arxiv.org/abs/2302.04810v2
Date: Mon, 9 Oct 2023 16:31:46 GMT
Title: Real-world Machine Learning Systems: A survey from a Data-Oriented Architecture Perspective
Authors: Christian Cabrera, Andrei Paleyes, Pierre Thodoroff, Neil D. Lawrence
Abstract summary: Data-oriented Architecture (DOA) is an emerging concept that equips systems better for integrating ML models. DOA extends current architectures to create data-driven, loosely coupled, decentralised, open systems. This paper answers these questions by surveying real-world deployments of ML-based systems.
Score: 7.574538335342942
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Machine Learning models are being deployed as parts of real-world systems with the upsurge of interest in artificial intelligence. The design, implementation, and maintenance of such systems are challenged by real-world environments that produce larger amounts of heterogeneous data and users requiring increasingly faster responses with efficient resource consumption. These requirements push prevalent software architectures to the limit when deploying ML-based systems. Data-oriented Architecture (DOA) is an emerging concept that equips systems better for integrating ML models. DOA extends current architectures to create data-driven, loosely coupled, decentralised, open systems. Even though papers on deployed ML-based systems do not mention DOA, their authors made design decisions that implicitly follow DOA. The reasons why, how, and the extent to which DOA is adopted in these systems are unclear. Implicit design decisions limit the practitioners' knowledge of DOA to design ML-based systems in the real world. This paper answers these questions by surveying real-world deployments of ML-based systems. The survey shows the design decisions of the systems and the requirements these satisfy. Based on the survey findings, we also formulate practical advice to facilitate the deployment of ML-based systems. Finally, we outline open challenges to deploying DOA-based systems that integrate ML models.

Related papers

A quantitative framework for evaluating architectural patterns in ML systems [49.1574468325115]
This study proposes a framework for quantitative assessment of architectural patterns in ML systems. We focus on scalability and performance metrics for cost-effective CPU-based inference.
arXiv Detail & Related papers (2025-01-20T15:30:09Z)
Towards Human-Guided, Data-Centric LLM Co-Pilots [53.35493881390917]
CliMB-DC is a human-guided, data-centric framework for machine learning co-pilots. It combines advanced data-centric tools with LLM-driven reasoning to enable robust, context-aware data processing. We show how CliMB-DC can transform uncurated datasets into ML-ready formats.
arXiv Detail & Related papers (2025-01-17T17:51:22Z)
A Large-Scale Study of Model Integration in ML-Enabled Software Systems [4.776073133338119]
Machine learning (ML) and its embedding in systems has drastically changed the engineering of software-intensive systems. Traditionally, software engineering focuses on manually created artifacts such as source code and the process of creating them. We present the first large-scale study of real ML-enabled software systems, covering over 2,928 open source systems on GitHub.
arXiv Detail & Related papers (2024-08-12T15:28:40Z)
Machine Learning-Enabled Software and System Architecture Frameworks [48.87872564630711]
The stakeholders with data science and Machine Learning related concerns, such as data scientists and data engineers, are yet to be included in existing architecture frameworks. We surveyed 61 subject matter experts from over 25 organizations in 10 countries.
arXiv Detail & Related papers (2023-08-09T21:54:34Z)
Understanding the Complexity and Its Impact on Testing in ML-Enabled Systems [8.630445165405606]
We study Rasa 3.0, an industrial dialogue system that has been widely adopted by various companies around the world. Our goal is to characterize the complexity of such a largescale ML-enabled system and to understand the impact of the complexity on testing. Our study reveals practical implications for software engineering for ML-enabled systems.
arXiv Detail & Related papers (2023-01-10T08:13:24Z)
Is a Modular Architecture Enough? [80.32451720642209]
We provide a thorough assessment of common modular architectures, through the lens of simple and known modular data distributions. We highlight the benefits of modularity and sparsity and reveal insights on the challenges faced while optimizing modular systems.
arXiv Detail & Related papers (2022-06-06T16:12:06Z)
Tiny Robot Learning: Challenges and Directions for Machine Learning in Resource-Constrained Robots [57.27442333662654]
Machine learning (ML) has become a pervasive tool across computing systems. Tiny robot learning is the deployment of ML on resource-constrained low-cost autonomous robots. Tiny robot learning is subject to challenges from size, weight, area, and power (SWAP) constraints. This paper gives a brief survey of the tiny robot learning space, elaborates on key challenges, and proposes promising opportunities for future work in ML system design.
arXiv Detail & Related papers (2022-05-11T19:36:15Z)
Retrieval-Enhanced Machine Learning [110.5237983180089]
We describe a generic retrieval-enhanced machine learning framework, which includes a number of existing models as special cases. REML challenges information retrieval conventions, presenting opportunities for novel advances in core areas, including optimization. REML research agenda lays a foundation for a new style of information access research and paves a path towards advancing machine learning and artificial intelligence.
arXiv Detail & Related papers (2022-05-02T21:42:45Z)
An Empirical Evaluation of Flow Based Programming in the Machine Learning Deployment Context [11.028123436097616]
Data Oriented Architecture (DOA) is an emerging approach that can support data scientists and software developers when addressing challenges. This paper proposes to consider Flow-Based Programming (FBP) as a paradigm for creating DOA applications. We empirically evaluate FBP in the context of ML deployment on four applications that represent typical data science projects.
arXiv Detail & Related papers (2022-04-27T09:08:48Z)
Characterizing and Detecting Mismatch in Machine-Learning-Enabled Systems [1.4695979686066065]
Development and deployment of machine learning systems remains a challenge. In this paper, we report our findings and their implications for improving end-to-end ML-enabled system development.
arXiv Detail & Related papers (2021-03-25T19:40:29Z)
A Survey of Machine Learning for Computer Architecture and Systems [18.620218353713476]
It has been a long time that computer architecture and systems are optimized to enable efficient execution of machine learning (ML) algorithms or models. Now, it is time to reconsider the relationship between ML and systems, and let ML transform the way that computer architecture and systems are designed.
arXiv Detail & Related papers (2021-02-16T04:09:57Z)
Technology Readiness Levels for Machine Learning Systems [107.56979560568232]
Development and deployment of machine learning systems can be executed easily with modern tools, but the process is typically rushed and means-to-an-end. We have developed a proven systems engineering approach for machine learning development and deployment. Our "Machine Learning Technology Readiness Levels" framework defines a principled process to ensure robust, reliable, and responsible systems.
arXiv Detail & Related papers (2021-01-11T15:54:48Z)
Technology Readiness Levels for AI & ML [79.22051549519989]
Development of machine learning systems can be executed easily with modern tools, but the process is typically rushed and means-to-an-end. Engineering systems follow well-defined processes and testing standards to streamline development for high-quality, reliable results. We propose a proven systems engineering approach for machine learning development and deployment.
arXiv Detail & Related papers (2020-06-21T17:14:34Z)

This list is automatically generated from the titles and abstracts of the papers in this site.