Real-world Machine Learning Systems: A survey from a Data-Oriented
Architecture Perspective
- URL: http://arxiv.org/abs/2302.04810v2
- Date: Mon, 9 Oct 2023 16:31:46 GMT
- Title: Real-world Machine Learning Systems: A survey from a Data-Oriented
Architecture Perspective
- Authors: Christian Cabrera, Andrei Paleyes, Pierre Thodoroff, Neil D. Lawrence
- Abstract summary: Data-oriented Architecture (DOA) is an emerging concept that equips systems better for integrating ML models.
DOA extends current architectures to create data-driven, loosely coupled, decentralised, open systems.
This paper surveys real-world deployments of ML-based systems to establish why, how, and to what extent they adopt DOA.
- Score: 7.574538335342942
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine Learning models are being deployed as parts of real-world systems
with the upsurge of interest in artificial intelligence. The design,
implementation, and maintenance of such systems are challenged by real-world
environments that produce larger amounts of heterogeneous data and users
requiring increasingly faster responses with efficient resource consumption.
These requirements push prevalent software architectures to the limit when
deploying ML-based systems. Data-oriented Architecture (DOA) is an emerging
concept that equips systems better for integrating ML models. DOA extends
current architectures to create data-driven, loosely coupled, decentralised,
open systems. Even though papers on deployed ML-based systems do not mention
DOA, their authors made design decisions that implicitly follow DOA. The
reasons why, how, and the extent to which DOA is adopted in these systems are
unclear. Implicit design decisions limit the practitioners' knowledge of DOA to
design ML-based systems in the real world. This paper answers these questions
by surveying real-world deployments of ML-based systems. The survey shows the
design decisions of the systems and the requirements these satisfy. Based on
the survey findings, we also formulate practical advice to facilitate the
deployment of ML-based systems. Finally, we outline open challenges to
deploying DOA-based systems that integrate ML models.
Related papers
- A Large-Scale Study of Model Integration in ML-Enabled Software Systems [4.776073133338119]
Machine learning (ML) and its embedding in systems has drastically changed the engineering of software-intensive systems.
Traditionally, software engineering focuses on manually created artifacts such as source code and the process of creating them.
We present the first large-scale study of real ML-enabled software systems, covering over 2,928 open source systems on GitHub.
arXiv Detail & Related papers (2024-08-12T15:28:40Z)
- Machine Learning-Enabled Software and System Architecture Frameworks [48.87872564630711]
The stakeholders with data science and Machine Learning related concerns, such as data scientists and data engineers, are yet to be included in existing architecture frameworks.
We surveyed 61 subject matter experts from over 25 organizations in 10 countries.
arXiv Detail & Related papers (2023-08-09T21:54:34Z)
- Understanding the Complexity and Its Impact on Testing in ML-Enabled Systems [8.630445165405606]
We study Rasa 3.0, an industrial dialogue system that has been widely adopted by various companies around the world.
Our goal is to characterize the complexity of such a large-scale ML-enabled system and to understand the impact of the complexity on testing.
Our study reveals practical implications for software engineering for ML-enabled systems.
arXiv Detail & Related papers (2023-01-10T08:13:24Z)
- Is a Modular Architecture Enough? [80.32451720642209]
We provide a thorough assessment of common modular architectures, through the lens of simple and known modular data distributions.
We highlight the benefits of modularity and sparsity and reveal insights on the challenges faced while optimizing modular systems.
arXiv Detail & Related papers (2022-06-06T16:12:06Z)
- Tiny Robot Learning: Challenges and Directions for Machine Learning in Resource-Constrained Robots [57.27442333662654]
Machine learning (ML) has become a pervasive tool across computing systems.
Tiny robot learning is the deployment of ML on resource-constrained low-cost autonomous robots.
Tiny robot learning is subject to challenges from size, weight, area, and power (SWAP) constraints.
This paper gives a brief survey of the tiny robot learning space, elaborates on key challenges, and proposes promising opportunities for future work in ML system design.
arXiv Detail & Related papers (2022-05-11T19:36:15Z)
- Retrieval-Enhanced Machine Learning [110.5237983180089]
We describe a generic retrieval-enhanced machine learning framework, which includes a number of existing models as special cases.
REML challenges information retrieval conventions, presenting opportunities for novel advances in core areas, including optimization.
The REML research agenda lays a foundation for a new style of information access research and paves a path towards advancing machine learning and artificial intelligence.
arXiv Detail & Related papers (2022-05-02T21:42:45Z)
- An Empirical Evaluation of Flow Based Programming in the Machine Learning Deployment Context [11.028123436097616]
Data Oriented Architecture (DOA) is an emerging approach that can support data scientists and software developers when addressing challenges.
This paper proposes to consider Flow-Based Programming (FBP) as a paradigm for creating DOA applications.
We empirically evaluate FBP in the context of ML deployment on four applications that represent typical data science projects.
arXiv Detail & Related papers (2022-04-27T09:08:48Z)
- Characterizing and Detecting Mismatch in Machine-Learning-Enabled Systems [1.4695979686066065]
Development and deployment of machine learning systems remains a challenge.
In this paper, we report our findings and their implications for improving end-to-end ML-enabled system development.
arXiv Detail & Related papers (2021-03-25T19:40:29Z)
- A Survey of Machine Learning for Computer Architecture and Systems [18.620218353713476]
Computer architecture and systems have long been optimized to enable efficient execution of machine learning (ML) algorithms or models.
Now, it is time to reconsider the relationship between ML and systems, and let ML transform the way that computer architecture and systems are designed.
arXiv Detail & Related papers (2021-02-16T04:09:57Z)
- Technology Readiness Levels for Machine Learning Systems [107.56979560568232]
Development and deployment of machine learning systems can be executed easily with modern tools, but the process is typically rushed and means-to-an-end.
We have developed a proven systems engineering approach for machine learning development and deployment.
Our "Machine Learning Technology Readiness Levels" framework defines a principled process to ensure robust, reliable, and responsible systems.
arXiv Detail & Related papers (2021-01-11T15:54:48Z)
- Technology Readiness Levels for AI & ML [79.22051549519989]
Development of machine learning systems can be executed easily with modern tools, but the process is typically rushed and means-to-an-end.
Engineering systems follow well-defined processes and testing standards to streamline development for high-quality, reliable results.
We propose a proven systems engineering approach for machine learning development and deployment.
arXiv Detail & Related papers (2020-06-21T17:14:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.