ML-Enabled Systems Model Deployment and Monitoring: Status Quo and
Problems
- URL: http://arxiv.org/abs/2402.05333v1
- Date: Thu, 8 Feb 2024 00:25:30 GMT
- Title: ML-Enabled Systems Model Deployment and Monitoring: Status Quo and
Problems
- Authors: Eduardo Zimelewicz, Marcos Kalinowski, Daniel Mendez, Görkem Giray,
Antonio Pedro Santos Alves, Niklas Lavesson, Kelly Azevedo, Hugo Villamizar,
Tatiana Escovedo, Helio Lopes, Stefan Biffl, Juergen Musil, Michael Felderer,
Stefan Wagner, Teresa Baldassarre, Tony Gorschek
- Abstract summary: We conducted an international survey to gather practitioner insights on how ML-enabled systems are engineered.
We analyzed the status quo and problems reported for the model deployment and monitoring phases.
Our results provide a better understanding of the practices adopted in industry and the problems practitioners face.
- Score: 7.280443300122617
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: [Context] Systems incorporating Machine Learning (ML) models, often called
ML-enabled systems, have become commonplace. However, empirical evidence on how
ML-enabled systems are engineered in practice is still limited, especially for
activities surrounding ML model dissemination. [Goal] We investigate
contemporary industrial practices and problems related to ML model
dissemination, focusing on the model deployment and the monitoring of ML life
cycle phases. [Method] We conducted an international survey to gather
practitioner insights on how ML-enabled systems are engineered. We gathered a
total of 188 complete responses from 25 countries. We analyzed the status quo
and problems reported for the model deployment and monitoring phases. We
analyzed contemporary practices using bootstrapping with confidence intervals
and conducted qualitative analyses of the reported problems, applying open and
axial coding procedures. [Results] Practitioners perceive the model deployment
and monitoring phases as relevant and difficult. With respect to model
deployment, models are typically deployed as separate services, with limited
adoption of MLOps principles. Reported problems include difficulties in
designing the architecture of the infrastructure for production deployment and
legacy application integration. Concerning model monitoring, many models in
production are not monitored. The main monitored aspects are inputs, outputs,
and decisions. Reported problems involve the absence of monitoring practices,
the need to create custom monitoring tools, and the selection of suitable
metrics. [Conclusion] Our results provide a better understanding of the
practices adopted in industry and the problems practitioners face, and can
help guide ML deployment and monitoring research in a problem-driven manner.
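The survey's findings lend themselves to a small illustration. Models "deployed as separate services", with inputs and outputs as the main monitored aspects, typically means a model wrapped behind an HTTP endpoint that logs every request and response. The sketch below is only a minimal example of that pattern, assuming a Flask app, a pickled model file, and a JSON log sink, none of which come from the paper.

```python
# Minimal sketch of the "model deployed as a separate service" pattern the
# survey reports, with input/output logging as a monitoring hook. The model
# path, endpoint, and log format are illustrative assumptions, not taken
# from the paper.
import json
import logging
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)
logging.basicConfig(filename="predictions.log", level=logging.INFO)

with open("model.pkl", "rb") as f:  # hypothetical serialized model
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]
    prediction = model.predict([features])[0]
    # Log inputs and outputs -- the two aspects most commonly monitored
    # according to the survey respondents.
    logging.info(json.dumps({"input": features, "output": float(prediction)}))
    return jsonify({"prediction": float(prediction)})

if __name__ == "__main__":
    app.run(port=8080)
```

In a real deployment the log stream would feed a monitoring pipeline rather than a local file.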
Related papers
- Towards Trustworthy Machine Learning in Production: An Overview of the Robustness in MLOps Approach [0.0]
In recent years, AI researchers and practitioners have introduced principles and guidelines to build systems that make reliable and trustworthy decisions.
In practice, a fundamental challenge arises when the system needs to be operationalized and deployed so that it can continuously evolve and operate in real-life environments.
To address this challenge, Machine Learning Operations (MLOps) has emerged as a potential recipe for standardizing ML solutions in deployment.
arXiv Detail & Related papers (2024-10-28T09:34:08Z)
- MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs [55.20845457594977]
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making.
We present MR-Ben, a process-based benchmark that demands meta-reasoning skill.
Our meta-reasoning paradigm is especially suited for system-2 slow thinking.
arXiv Detail & Related papers (2024-06-20T03:50:23Z)
- Using Quality Attribute Scenarios for ML Model Test Case Generation [3.9111051646728527]
Current practice for machine learning (ML) model testing prioritizes testing for model performance.
This paper presents an approach based on quality attribute (QA) scenarios to elicit and define system- and model-relevant test cases.
The QA-based approach has been integrated into MLTE, a process and tool to support ML model test and evaluation.
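As a loose illustration of what quality-attribute scenario tests might look like (the paper's MLTE tooling is not shown here; the predict() stub, latency budget, and perturbation size are hypothetical):

```python
# Hypothetical sketch of quality-attribute (QA) scenario tests in pytest.
# Beyond raw accuracy, scenarios bound attributes such as inference latency
# and robustness. The predict() stub and thresholds are illustrative only.
import time

def predict(features):
    return sum(features) > 1.0  # stand-in for a real model

def test_latency_scenario():
    # Scenario: under a single-request stimulus, the model responds within
    # a 50 ms latency budget.
    start = time.perf_counter()
    predict([0.2, 0.5, 0.9])
    elapsed_ms = (time.perf_counter() - start) * 1000
    assert elapsed_ms < 50, f"latency budget exceeded: {elapsed_ms:.1f} ms"

def test_robustness_scenario():
    # Scenario: a small perturbation of the input must not flip the output.
    base = [0.2, 0.5, 0.9]
    perturbed = [x + 0.001 for x in base]
    assert predict(base) == predict(perturbed)
```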
arXiv Detail & Related papers (2024-06-12T18:26:42Z)
- Naming the Pain in Machine Learning-Enabled Systems Engineering [8.092979562919878]
Machine learning (ML)-enabled systems are being increasingly adopted by companies.
This paper aims to deliver a comprehensive overview of the current status quo of engineering ML-enabled systems.
arXiv Detail & Related papers (2024-05-20T06:59:20Z)
- Characterization of Large Language Model Development in the Datacenter [55.9909258342639]
Large Language Models (LLMs) have demonstrated impressive performance across several transformative tasks.
However, it is non-trivial to efficiently utilize large-scale cluster resources to develop LLMs.
We present an in-depth characterization study of a six-month LLM development workload trace collected from our GPU datacenter Acme.
arXiv Detail & Related papers (2024-03-12T13:31:14Z)
- A Review of Physics-Informed Machine Learning Methods with Applications to Condition Monitoring and Anomaly Detection [1.124958340749622]
PIML is the incorporation of known physical laws and constraints into machine learning algorithms.
This study presents a comprehensive overview of PIML techniques in the context of condition monitoring.
arXiv Detail & Related papers (2024-01-22T11:29:44Z)
- Status Quo and Problems of Requirements Engineering for Machine Learning: Results from an International Survey [7.164324501049983]
Requirements Engineering (RE) can help address many problems when engineering Machine Learning-enabled systems.
We conducted a survey to gather practitioner insights into the status quo and problems of RE in ML-enabled systems.
We found significant differences in RE practices within ML projects.
arXiv Detail & Related papers (2023-10-10T15:53:50Z)
- Simultaneous Machine Translation with Large Language Models [51.470478122113356]
We investigate the possibility of applying Large Language Models to SimulMT tasks.
We conducted experiments using the Llama2-7b-chat model on nine different languages from the MuST-C dataset.
The results show that the LLM outperforms dedicated MT models in terms of BLEU and LAAL metrics.
arXiv Detail & Related papers (2023-09-13T04:06:47Z)
- Panoramic Learning with A Standardized Machine Learning Formalism [116.34627789412102]
This paper presents a standardized equation of the learning objective that offers a unifying understanding of diverse ML algorithms.
It also provides guidance for the mechanical design of new ML solutions and serves as a promising vehicle towards panoramic learning with all experiences.
arXiv Detail & Related papers (2021-08-17T17:44:38Z)
- Technology Readiness Levels for Machine Learning Systems [107.56979560568232]
Development and deployment of machine learning systems can be executed easily with modern tools, but the process is typically rushed and treated as a means to an end.
We have developed a proven systems engineering approach for machine learning development and deployment.
Our "Machine Learning Technology Readiness Levels" framework defines a principled process to ensure robust, reliable, and responsible systems.
arXiv Detail & Related papers (2021-01-11T15:54:48Z)
- Monitoring and explainability of models in production [58.720142291102135]
Monitoring deployed models is crucial for the continued provision of high-quality machine-learning-enabled services.
We discuss the challenges to successfully implementing monitoring and explainability solutions, with recent examples of production-ready solutions built on open-source tools.
arXiv Detail & Related papers (2020-07-13T10:37:05Z)
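As a concrete companion to the monitoring-focused papers above, here is a minimal sketch of one common monitoring check: comparing the distribution of a production input feature against its training-time reference with a two-sample Kolmogorov-Smirnov test. The synthetic data and the 0.05 significance threshold are illustrative assumptions, not taken from any of the listed papers.

```python
# Minimal sketch of input-drift monitoring: compare the distribution of a
# production feature against its training-time reference with a two-sample
# Kolmogorov-Smirnov test. The synthetic data and 0.05 threshold are
# illustrative assumptions only.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
reference = rng.normal(loc=0.0, scale=1.0, size=5_000)   # training-time data
production = rng.normal(loc=0.4, scale=1.0, size=1_000)  # shifted live inputs

statistic, p_value = ks_2samp(reference, production)
if p_value < 0.05:
    print(f"Drift suspected (KS={statistic:.3f}, p={p_value:.4f})")
else:
    print(f"No drift detected (KS={statistic:.3f}, p={p_value:.4f})")
```

Real deployments would typically wrap such a check in a scheduled job and alert on sustained drift rather than a single test.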