Quality issues in Machine Learning Software Systems
- URL: http://arxiv.org/abs/2208.08982v2
- Date: Mon, 22 Aug 2022 17:43:10 GMT
- Title: Quality issues in Machine Learning Software Systems
- Authors: Pierre-Olivier Côté, Amin Nikanjam, Rached Bouchoucha, Foutse
Khomh
- Abstract summary: This paper aims to investigate the characteristics of real quality issues in MLSSs from the viewpoint of practitioners.
We expect that the catalog of issues developed at this step will also help us later to identify the severity, root causes, and possible remedy for quality issues of MLSSs.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Context: An increasing demand is observed in various domains to employ
Machine Learning (ML) for solving complex problems. ML models are implemented
as software components and deployed in Machine Learning Software Systems
(MLSSs). Problem: There is a strong need for ensuring the serving quality of
MLSSs. False or poor decisions of such systems can lead to malfunction of other
systems, significant financial losses, or even threats to human life. The
quality assurance of MLSSs is considered a challenging task and is currently a
hot research topic. Moreover, it is important to cover all the various aspects
of quality in MLSSs. Objective: This paper aims to investigate the
characteristics of real quality issues in MLSSs from the viewpoint of
practitioners. This empirical study aims to identify a catalog of bad-practices
related to poor quality in MLSSs. Method: We plan to conduct a set of
interviews with practitioners/experts, believing that interviews are the best
method to retrieve their experience and practices when dealing with quality
issues. We expect that the catalog of issues developed at this step will also
help us later to identify the severity, root causes, and possible remedy for
quality issues of MLSSs, allowing us to develop efficient quality assurance
tools for ML models and MLSSs.
Related papers
- Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More? [54.667202878390526]
Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases.
We introduce LOFT, a benchmark of real-world tasks requiring context up to millions of tokens designed to evaluate LCLMs' performance on in-context retrieval and reasoning.
Our findings reveal LCLMs' surprising ability to rival state-of-the-art retrieval and RAG systems, despite never having been explicitly trained for these tasks.
arXiv Detail & Related papers (2024-06-19T00:28:58Z) - Characterization of Large Language Model Development in the Datacenter [55.9909258342639]
Large Language Models (LLMs) have presented impressive performance across several transformative tasks.
However, it is non-trivial to efficiently utilize large-scale cluster resources to develop LLMs.
We present an in-depth characterization study of a six-month LLM development workload trace collected from our GPU datacenter Acme.
arXiv Detail & Related papers (2024-03-12T13:31:14Z) - Competition-Level Problems are Effective LLM Evaluators [121.15880285283116]
This paper aims to evaluate the reasoning capacities of large language models (LLMs) in solving recent programming problems in Codeforces.
We first provide a comprehensive evaluation of GPT-4's perceived zero-shot performance on this task, considering various aspects such as problems' release time, difficulties, and types of errors encountered.
Surprisingly, the perceived performance of GPT-4 has experienced a cliff-like decline on problems released after September 2021, consistently across all difficulties and types of problems.
arXiv Detail & Related papers (2023-12-04T18:58:57Z) - Status Quo and Problems of Requirements Engineering for Machine
Learning: Results from an International Survey [7.164324501049983]
Requirements Engineering (RE) can help address many problems when engineering Machine Learning-enabled systems.
We conducted a survey to gather practitioner insights into the status quo and problems of RE in ML-enabled systems.
We found significant differences in RE practices within ML projects.
arXiv Detail & Related papers (2023-10-10T15:53:50Z) - Towards Self-Adaptive Machine Learning-Enabled Systems Through QoS-Aware
Model Switching [1.2277343096128712]
We propose the concept of a Machine Learning Model Balancer, focusing on managing uncertainties related to ML models by using multiple models.
AdaMLS is a novel self-adaptation approach that leverages this concept and extends the traditional MAPE-K loop for continuous MLS adaptation.
Preliminary results suggest AdaMLS surpasses naive and single state-of-the-art models in meeting system guarantees.
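The model-switching idea behind this paper can be sketched as the plan step of a MAPE-K-style loop: monitor the current QoS, then pick the most accurate model that still fits the latency budget. The sketch below is a minimal illustration under assumed names and numbers, not the actual AdaMLS implementation.

```python
# Minimal sketch of QoS-aware model switching (MAPE-K plan step).
# Model names, latencies, and accuracies are hypothetical illustrations.
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    expected_latency_ms: float
    expected_accuracy: float


def select_model(models, latency_budget_ms):
    """Pick the most accurate model that fits the latency budget."""
    feasible = [m for m in models if m.expected_latency_ms <= latency_budget_ms]
    if not feasible:
        # Degrade gracefully: fall back to the fastest model available.
        return min(models, key=lambda m: m.expected_latency_ms)
    return max(feasible, key=lambda m: m.expected_accuracy)


models = [
    Model("small", expected_latency_ms=5, expected_accuracy=0.88),
    Model("medium", expected_latency_ms=20, expected_accuracy=0.93),
    Model("large", expected_latency_ms=80, expected_accuracy=0.96),
]

# A monitor/analyze step would update the budget from observed QoS;
# here two plan decisions are shown under different budgets.
print(select_model(models, latency_budget_ms=25).name)  # medium
print(select_model(models, latency_budget_ms=2).name)   # small (fallback)
```

In a running system this selection would be re-evaluated continuously as observed latency and accuracy drift, which is what distinguishes a self-adaptive loop from a one-time deployment choice.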
arXiv Detail & Related papers (2023-08-19T09:33:51Z) - A Survey on Evaluation of Large Language Models [87.60417393701331]
Large language models (LLMs) are gaining increasing popularity in both academia and industry.
This paper focuses on three key dimensions: what to evaluate, where to evaluate, and how to evaluate.
arXiv Detail & Related papers (2023-07-06T16:28:35Z) - Quality Issues in Machine Learning Software Systems [10.797981721308226]
There is a strong need for ensuring the serving quality of Machine Learning Software Systems.
This paper aims to investigate the characteristics of real quality issues in MLSSs from the viewpoint of practitioners.
We identify 18 recurring quality issues and 24 strategies to mitigate them.
arXiv Detail & Related papers (2023-06-26T18:46:46Z) - How Can Recommender Systems Benefit from Large Language Models: A Survey [82.06729592294322]
Large language models (LLM) have shown impressive general intelligence and human-like capabilities.
We conduct a comprehensive survey on this research direction from the perspective of the whole pipeline in real-world recommender systems.
arXiv Detail & Related papers (2023-06-09T11:31:50Z) - Quality Assurance Challenges for Machine Learning Software Applications
During Software Development Life Cycle Phases [1.4213973379473654]
The paper conducts an in-depth review of literature on the quality assurance of Machine Learning (ML) models.
We develop a taxonomy of MLSA quality assurance issues by mapping the various ML adoption challenges across different phases of software development life cycles (SDLC).
This mapping can help prioritize quality assurance efforts of MLSAs where the adoption of ML models can be considered crucial.
arXiv Detail & Related papers (2021-05-03T22:29:23Z) - Understanding the Usability Challenges of Machine Learning In
High-Stakes Decision Making [67.72855777115772]
Machine learning (ML) is being applied to a diverse and ever-growing set of domains.
In many cases, domain experts -- who often have no expertise in ML or data science -- are asked to use ML predictions to make high-stakes decisions.
We investigate the ML usability challenges present in the domain of child welfare screening through a series of collaborations with child welfare screeners.
arXiv Detail & Related papers (2021-03-02T22:50:45Z) - Towards Guidelines for Assessing Qualities of Machine Learning Systems [1.715032913622871]
This article presents the construction of a quality model for an ML system based on an industrial use case.
In the future, we want to learn how the term quality differs between different types of ML systems.
arXiv Detail & Related papers (2020-08-25T13:45:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.