"Project smells" -- Experiences in Analysing the Software Quality of ML
Projects with mllint
- URL: http://arxiv.org/abs/2201.08246v1
- Date: Thu, 20 Jan 2022 15:52:24 GMT
- Title: "Project smells" -- Experiences in Analysing the Software Quality of ML
Projects with mllint
- Authors: Bart van Oort, Luís Cruz, Babak Loni, Arie van Deursen
- Abstract summary: We introduce the novel concept of project smells, which considers deficits in project management as a more holistic perspective on software quality.
An open-source static analysis tool, mllint, was also implemented to help detect and mitigate these smells.
Our findings indicate a need for context-aware static analysis tools that fit the needs of the project at its current stage of development.
- Score: 6.0141405230309335
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Machine Learning (ML) projects incur novel challenges in their development
and productionisation compared to traditional software applications, though
established principles and best practices for ensuring a project's software
quality still apply. While using static analysis to catch code smells has been
shown to improve software quality attributes, it is only a small piece of the
software quality puzzle, especially for ML projects, given their
additional challenges and the lower degree of Software Engineering (SE) experience
among the data scientists who develop them. We introduce the novel concept of
project smells, which considers deficits in project management as a more holistic
perspective on software quality in ML projects. An open-source static analysis
tool, mllint, was also implemented to help detect and mitigate these smells. Our
research evaluates this novel concept of project smells in the industrial
context of ING, a global bank and large software- and data-intensive
organisation. We also investigate the perceived importance of these project
smells for proof-of-concept versus production-ready ML projects, as well as the
perceived obstructions and benefits of using static analysis tools such as
mllint. Our findings indicate a need for context-aware static analysis tools
that fit the needs of the project at its current stage of development, while
requiring minimal configuration effort from the user.
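As an illustration of what such a project-smell check could look like, here is a minimal, hypothetical Python sketch in the spirit of mllint; the rule names and checks are assumptions for illustration, not mllint's actual implementation:

```python
from pathlib import Path

# Hypothetical project-smell rules in the spirit of mllint: each rule flags a
# project-management deficit rather than a code-level smell.
RULES = {
    "version-control": lambda root: (root / ".git").is_dir(),
    "dependency-management": lambda root: any(
        (root / f).is_file() for f in ("requirements.txt", "pyproject.toml", "setup.py")
    ),
    "testing": lambda root: (root / "tests").is_dir(),
    "continuous-integration": lambda root: (root / ".github" / "workflows").is_dir()
    or (root / ".gitlab-ci.yml").is_file(),
}

def lint_project(path: str) -> list[str]:
    """Return the names of all project smells detected under `path`."""
    root = Path(path)
    return [name for name, passes in RULES.items() if not passes(root)]

if __name__ == "__main__":
    for smell in lint_project("."):
        print(f"project smell: missing {smell}")
```

A context-aware version, as the abstract argues for, would enable or relax rules depending on whether the project is a proof of concept or production-ready.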
Related papers
- Agent-Driven Automatic Software Improvement [55.2480439325792]
This research proposal aims to explore innovative solutions by focusing on the deployment of agents powered by Large Language Models (LLMs).
The iterative nature of agents, which allows for continuous learning and adaptation, can help overcome common challenges in code generation.
We aim to use the iterative feedback in these systems to further fine-tune the LLMs underlying the agents, aligning them better with the task of automated software improvement.
arXiv Detail & Related papers (2024-06-24T15:45:22Z) - How do Machine Learning Projects use Continuous Integration Practices? An Empirical Study on GitHub Actions [1.5197353881052764]
We conduct a comprehensive analysis of 185 open-source projects on GitHub (93 ML and 92 non-ML projects).
Our investigation comprises both quantitative and qualitative dimensions, aiming to uncover differences in CI adoption between ML and non-ML projects.
Our findings indicate that ML projects often require longer build durations, and medium-sized ML projects exhibit lower test coverage compared to non-ML projects.
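For context, measurements like build duration can be collected from the GitHub Actions REST API; the following is a hedged sketch of that kind of analysis (the repository name is a placeholder, and a token may be needed to avoid rate limits), not the study's actual tooling:

```python
import requests
from datetime import datetime

def build_durations(owner: str, repo: str, token: str | None = None) -> list[float]:
    """Fetch recent GitHub Actions runs and return their durations in seconds."""
    url = f"https://api.github.com/repos/{owner}/{repo}/actions/runs"
    headers = {"Authorization": f"Bearer {token}"} if token else {}
    runs = requests.get(url, headers=headers, timeout=30).json().get("workflow_runs", [])
    durations = []
    for run in runs:
        # Timestamps arrive as ISO 8601 with a trailing "Z" (UTC).
        start = datetime.fromisoformat(run["run_started_at"].rstrip("Z"))
        end = datetime.fromisoformat(run["updated_at"].rstrip("Z"))
        durations.append((end - start).total_seconds())
    return durations

if __name__ == "__main__":
    secs = build_durations("octocat", "Hello-World")  # placeholder repository
    if secs:
        print(f"mean build duration: {sum(secs) / len(secs):.1f}s over {len(secs)} runs")
```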
arXiv Detail & Related papers (2024-03-14T16:35:39Z) - DevBench: A Comprehensive Benchmark for Software Development [72.24266814625685]
DevBench is a benchmark that evaluates large language models (LLMs) across various stages of the software development lifecycle.
Empirical studies show that current LLMs, including GPT-4-Turbo, fail to solve the challenges presented within DevBench.
Our findings offer actionable insights for the future development of LLMs toward real-world programming applications.
arXiv Detail & Related papers (2024-03-13T15:13:44Z) - Quantum Computing Enhanced Service Ecosystem for Simulation in Manufacturing [56.61654656648898]
We propose a framework for a quantum computing-enhanced service ecosystem for simulation in manufacturing.
We analyse two high-value use cases, aiming at a quantitative evaluation of these new computing paradigms in industrially relevant settings.
arXiv Detail & Related papers (2024-01-19T11:04:14Z) - Code Ownership in Open-Source AI Software Security [18.779538756226298]
We use code ownership metrics to investigate the correlation with latent vulnerabilities across five prominent open-source AI software projects.
The findings suggest a positive relationship between high-level ownership (characterised by a limited number of minor contributors) and a decrease in vulnerabilities.
With these novel code ownership metrics, we have implemented a Python-based command-line application to aid project curators and quality assurance professionals in evaluating and benchmarking their on-site projects.
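A minimal sketch of how such ownership metrics might be computed from git history (a common approach in the literature; the 5% "minor contributor" cut-off is an assumption, not necessarily the paper's exact definition):

```python
import subprocess
from collections import Counter

def ownership_metrics(repo: str, path: str, minor_threshold: float = 0.05) -> dict:
    """Compute simple code-ownership metrics for one file from its commit authors."""
    authors = subprocess.run(
        ["git", "-C", repo, "log", "--format=%an", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    commits = Counter(authors)
    if not commits:
        return {"contributors": 0, "minor_contributors": 0, "top_ownership": 0.0}
    total = sum(commits.values())
    # Minor contributors: authors of less than `minor_threshold` of the commits.
    minor = [a for a, n in commits.items() if n / total < minor_threshold]
    return {
        "contributors": len(commits),
        "minor_contributors": len(minor),
        "top_ownership": max(commits.values()) / total,  # "high-level ownership"
    }

if __name__ == "__main__":
    print(ownership_metrics(".", "README.md"))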
arXiv Detail & Related papers (2023-12-18T00:37:29Z) - An Empirical Study of Self-Admitted Technical Debt in Machine Learning Software [17.999512016809945]
Self-admitted technical debt (SATD) can have a significant impact on the quality of machine learning-based software.
This paper aims to investigate SATD in ML code by analyzing 318 open-source ML projects across five domains, along with 318 non-ML projects.
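SATD is commonly mined from source comments via keyword patterns; a minimal, hypothetical detector of that kind is sketched below (the pattern set is illustrative, and real studies use richer patterns or trained classifiers):

```python
import re

# Keyword patterns commonly associated with self-admitted technical debt.
SATD_PATTERN = re.compile(
    r"#.*\b(TODO|FIXME|HACK|XXX|workaround|temporary)\b", re.IGNORECASE
)

def find_satd(source: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs whose comments look like SATD."""
    return [
        (i, line.strip())
        for i, line in enumerate(source.splitlines(), 1)
        if SATD_PATTERN.search(line)
    ]

example = """
x = load_data()  # TODO: validate schema before training
y = x * 2  # HACK: magic scaling until we fix normalisation
"""
for lineno, comment in find_satd(example):
    print(lineno, comment)
```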
arXiv Detail & Related papers (2023-11-20T18:56:36Z) - Using Machine Learning To Identify Software Weaknesses From Software
Requirement Specifications [49.1574468325115]
This research focuses on finding an efficient machine learning algorithm to identify software weaknesses from requirement specifications.
Keywords extracted using latent semantic analysis help map the CWE categories to PROMISE_exp. Naive Bayes, support vector machine (SVM), decision trees, neural network, and convolutional neural network (CNN) algorithms were tested.
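A plausible reconstruction of that pipeline with scikit-learn, using TF-IDF plus truncated SVD for LSA and a linear SVM as one of the tested classifiers; the toy requirements and CWE labels below are placeholders, not the paper's data:

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.svm import LinearSVC

# Placeholder requirement texts and CWE labels for illustration only.
requirements = [
    "The system shall store user passwords securely.",
    "The system shall accept file uploads from users.",
    "Passwords must be reset every 90 days.",
    "Uploaded files shall be scanned before storage.",
]
cwe_labels = ["CWE-256", "CWE-434", "CWE-256", "CWE-434"]

model = make_pipeline(
    TfidfVectorizer(),
    TruncatedSVD(n_components=2),  # LSA; real work would use far more components
    LinearSVC(),
)
model.fit(requirements, cwe_labels)
print(model.predict(["Users shall upload profile pictures."]))
```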
arXiv Detail & Related papers (2023-08-10T13:19:10Z) - Reasonable Scale Machine Learning with Open-Source Metaflow [2.637746074346334]
We argue that re-purposing existing tools won't solve the current productivity issues.
We introduce Metaflow, an open-source framework for ML projects explicitly designed to boost the productivity of data practitioners.
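For reference, a minimal flow based on Metaflow's public `FlowSpec`/`@step` API might look like the following sketch; the steps themselves are placeholders:

```python
from metaflow import FlowSpec, step

class TrainFlow(FlowSpec):
    """A minimal linear flow: start -> train -> end."""

    @step
    def start(self):
        # Placeholder data; attributes on self are versioned as artifacts.
        self.data = [1, 2, 3]
        self.next(self.train)

    @step
    def train(self):
        # Placeholder "training"; real work would fit a model here.
        self.model = sum(self.data)
        self.next(self.end)

    @step
    def end(self):
        print("trained model artifact:", self.model)

if __name__ == "__main__":
    TrainFlow()
```

Saved as `train_flow.py`, this would typically be run with `python train_flow.py run`, with each step's artifacts tracked between executions.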
arXiv Detail & Related papers (2023-03-21T11:28:09Z) - Exploring the potential of flow-based programming for machine learning
deployment in comparison with service-oriented architectures [8.677012233188968]
We argue that part of the reason is infrastructure that was not designed for activities around data collection and analysis.
We propose to consider flow-based programming with data streams as an alternative to commonly used service-oriented architectures for building software applications.
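A toy sketch of the flow-based idea, with independent components connected only by data streams (queues here) instead of request/response service calls; this illustrates the paradigm, not the paper's proposed infrastructure:

```python
import queue
import threading

def producer(out_q: queue.Queue):
    # Emit records onto the stream; None signals end of stream.
    for record in ({"x": 1}, {"x": 2}, None):
        out_q.put(record)

def transformer(in_q: queue.Queue, out_q: queue.Queue):
    # Consume upstream records, emit transformed ones downstream.
    while (record := in_q.get()) is not None:
        out_q.put({"x_squared": record["x"] ** 2})
    out_q.put(None)

def sink(in_q: queue.Queue):
    while (record := in_q.get()) is not None:
        print("received:", record)

# Wire the components together purely via data streams.
q1, q2 = queue.Queue(), queue.Queue()
for target, args in ((producer, (q1,)), (transformer, (q1, q2)), (sink, (q2,))):
    threading.Thread(target=target, args=args).start()
```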
arXiv Detail & Related papers (2021-08-09T15:06:02Z) - Technology Readiness Levels for Machine Learning Systems [107.56979560568232]
Development and deployment of machine learning systems can be executed easily with modern tools, but the process is typically rushed and treated as a means to an end.
We have developed a proven systems engineering approach for machine learning development and deployment.
Our "Machine Learning Technology Readiness Levels" framework defines a principled process to ensure robust, reliable, and responsible systems.
arXiv Detail & Related papers (2021-01-11T15:54:48Z) - Technology Readiness Levels for AI & ML [79.22051549519989]
Development of machine learning systems can be executed easily with modern tools, but the process is typically rushed and treated as a means to an end.
Engineering systems follow well-defined processes and testing standards to streamline development for high-quality, reliable results.
We propose a proven systems engineering approach for machine learning development and deployment.
arXiv Detail & Related papers (2020-06-21T17:14:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.