Breaking New Ground in Software Defect Prediction: Introducing Practical and Actionable Metrics with Superior Predictive Power for Enhanced Decision-Making
- URL: http://arxiv.org/abs/2508.04408v1
- Date: Wed, 06 Aug 2025 12:52:13 GMT
- Title: Breaking New Ground in Software Defect Prediction: Introducing Practical and Actionable Metrics with Superior Predictive Power for Enhanced Decision-Making
- Authors: Carlos Andrés Ramírez Cataño, Makoto Itoh
- Abstract summary: This paper explores automated software defect prediction at the method level based on the developers' coding habits. We present a systematic approach to forecasting defect-prone software methods via a human error framework.
- Score: 0.8287206589886879
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Software defect prediction using code metrics has been extensively researched over the past five decades. However, prediction that harnesses non-software metrics is under-researched. Considering that the root cause of software defects is often attributed to human error, human factors theory might offer key forecasting metrics for actionable insights. This paper explores automated software defect prediction at the method level based on the developers' coding habits. First, we propose a framework for deciding which metrics to use for prediction. Next, we compare the performance of our metrics to that of the code and commit history metrics shown by research to achieve the highest performance to date. Finally, we analyze the prediction importance of each metric. As a result of our analyses of twenty-one critical infrastructure large-scale open-source software projects, we present: (1) a human error-based framework with metrics useful for defect prediction at the method level; (2) models using our proposed metrics that achieve better average prediction performance than the state-of-the-art code metrics and history measures; (3) evidence that prediction importance is distributed differently across all metrics, with each of the novel metrics having better average importance than the code and history metrics; (4) novel metrics that dramatically enhance the explainability, practicality, and actionability of software defect prediction models, significantly advancing the field. We present a systematic approach to forecasting defect-prone software methods via a human error framework. This work empowers practitioners to act on predictions, empirically demonstrating how developer coding habits contribute to defects in software systems.
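As an illustrative sketch of the kind of pipeline the abstract describes, one can train a classifier on per-method features and rank them by prediction importance. The feature names below are hypothetical stand-ins for the habit, code, and history metrics; the paper's actual metric set and data are not reproduced here, and the data is synthetic.

```python
# Sketch: method-level defect prediction with per-metric importances.
# Feature names are hypothetical placeholders, not the paper's metrics.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
feature_names = ["commit_hour_variance", "cyclomatic_complexity", "past_fix_count"]

# One row per method; synthetic features and a synthetic defect label.
X = rng.random((200, len(feature_names)))
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(0, 0.2, 200) > 0.9).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Rank metrics by their (impurity-based) prediction importance.
for name, imp in sorted(zip(feature_names, clf.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```

In practice the labels would come from linking bug-fix commits to methods, and importance analysis (impurity-based here, but permutation importance or SHAP are common alternatives) is what turns a black-box predictor into the actionable ranking of metrics the abstract emphasizes.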
Related papers
- Position: All Current Generative Fidelity and Diversity Metrics are Flawed [58.815519650465774]
We show that all current generative fidelity and diversity metrics are flawed. Our aim is to convince the research community to spend more effort in developing metrics, instead of models.
arXiv Detail & Related papers (2025-05-28T15:10:33Z)
- Object Oriented-Based Metrics to Predict Fault Proneness in Software Design [1.194799054956877]
We look at the relationship between object-oriented software metrics and their implications on fault proneness. Studies indicate that object-oriented metrics are indeed a good predictor of software fault proneness.
arXiv Detail & Related papers (2025-04-11T03:29:15Z)
- Predicting post-release defects with knowledge units (KUs) of programming languages: an empirical study [25.96111422428881]
Defect prediction plays a crucial role in software engineering, enabling developers to identify defect-prone code and improve software quality. To address this gap, we introduce Knowledge Units (KUs) of programming languages as a novel feature set for analyzing software systems and defect prediction. A KU is a cohesive set of key capabilities that are offered by one or more building blocks of a given programming language.
arXiv Detail & Related papers (2024-12-03T23:22:06Z)
- Learning-Augmented Algorithms with Explicit Predictors [67.02156211760415]
Recent advances in algorithmic design show how to utilize predictions obtained by machine learning models from past and present data.
Prior research in this context was focused on a paradigm where the predictor is pre-trained on past data and then used as a black box.
In this work, we unpack the predictor and integrate the learning problem it gives rise to within the algorithmic challenge.
arXiv Detail & Related papers (2024-03-12T08:40:21Z)
- Method-Level Bug Severity Prediction using Source Code Metrics and LLMs [0.628122931748758]
We investigate source code metrics, source code representation using large language models (LLMs), and their combination in predicting bug severity labels.
Our results suggest that Decision Tree and Random Forest models outperform other models regarding our several evaluation metrics.
CodeBERT fine-tuning improves bug severity prediction significantly, in the range of 29%-140% across several evaluation metrics.
arXiv Detail & Related papers (2023-09-06T14:38:07Z)
- Defect Prediction Using Stylistic Metrics [2.286041284499166]
This paper aims to analyze the impact of stylistic metrics on both within-project and cross-project defect prediction.
Experiments are conducted on 14 releases of 5 popular open-source projects.
arXiv Detail & Related papers (2022-06-22T10:11:05Z)
- Non-Clairvoyant Scheduling with Predictions Revisited [77.86290991564829]
In non-clairvoyant scheduling, the task is to find an online strategy for scheduling jobs with a priori unknown processing requirements.
We revisit this well-studied problem in a recently popular learning-augmented setting that integrates (untrusted) predictions in algorithm design.
We show that these predictions have desired properties, admit a natural error measure as well as algorithms with strong performance guarantees.
arXiv Detail & Related papers (2022-02-21T13:18:11Z)
- Learning Predictions for Algorithms with Predictions [49.341241064279714]
We introduce a general design approach for algorithms that learn predictors.
We apply techniques from online learning to learn against adversarial instances, tune robustness-consistency trade-offs, and obtain new statistical guarantees.
We demonstrate the effectiveness of our approach at deriving learning algorithms by analyzing methods for bipartite matching, page migration, ski-rental, and job scheduling.
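Ski-rental is the standard toy problem in this learning-augmented line of work. The sketch below illustrates one common deterministic scheme from the literature: trust a long-season prediction by buying early, distrust a short-season prediction by buying late, with a parameter lambda trading consistency (cost when the prediction is accurate) against robustness (bounded harm when it is not). The exact thresholds here are an illustration of the idea, not the specific algorithm of any one cited paper.

```python
import math

def ski_rental_cost(buy_price: int, true_days: int, predicted_days: int,
                    lam: float = 0.5) -> int:
    """Total cost (rent = 1 per day) of a prediction-augmented strategy.

    If the prediction says the season is long (>= buy_price), buy early
    on day ceil(lam * buy_price); otherwise buy late, on day
    ceil(buy_price / lam). Smaller lam trusts the prediction more.
    """
    if predicted_days >= buy_price:
        buy_day = math.ceil(lam * buy_price)
    else:
        buy_day = math.ceil(buy_price / lam)
    if true_days < buy_day:
        return true_days              # rented every day, never bought
    return (buy_day - 1) + buy_price  # rented until buy_day, then bought

# Accurate "long season" prediction: buy on day 5, pay 4 rentals + 10.
print(ski_rental_cost(buy_price=10, true_days=100, predicted_days=100))
```

With lam = 0.5 and an accurate long-season prediction the cost is 14 against an optimal 10, within the 1 + lam consistency bound; a wrong prediction delays the purchase but the loss stays bounded by roughly 1 + 1/lam times optimal.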
arXiv Detail & Related papers (2022-02-18T17:25:43Z)
- Robustification of Online Graph Exploration Methods [59.50307752165016]
We study a learning-augmented variant of the classical, notoriously hard online graph exploration problem.
We propose an algorithm that naturally integrates predictions into the well-known Nearest Neighbor (NN) algorithm.
arXiv Detail & Related papers (2021-12-10T10:02:31Z)
- Machine Learning Techniques for Software Quality Assurance: A Survey [5.33024001730262]
We discuss various approaches in both fault prediction and test case prioritization.
Recent studies show that deep learning algorithms for fault prediction help to bridge the gap between programs' semantics and fault prediction features.
arXiv Detail & Related papers (2021-04-29T00:37:27Z)
- Counterfactual Predictions under Runtime Confounding [74.90756694584839]
We study the counterfactual prediction task in the setting where all relevant factors are captured in the historical data.
We propose a doubly-robust procedure for learning counterfactual prediction models in this setting.
arXiv Detail & Related papers (2020-06-30T15:49:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.