The Untold Impact of Learning Approaches on Software Fault-Proneness Predictions
- URL: http://arxiv.org/abs/2207.05710v1
- Date: Tue, 12 Jul 2022 17:31:55 GMT
- Title: The Untold Impact of Learning Approaches on Software Fault-Proneness Predictions
- Authors: Mohammad Jamil Ahmad, Katerina Goseva-Popstojanova and Robyn R. Lutz
- Abstract summary: This paper explores the effects of two learning approaches, useAllPredictAll and usePrePredictPost, on the performance of software fault-proneness prediction.
Using useAllPredictAll leads to significantly better performance than using usePrePredictPost, both within-release and across-releases.
- Score: 2.01747440427135
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Software fault-proneness prediction is an active research area, with many
factors affecting prediction performance extensively studied. However, the
impact of the learning approach (i.e., the specifics of the data used for
training and the target variable being predicted) on the prediction performance
has not been studied, except for one initial work. This paper explores the
effects of two learning approaches, useAllPredictAll and usePrePredictPost, on
the performance of software fault-proneness prediction, both within-release and
across-releases. The empirical results are based on data extracted from 64
releases of twelve open-source projects. Results show that the learning
approach has a substantial, and typically unacknowledged, impact on the
classification performance. Specifically, using useAllPredictAll leads to
significantly better performance than using the usePrePredictPost learning
approach, both within-release and across-releases. Furthermore, this paper
uncovers that, for within-release predictions, this difference in
classification performance is due to different levels of class imbalance in the
two learning approaches. When class imbalance is addressed, the performance
difference between the learning approaches is eliminated. Our findings imply
that the learning approach should always be explicitly identified and its
impact on software fault-proneness prediction considered. The paper concludes
with a discussion of potential consequences of our results for both research
and practice.
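The abstract treats the learning approach, that is, the data used for training and the target variable being predicted, as the key experimental factor, so a small sketch may help make the distinction concrete. The Python snippet below is a minimal illustration, not the paper's setup: it assumes that useAllPredictAll labels a file as fault-prone if it has any recorded fault while usePrePredictPost uses post-release faults only, and the column names, synthetic data, random-forest classifier, and class_weight-based rebalancing are all illustrative choices.
```python
# Hypothetical sketch of a within-release comparison between the two
# learning approaches. Assumptions (not from the paper): useAllPredictAll
# labels a file as fault-prone if it has any recorded fault, while
# usePrePredictPost labels it by post-release faults only.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def make_labels(df: pd.DataFrame, approach: str) -> pd.Series:
    """Derive the binary fault-proneness target under each learning approach."""
    if approach == "useAllPredictAll":
        # Assumed semantics: any pre- or post-release fault marks the file fault-prone.
        y = (df["pre_release_faults"] + df["post_release_faults"]) > 0
    elif approach == "usePrePredictPost":
        # Assumed semantics: only post-release faults define the target,
        # which typically yields far fewer positives (more class imbalance).
        y = df["post_release_faults"] > 0
    else:
        raise ValueError(f"unknown learning approach: {approach}")
    return y.astype(int)

def within_release_f1(df, feature_cols, approach, balance=False):
    """Mean 5-fold F1 for one release under the given learning approach."""
    X = df[feature_cols].to_numpy()
    y = make_labels(df, approach).to_numpy()
    clf = RandomForestClassifier(
        n_estimators=200,
        class_weight="balanced" if balance else None,  # one simple way to address imbalance
        random_state=0,
    )
    return cross_val_score(clf, X, y, cv=5, scoring="f1").mean()

if __name__ == "__main__":
    # Synthetic stand-in for one release's file-level metrics and fault counts;
    # real studies use many more code and process metrics as predictors.
    rng = np.random.default_rng(0)
    df = pd.DataFrame({
        "loc": rng.integers(10, 1000, 300),
        "complexity": rng.integers(1, 50, 300),
        "pre_release_faults": rng.poisson(0.8, 300),
        "post_release_faults": rng.poisson(0.2, 300),
    })
    features = ["loc", "complexity"]
    for approach in ("useAllPredictAll", "usePrePredictPost"):
        for balance in (False, True):
            f1 = within_release_f1(df, features, approach, balance)
            print(f"{approach:>18s}  balanced={balance!s:5s}  F1={f1:.3f}")
```
Running both settings side by side mirrors the kind of comparison reported in the abstract: the two approaches produce targets with different proportions of fault-prone files, and the balance flag shows where a rebalancing step (here via class weights) might close the resulting performance gap.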
Related papers
- Learning-Augmented Algorithms with Explicit Predictors [67.02156211760415]
Recent advances in algorithmic design show how to utilize predictions obtained by machine learning models from past and present data.
Prior research in this context was focused on a paradigm where the predictor is pre-trained on past data and then used as a black box.
In this work, we unpack the predictor and integrate the learning problem it gives rise to within the algorithmic challenge.
arXiv Detail & Related papers (2024-03-12T08:40:21Z)
- Variance of ML-based software fault predictors: are we really improving fault prediction? [0.3222802562733786]
We experimentally analyze the variance of a state-of-the-art fault prediction approach.
We observed a maximum variance of 10.10% in terms of the per-class accuracy metric.
arXiv Detail & Related papers (2023-10-26T09:31:32Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- Prediction of Dilatory Behavior in eLearning: A Comparison of Multiple Machine Learning Models [0.2963240482383777]
Procrastination, the irrational delay of tasks, is a common occurrence in online learning.
Research focusing on such predictions is scarce.
Studies involving different types of predictors and comparisons between the predictive performance of various methods are virtually non-existent.
arXiv Detail & Related papers (2022-06-30T07:24:08Z)
- Learning Predictions for Algorithms with Predictions [49.341241064279714]
We introduce a general design approach for algorithms that learn predictors.
We apply techniques from online learning to learn against adversarial instances, tune robustness-consistency trade-offs, and obtain new statistical guarantees.
We demonstrate the effectiveness of our approach at deriving learning algorithms by analyzing methods for bipartite matching, page migration, ski-rental, and job scheduling.
arXiv Detail & Related papers (2022-02-18T17:25:43Z)
- Double Robust Representation Learning for Counterfactual Prediction [68.78210173955001]
We propose a novel scalable method to learn double-robust representations for counterfactual predictions.
We make robust and efficient counterfactual predictions for both individual and average treatment effects.
The algorithm shows competitive performance with the state-of-the-art on real world and synthetic data.
arXiv Detail & Related papers (2020-10-15T16:39:26Z)
- A framework for predicting, interpreting, and improving Learning Outcomes [0.0]
We develop an Embibe Score Quotient model (ESQ) to predict test scores based on observed academic, behavioral and test-taking features of a student.
ESQ can be used to predict the future scoring potential of a student as well as offer personalized learning nudges.
arXiv Detail & Related papers (2020-10-06T11:22:27Z)
- Predicting MOOCs Dropout Using Only Two Easily Obtainable Features from the First Week's Activities [56.1344233010643]
Several features are considered to contribute towards learner attrition or lack of interest, which may lead to disengagement or total dropout.
This study aims to predict dropout early-on, from the first week, by comparing several machine-learning approaches.
arXiv Detail & Related papers (2020-08-12T10:44:49Z)
- Counterfactual Predictions under Runtime Confounding [74.90756694584839]
We study the counterfactual prediction task in the setting where all relevant factors are captured in the historical data.
We propose a doubly-robust procedure for learning counterfactual prediction models in this setting.
arXiv Detail & Related papers (2020-06-30T15:49:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.