The Untold Impact of Learning Approaches on Software Fault-Proneness Predictions
- URL: http://arxiv.org/abs/2207.05710v1
- Date: Tue, 12 Jul 2022 17:31:55 GMT
- Title: The Untold Impact of Learning Approaches on Software Fault-Proneness Predictions
- Authors: Mohammad Jamil Ahmad, Katerina Goseva-Popstojanova and Robyn R. Lutz
- Abstract summary: This paper explores the effects of two learning approaches, useAllPredictAll and usePrePredictPost, on the performance of software fault-proneness prediction.
Using useAllPredictAll leads to significantly better performance than using usePrePredictPost, both within-release and across-releases.
- Score: 2.01747440427135
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Software fault-proneness prediction is an active research area, with many
factors affecting prediction performance extensively studied. However, the
impact of the learning approach (i.e., the specifics of the data used for
training and the target variable being predicted) on the prediction performance
has not been studied, except for one initial work. This paper explores the
effects of two learning approaches, useAllPredictAll and usePrePredictPost, on
the performance of software fault-proneness prediction, both within-release and
across-releases. The empirical results are based on data extracted from 64
releases of twelve open-source projects. Results show that the learning
approach has a substantial, and typically unacknowledged, impact on the
classification performance. Specifically, using useAllPredictAll leads to
significantly better performance than using the usePrePredictPost learning
approach, both within-release and across-releases. Furthermore, this paper
uncovers that, for within-release predictions, this difference in
classification performance is due to different levels of class imbalance in the
two learning approaches. When class imbalance is addressed, the performance
difference between the learning approaches is eliminated. Our findings imply
that the learning approach should always be explicitly identified and its
impact on software fault-proneness prediction considered. The paper concludes
with a discussion of potential consequences of our results for both research
and practice.
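The abstract treats the learning approach, that is, the data used for training and the target variable being predicted, as the key experimental factor, so a small sketch may help make the distinction concrete. The Python snippet below is a minimal illustration, not the paper's setup: it assumes that useAllPredictAll labels a file as fault-prone if it has any recorded fault while usePrePredictPost uses post-release faults only, and the column names, synthetic data, random-forest classifier, and class_weight-based rebalancing are all illustrative choices.
```python
# Hypothetical sketch of a within-release comparison between the two
# learning approaches. Assumptions (not from the paper): useAllPredictAll
# labels a file as fault-prone if it has any recorded fault, while
# usePrePredictPost labels it by post-release faults only.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def make_labels(df: pd.DataFrame, approach: str) -> pd.Series:
    """Derive the binary fault-proneness target under each learning approach."""
    if approach == "useAllPredictAll":
        # Assumed semantics: any pre- or post-release fault marks the file fault-prone.
        y = (df["pre_release_faults"] + df["post_release_faults"]) > 0
    elif approach == "usePrePredictPost":
        # Assumed semantics: only post-release faults define the target,
        # which typically yields far fewer positives (more class imbalance).
        y = df["post_release_faults"] > 0
    else:
        raise ValueError(f"unknown learning approach: {approach}")
    return y.astype(int)

def within_release_f1(df, feature_cols, approach, balance=False):
    """Mean 5-fold F1 for one release under the given learning approach."""
    X = df[feature_cols].to_numpy()
    y = make_labels(df, approach).to_numpy()
    clf = RandomForestClassifier(
        n_estimators=200,
        class_weight="balanced" if balance else None,  # one simple way to address imbalance
        random_state=0,
    )
    return cross_val_score(clf, X, y, cv=5, scoring="f1").mean()

if __name__ == "__main__":
    # Synthetic stand-in for one release's file-level metrics and fault counts;
    # real studies use many more code and process metrics as predictors.
    rng = np.random.default_rng(0)
    df = pd.DataFrame({
        "loc": rng.integers(10, 1000, 300),
        "complexity": rng.integers(1, 50, 300),
        "pre_release_faults": rng.poisson(0.8, 300),
        "post_release_faults": rng.poisson(0.2, 300),
    })
    features = ["loc", "complexity"]
    for approach in ("useAllPredictAll", "usePrePredictPost"):
        for balance in (False, True):
            f1 = within_release_f1(df, features, approach, balance)
            print(f"{approach:>18s}  balanced={balance!s:5s}  F1={f1:.3f}")
```
Running both settings side by side mirrors the kind of comparison reported in the abstract: the two approaches produce targets with different proportions of fault-prone files, and the balance flag shows where a rebalancing step (here via class weights) might close the resulting performance gap.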
Related papers
- Learning-Augmented Algorithms with Explicit Predictors [67.02156211760415]
Recent advances in algorithmic design show how to utilize predictions obtained by machine learning models from past and present data.
Prior research in this context was focused on a paradigm where the predictor is pre-trained on past data and then used as a black box.
In this work, we unpack the predictor and integrate the learning problem it gives rise to within the algorithmic challenge.
arXiv Detail & Related papers (2024-03-12T08:40:21Z)
- Variance of ML-based software fault predictors: are we really improving fault prediction? [0.3222802562733786]
We experimentally analyze the variance of a state-of-the-art fault prediction approach.
We observed a maximum variance of 10.10% in terms of the per-class accuracy metric.
arXiv Detail & Related papers (2023-10-26T09:31:32Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- Prediction of Dilatory Behavior in eLearning: A Comparison of Multiple Machine Learning Models [0.2963240482383777]
Procrastination, the irrational delay of tasks, is a common occurrence in online learning.
Research focusing on such predictions is scarce.
Studies involving different types of predictors and comparisons between the predictive performance of various methods are virtually non-existent.
arXiv Detail & Related papers (2022-06-30T07:24:08Z)
- Learning Predictions for Algorithms with Predictions [49.341241064279714]
We introduce a general design approach for algorithms that learn predictors.
We apply techniques from online learning to learn against adversarial instances, tune robustness-consistency trade-offs, and obtain new statistical guarantees.
We demonstrate the effectiveness of our approach at deriving learning algorithms by analyzing methods for bipartite matching, page migration, ski-rental, and job scheduling.
arXiv Detail & Related papers (2022-02-18T17:25:43Z)
- Double Robust Representation Learning for Counterfactual Prediction [68.78210173955001]
We propose a novel scalable method to learn double-robust representations for counterfactual predictions.
We make robust and efficient counterfactual predictions for both individual and average treatment effects.
The algorithm shows competitive performance with the state-of-the-art on real world and synthetic data.
arXiv Detail & Related papers (2020-10-15T16:39:26Z)
- A framework for predicting, interpreting, and improving Learning Outcomes [0.0]
We develop an Embibe Score Quotient model (ESQ) to predict test scores based on observed academic, behavioral and test-taking features of a student.
ESQ can be used to predict the future scoring potential of a student as well as offer personalized learning nudges.
arXiv Detail & Related papers (2020-10-06T11:22:27Z)
- Predicting MOOCs Dropout Using Only Two Easily Obtainable Features from the First Week's Activities [56.1344233010643]
Several features are considered to contribute towards learner attrition or lack of interest, which may lead to disengagement or total dropout.
This study aims to predict dropout early-on, from the first week, by comparing several machine-learning approaches.
arXiv Detail & Related papers (2020-08-12T10:44:49Z)
- Counterfactual Predictions under Runtime Confounding [74.90756694584839]
We study the counterfactual prediction task in the setting where all relevant factors are captured in the historical data.
We propose a doubly-robust procedure for learning counterfactual prediction models in this setting.
arXiv Detail & Related papers (2020-06-30T15:49:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.