Software Defect Prediction by Online Learning Considering Defect
Overlooking
- URL: http://arxiv.org/abs/2308.13582v1
- Date: Fri, 25 Aug 2023 15:02:22 GMT
- Title: Software Defect Prediction by Online Learning Considering Defect
Overlooking
- Authors: Yuta Yamasaki, Nikolay Fedorov, Masateru Tsunoda, Akito Monden, Amjed
Tahir, Kwabena Ebo Bennin, Koji Toda, Keitaro Nakasai
- Abstract summary: Building defect prediction models based on online learning can enhance prediction accuracy.
It continuously rebuilds the prediction model each time a new data point is added.
However, predicting a module as "non-defective" (i.e., negative prediction) can result in fewer test cases for such modules.
- Score: 1.655352281097533
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Building defect prediction models based on online learning can enhance
prediction accuracy. It continuously rebuilds the prediction model each time a
new data point is added. However, predicting a module as "non-defective" (i.e.,
negative prediction) can result in fewer test cases for such modules.
Therefore, defects can be overlooked during testing, even when the module is
defective. The erroneous test results are used as learning data by online
learning, which could negatively affect prediction accuracy. In our experiment,
we demonstrate this negative influence on prediction accuracy.
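To make the feedback loop concrete, below is a minimal sketch (not the authors' code) of online defect prediction in which a "non-defective" prediction reduces testing effort, so some real defects are overlooked and the erroneous labels are fed back to the learner. It assumes scikit-learn's SGDClassifier with partial_fit; the module metrics, data stream, and detection probabilities are invented for illustration.

```python
# Minimal sketch: how overlooked defects can feed erroneous labels back into an
# online defect prediction model. All data and probabilities are hypothetical.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
model = SGDClassifier(loss="log_loss", random_state=0)

# Hypothetical stream of modules: 5 metric features, roughly 20% truly defective.
X_stream = rng.normal(size=(500, 5))
y_true = (X_stream[:, 0] + rng.normal(scale=0.5, size=500) > 1.0).astype(int)

P_FIND_IF_TESTED_LIGHTLY = 0.3  # assumed chance of finding a defect with reduced testing
P_FIND_IF_TESTED_FULLY = 0.9    # assumed chance with normal testing effort

# Warm start with verified labels, then learn online from test outcomes.
model.partial_fit(X_stream[:50], y_true[:50], classes=[0, 1])

overlooked = 0
for x, y in zip(X_stream[50:], y_true[50:]):
    pred = model.predict(x.reshape(1, -1))[0]
    # A "non-defective" prediction reduces testing effort for that module...
    p_find = P_FIND_IF_TESTED_FULLY if pred == 1 else P_FIND_IF_TESTED_LIGHTLY
    observed = int(y == 1 and rng.random() < p_find)  # defect found (or not) in testing
    overlooked += int(y == 1 and observed == 0)
    # ...and the possibly erroneous test result becomes the next training label.
    model.partial_fit(x.reshape(1, -1), [observed])

print(f"defective modules whose defects were overlooked: {overlooked}")
```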
Related papers
- Online Classification with Predictions [20.291598040396302]
We study online classification when the learner has access to predictions about future examples.
We show that if the learner is always guaranteed to observe data where future examples are easily predictable, then online learning can be as easy as transductive online learning.
arXiv Detail & Related papers (2024-05-22T23:45:33Z)
- The Impact of Defect (Re) Prediction on Software Testing [1.5869998695491834]
Cross-project defect prediction (CPDP) uses data from external projects when historical data from the same project is not available.
A Bandit Algorithm (BA) based approach has been proposed in prior research to select the most suitable learning project.
This study aims to improve the BA method to reduce defect overlooking, especially during the early testing stages (a rough epsilon-greedy sketch of the selection idea appears after this list).
arXiv Detail & Related papers (2024-04-17T03:34:13Z)
- Building Defect Prediction Models by Online Learning Considering Defect Overlooking [1.5869998695491834]
Building defect prediction models based on online learning can enhance prediction accuracy.
Predicting a module as "non-defective" can result in fewer test cases for that module.
Erroneous test results are used as learning data by online learning, which could negatively affect prediction accuracy.
arXiv Detail & Related papers (2024-04-17T03:20:46Z)
- Why does Prediction Accuracy Decrease over Time? Uncertain Positive Learning for Cloud Failure Prediction [35.058991707881646]
We find that the prediction accuracy may decrease by about 9% after retraining the models.
The mitigation actions may produce uncertain positive instances, since they cannot be verified after mitigation, which may introduce more noise when updating the prediction model.
To tackle this problem, we design an Uncertain Positive Learning Risk Estimator (Uptake) approach.
arXiv Detail & Related papers (2024-01-08T03:13:09Z)
- Learning Sample Difficulty from Pre-trained Models for Reliable Prediction [55.77136037458667]
We propose to utilize large-scale pre-trained models to guide downstream model training with sample difficulty-aware entropy regularization.
We simultaneously improve accuracy and uncertainty calibration across challenging benchmarks.
arXiv Detail & Related papers (2023-04-20T07:29:23Z)
- T-Cal: An optimal test for the calibration of predictive models [49.11538724574202]
We consider detecting mis-calibration of predictive models using a finite validation dataset as a hypothesis testing problem.
Detecting mis-calibration is only possible when the conditional probabilities of the classes are sufficiently smooth functions of the predictions.
We propose T-Cal, a minimax test for calibration based on a de-biased plug-in estimator of the $\ell_2$-Expected Calibration Error (ECE).
arXiv Detail & Related papers (2022-03-03T16:58:54Z)
- Conformal prediction for the design problem [72.14982816083297]
In many real-world deployments of machine learning, we use a prediction algorithm to choose what data to test next.
In such settings, there is a distinct type of distribution shift between the training and test data.
We introduce a method to quantify predictive uncertainty in such settings.
arXiv Detail & Related papers (2022-02-08T02:59:12Z)
- Churn Reduction via Distillation [54.5952282395487]
We show an equivalence between training with distillation using the base model as the teacher and training with an explicit constraint on the predictive churn.
We then show that distillation performs strongly for low-churn training against a number of recent baselines.
arXiv Detail & Related papers (2021-06-04T18:03:31Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probability estimates.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Positive-Congruent Training: Towards Regression-Free Model Updates [87.25247195148187]
In image classification, sample-wise inconsistencies appear as "negative flips": a new model incorrectly predicts the output for a test sample that was correctly classified by the old (reference) model.
We propose a simple approach for PC training, Focal Distillation, which enforces congruence with the reference model.
arXiv Detail & Related papers (2020-11-18T09:00:44Z)
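As a rough illustration of the bandit-based source-project selection mentioned in "The Impact of Defect (Re) Prediction on Software Testing" above, the sketch below uses a simple epsilon-greedy rule. It is not the BA method from that paper; the project names, reward definition, and epsilon value are assumptions made for the example.

```python
# Hypothetical epsilon-greedy sketch of selecting a source project for
# cross-project defect prediction; not the BA method from the cited paper.
import random

def select_project(rewards: dict[str, list[float]], epsilon: float = 0.1) -> str:
    """Pick a source project: explore with probability epsilon, else exploit."""
    if random.random() < epsilon or not any(rewards.values()):
        return random.choice(list(rewards))
    # Exploit: project whose past predictions matched test outcomes best on average.
    return max(rewards, key=lambda p: sum(rewards[p]) / max(len(rewards[p]), 1))

# Assumed usage: after testing a module, record 1 if the prediction made with the
# chosen project's model was confirmed by the test result, else 0.
rewards = {"projectA": [], "projectB": [], "projectC": []}
chosen = select_project(rewards)
rewards[chosen].append(1.0)  # feedback from the latest test result
```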