Building Defect Prediction Models by Online Learning Considering Defect Overlooking
- URL: http://arxiv.org/abs/2404.11033v1
- Date: Wed, 17 Apr 2024 03:20:46 GMT
- Title: Building Defect Prediction Models by Online Learning Considering Defect Overlooking
- Authors: Nikolay Fedorov, Yuta Yamasaki, Masateru Tsunoda, Akito Monden, Amjed Tahir, Kwabena Ebo Bennin, Koji Toda, Keitaro Nakasai
- Abstract summary: Building defect prediction models based on online learning can enhance prediction accuracy.
A module predicted as "non-defective" can receive fewer test cases, so a defective module may be overlooked during testing.
Such erroneous test results are then used as learning data by online learning, which could negatively affect prediction accuracy.
- Score: 1.5869998695491834
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Building defect prediction models based on online learning can enhance prediction accuracy. Such a model is continuously rebuilt whenever a new data point is added. However, modules predicted as "non-defective" can receive fewer test cases. Thus, a defective module can be overlooked during testing. The erroneous test results are used as learning data by online learning, which could negatively affect prediction accuracy. To suppress this negative influence, we propose applying a method that fixes the prediction as positive during the initial stage of online learning. Additionally, we improved the method to consider the probability of overlooking. In our experiment, we demonstrate the negative influence on prediction accuracy and the effectiveness of our approach. The results show that our approach did not negatively affect AUC but significantly improved recall.
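Below is a minimal sketch of the setting and the proposed fix, assuming synthetic module metrics and scikit-learn's SGDClassifier as the online learner; the data generator and the parameters WARMUP and OVERLOOK_PROB are illustrative assumptions rather than the paper's experimental setup, and the extension that accounts for the estimated probability of overlooking is not shown.

```python
# Minimal sketch (not the authors' implementation) of online defect prediction
# with simulated defect overlooking and a "predict positive during the initial
# stage" fix. The data generator and parameters are illustrative assumptions.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

def make_module(defective_rate=0.3):
    """Generate one synthetic module: five metrics plus the true defect label."""
    y = rng.random() < defective_rate
    x = rng.normal(loc=1.0 if y else 0.0, scale=1.0, size=5)
    return x, int(y)

WARMUP = 30          # initial stage: predictions are fixed as "defective"
OVERLOOK_PROB = 0.7  # chance a defect is missed when the module is predicted negative

model = SGDClassifier(random_state=0)
classes = np.array([0, 1])
tp = fp = fn = tn = 0

for t in range(500):
    x, y_true = make_module()
    x = x.reshape(1, -1)

    # Prediction: fixed to positive during the warm-up stage.
    if t < WARMUP:
        y_pred = 1
    else:
        y_pred = int(model.predict(x)[0])

    # Observed label: a defect in a negatively predicted module may be
    # overlooked during testing and recorded as non-defective.
    y_obs = y_true
    if y_true == 1 and y_pred == 0 and rng.random() < OVERLOOK_PROB:
        y_obs = 0

    # Online learning: update the model with the (possibly erroneous) label.
    model.partial_fit(x, [y_obs], classes=classes)

    tp += (y_pred == 1) and (y_true == 1)
    fp += (y_pred == 1) and (y_true == 0)
    fn += (y_pred == 0) and (y_true == 1)
    tn += (y_pred == 0) and (y_true == 0)

recall = tp / max(tp + fn, 1)
precision = tp / max(tp + fp, 1)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

The intent of the warm-up stage is that early modules are tested as if defective, so their labels are less likely to be corrupted by overlooking before the model's negative predictions start reducing test effort.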
Related papers
- Online Classification with Predictions [20.291598040396302]
We study online classification when the learner has access to predictions about future examples.
We show that if the learner is always guaranteed to observe data where future examples are easily predictable, then online learning can be as easy as transductive online learning.
arXiv Detail & Related papers (2024-05-22T23:45:33Z) - The Impact of Defect (Re) Prediction on Software Testing [1.5869998695491834]
Cross-project defect prediction (CPDP) uses data from external projects, since historical data may not be available from the same project.
A Bandit Algorithm (BA) based approach has been proposed in prior research to select the most suitable learning project; a minimal illustrative sketch of this idea appears after this list.
This study aims to improve the BA method to reduce defect overlooking, especially during the early testing stages.
arXiv Detail & Related papers (2024-04-17T03:34:13Z) - Towards Causal Deep Learning for Vulnerability Detection [31.59558109518435]
We introduce do-calculus-based causal learning to software engineering models.
Our results show that CausalVul consistently improved the model accuracy, robustness and OOD performance.
arXiv Detail & Related papers (2023-10-12T00:51:06Z) - EANet: Expert Attention Network for Online Trajectory Prediction [5.600280639034753]
Expert Attention Network is a complete online learning framework for trajectory prediction.
We introduce expert attention, which adjusts the weights of network layers at different depths, preventing the model from updating slowly due to gradient problems.
Furthermore, we propose a short-term motion trend kernel function which is sensitive to scenario change, allowing the model to respond quickly.
arXiv Detail & Related papers (2023-09-11T07:09:40Z) - Software Defect Prediction by Online Learning Considering Defect Overlooking [1.655352281097533]
Building defect prediction models based on online learning can enhance prediction accuracy.
It continuously rebuilds the prediction model whenever a new data point is added.
However, predicting a module as "non-defective" (i.e., negative prediction) can result in fewer test cases for such modules.
arXiv Detail & Related papers (2023-08-25T15:02:22Z) - Learning Sample Difficulty from Pre-trained Models for Reliable Prediction [55.77136037458667]
We propose to utilize large-scale pre-trained models to guide downstream model training with sample difficulty-aware entropy regularization.
We simultaneously improve accuracy and uncertainty calibration across challenging benchmarks.
arXiv Detail & Related papers (2023-04-20T07:29:23Z) - On the Role of Negative Precedent in Legal Outcome Prediction [65.30798081417115]
Legal outcome prediction, i.e., the prediction of a positive outcome, is an increasingly popular task in AI.
We turn our focus to negative outcomes here, and introduce a new task of negative outcome prediction.
We discover an asymmetry in existing models' ability to predict positive and negative outcomes.
We develop two new models inspired by the dynamics of a court process.
arXiv Detail & Related papers (2022-08-17T11:12:50Z) - Agree to Disagree: Diversity through Disagreement for Better Transferability [54.308327969778155]
We propose D-BAT (Diversity-By-disAgreement Training), which enforces agreement among the models on the training data but disagreement on out-of-distribution data.
We show how D-BAT naturally emerges from the notion of generalized discrepancy.
arXiv Detail & Related papers (2022-02-09T12:03:02Z) - Conformal prediction for the design problem [72.14982816083297]
In many real-world deployments of machine learning, we use a prediction algorithm to choose what data to test next.
In such settings, there is a distinct type of distribution shift between the training and test data.
We introduce a method to quantify predictive uncertainty in such settings.
arXiv Detail & Related papers (2022-02-08T02:59:12Z) - Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z) - Positive-Congruent Training: Towards Regression-Free Model Updates [87.25247195148187]
In image classification, sample-wise inconsistencies appear as "negative flips": a new model incorrectly predicts the output for a test sample that was correctly classified by the old (reference) model.
We propose a simple approach for PC training, Focal Distillation, which enforces congruence with the reference model.
arXiv Detail & Related papers (2020-11-18T09:00:44Z)
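As referenced in the cross-project defect prediction entry above, the following is a minimal epsilon-greedy sketch of bandit-based selection of a learning project; the project names, the placeholder prediction and feedback functions, and the reward definition are hypothetical stand-ins, not the cited BA method itself.

```python
# Minimal sketch (an assumption, not the cited BA method) of bandit-based
# selection of a source project for cross-project defect prediction:
# an epsilon-greedy bandit favors the external project whose model has
# produced the most correct predictions on the target project so far.
import random

random.seed(0)

projects = ["project_A", "project_B", "project_C"]    # hypothetical source projects
counts = {p: 0 for p in projects}                     # times each project's model was used
rewards = {p: 0.0 for p in projects}                  # correct predictions observed per project
EPSILON = 0.1

def predict_with(project, module):
    """Placeholder: predict defectiveness with a model trained on `project`."""
    return random.random() < 0.5  # stand-in for a real defect prediction model

def was_correct(prediction, module):
    """Placeholder: compare the prediction with the test result of `module`."""
    return random.random() < (0.7 if prediction else 0.5)  # synthetic feedback

for step, module in enumerate(range(100)):
    if step < len(projects) or random.random() < EPSILON:
        chosen = random.choice(projects)  # explore
    else:
        chosen = max(projects, key=lambda p: rewards[p] / max(counts[p], 1))  # exploit

    pred = predict_with(chosen, module)
    counts[chosen] += 1
    rewards[chosen] += was_correct(pred, module)

best = max(projects, key=lambda p: rewards[p] / max(counts[p], 1))
print("selected learning project:", best)
```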
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information listed here and is not responsible for any consequences of its use.