Detecting Concept Drift for the reliability prediction of Software
Defects using Instance Interpretation
- URL: http://arxiv.org/abs/2305.16323v1
- Date: Sat, 6 May 2023 07:50:12 GMT
- Title: Detecting Concept Drift for the reliability prediction of Software
Defects using Instance Interpretation
- Authors: Zeynab Chitsazian, Saeed Sedighian Kashi, Amin Nikanjam
- Abstract summary: Concept drift (CD) can occur due to changes in the software development process, the complexity of the software, or changes in user behavior.
We aim to develop a reliable JIT-SDP model by detecting CD points directly, i.e., by identifying changes in the interpretation of unlabeled simplified and resampled data.
- Score: 4.039245878626346
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In Just-In-Time Software Defect Prediction (JIT-SDP), concept drift (CD) can occur
due to changes in the software development process, the complexity of the software, or changes
in user behavior, any of which may affect the stability of the JIT-SDP model over time. In
addition, class imbalance in JIT-SDP data poses a potential risk to the accuracy of CD detection
methods if rebalancing is applied; to the best of our knowledge, this issue has not been
explored. Methods have been proposed that check the stability of JIT-SDP models over time using
labeled evaluation data, but future data labels may not always be available promptly. We aim to
develop a reliable JIT-SDP model by detecting CD points directly, identifying changes in the
interpretation of unlabeled simplified and resampled data. To evaluate our approach, we first
derived baseline methods that identify CD points on labeled data by monitoring model
performance. We then compared the output of the proposed methods with baselines based on
threshold-dependent and threshold-independent performance criteria, using measures that are
standard in CD detection, such as accuracy, MDR, MTD, MTFA, and MTR. We also used the Friedman
statistical test to assess the effectiveness of our approach. Our proposed methods show higher
agreement with baselines based on threshold-independent criteria when applied to rebalanced
data, and with baselines based on threshold-dependent criteria when applied to simple data.
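
The abstract states the detection procedure only at a high level, so the following is a minimal sketch of the general idea rather than the paper's implementation: per-instance feature contributions of a simple JIT-SDP classifier (coefficient times feature value, used here as a stand-in for the paper's instance interpretation) are averaged over chronological windows of unlabeled commits, and a CD point is flagged when the interpretation profile drifts too far from a reference window. The function names, the cosine-distance score, the window chunking, and the threshold are all illustrative assumptions.

```python
# Minimal sketch (not the paper's implementation) of interpretation-based,
# unsupervised CD point detection for a JIT-SDP model.
import numpy as np
from sklearn.linear_model import LogisticRegression


def instance_interpretation(model: LogisticRegression, X: np.ndarray) -> np.ndarray:
    # Per-instance feature contributions (coefficient * feature value),
    # a simple stand-in for a SHAP-style instance interpretation.
    return X * model.coef_.ravel()


def detect_cd_points(model, windows, threshold=0.3):
    # `windows` is a list of arrays, each holding the (unlabeled) feature
    # vectors of one chronological chunk of commits. A window is flagged as a
    # CD point when its mean interpretation vector moves away from the
    # reference window by more than `threshold` in cosine distance.
    ref = instance_interpretation(model, windows[0]).mean(axis=0)
    cd_points = []
    for i, X in enumerate(windows[1:], start=1):
        cur = instance_interpretation(model, X).mean(axis=0)
        cos = np.dot(ref, cur) / (np.linalg.norm(ref) * np.linalg.norm(cur) + 1e-12)
        if 1.0 - cos > threshold:
            cd_points.append(i)
            ref = cur  # re-anchor the reference after a detected drift
    return cd_points
```

The Friedman comparison mentioned in the abstract could be run with scipy.stats.friedmanchisquare over the per-dataset scores of the competing detectors.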
Related papers
- Bisimulation metric for Model Predictive Control [44.301098448479195]
Bisimulation Metric for Model Predictive Control (BS-MPC) is a novel approach that incorporates bisimulation metric loss in its objective function to directly optimize the encoder.
BS-MPC improves training stability, robustness against input noise, and computational efficiency by reducing training time.
We evaluate BS-MPC on both continuous control and image-based tasks from the DeepMind Control Suite.
arXiv Detail & Related papers (2024-10-06T17:12:10Z) - Source-Free Domain-Invariant Performance Prediction [68.39031800809553]
We propose a source-free approach centred on uncertainty-based estimation, using a generative model for calibration in the absence of source data.
Our experiments on benchmark object recognition datasets reveal that existing source-based methods fall short with limited source sample availability.
Our approach significantly outperforms the current state-of-the-art source-free and source-based methods, affirming its effectiveness in domain-invariant performance estimation.
arXiv Detail & Related papers (2024-08-05T03:18:58Z) - Stratified Prediction-Powered Inference for Hybrid Language Model Evaluation [62.2436697657307]
Prediction-powered inference (PPI) is a method that improves statistical estimates based on limited human-labeled data.
We propose a method called Stratified Prediction-Powered Inference (StratPPI).
We show that the basic PPI estimates can be considerably improved by employing simple data stratification strategies (a sketch of the basic PPI estimator appears after this list).
arXiv Detail & Related papers (2024-06-06T17:37:39Z) - Guiding Pseudo-labels with Uncertainty Estimation for Test-Time
Adaptation [27.233704767025174]
Test-Time Adaptation (TTA) is a specific case of Unsupervised Domain Adaptation (UDA) where a model is adapted to a target domain without access to source data.
We propose a novel approach for the TTA setting based on a loss reweighting strategy that brings robustness against the noise that inevitably affects the pseudo-labels.
arXiv Detail & Related papers (2023-03-07T10:04:55Z) - MAPS: A Noise-Robust Progressive Learning Approach for Source-Free
Domain Adaptive Keypoint Detection [76.97324120775475]
Cross-domain keypoint detection methods always require accessing the source data during adaptation.
This paper considers source-free domain adaptive keypoint detection, where only the well-trained source model is provided to the target domain.
arXiv Detail & Related papers (2023-02-09T12:06:08Z) - Estimating Model Performance under Domain Shifts with Class-Specific
Confidence Scores [25.162667593654206]
We introduce class-wise calibration within the framework of performance estimation for imbalanced datasets.
We conduct experiments on four tasks and find the proposed modifications consistently improve the estimation accuracy for imbalanced datasets.
arXiv Detail & Related papers (2022-07-20T15:04:32Z) - Robust Anytime Learning of Markov Decision Processes [8.799182983019557]
In data-driven applications, deriving precise probabilities from limited data introduces statistical errors.
Uncertain MDPs (uMDPs) do not require precise probabilities but instead use so-called uncertainty sets in the transitions.
We propose a robust anytime-learning approach that combines a dedicated Bayesian inference scheme with the computation of robust policies.
arXiv Detail & Related papers (2022-05-31T14:29:55Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold (a minimal sketch appears after this list).
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - How Training Data Impacts Performance in Learning-based Control [67.7875109298865]
This paper derives an analytical relationship between the density of the training data and the control performance.
We formulate a quality measure for the data set, which we refer to as the $\rho$-gap.
We show how the $\rho$-gap can be applied to a feedback linearizing control law.
arXiv Detail & Related papers (2020-05-25T12:13:49Z) - Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out-of-distribution data points at test time with a single forward pass.
We scale training with a novel loss function and centroid-updating scheme, and match the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.