Effective Fault Localization using Probabilistic and Grouping Approach
- URL: http://arxiv.org/abs/2403.05022v1
- Date: Fri, 8 Mar 2024 03:55:09 GMT
- Title: Effective Fault Localization using Probabilistic and Grouping Approach
- Authors: Saksham Sahai Srivastava, Arpita Dutta, Rajib Mall
- Abstract summary: The aim of this paper is to use the concept of conditional probability to design an effective fault localization technique.
We present a fault localization technique that derives the association between statement coverage information and the test case execution result.
We evaluate the effectiveness of the proposed method on eleven open-source data sets.
- Score: 0.7673339435080445
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Context: Fault localization (FL) is a key activity in debugging a
program. Any improvement to this activity can significantly reduce the
total software development cost. There is an intrinsic link between the
program spectrum and the test execution result. Conditional probability in
statistics captures the probability of one event occurring in relation to
one or more other events. Objectives: The aim of this paper is to use the
concept of conditional probability to design an effective fault localization
technique. Methods: We present a fault localization technique
that derives the association between statement coverage information and the test
case execution result using conditional probability statistics. The strength of
this association with failed test case results indicates the probability that a
specific statement contains the fault. Subsequently, we use a grouping method to
refine the obtained statement ranking for better fault localization. Results: We
evaluated the effectiveness of the proposed method on eleven open-source data
sets. Our results show that, on average, the proposed CGFL method is
24.56% more effective than other contemporary fault localization methods such
as D*, Tarantula, Ochiai, Crosstab, BPNN, RBFNN, DNN, and CNN. Conclusion: We
devised an effective fault localization technique by combining the conditional
probabilistic method with a failed test case execution-based approach. Our
experimental evaluation shows that the proposed method outperforms the existing
fault localization techniques.
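The abstract does not give the exact CGFL formulas, so the following is only a minimal sketch of the general idea under stated assumptions. It estimates P(test fails | statement executed) from a binary coverage matrix, then applies a simple grouping rule (statements executed by more failed tests are ranked first, with ties broken by the conditional probability). The function name `rank_statements`, the data layout, and this particular grouping rule are all illustrative assumptions, not the authors' published method.

```python
from typing import List, Tuple


def rank_statements(
    coverage: List[List[int]], failed: List[bool]
) -> List[Tuple[int, float, int]]:
    """Rank statements from most to least suspicious.

    coverage[t][s] is 1 if test t executed statement s, else 0.
    failed[t] is True if test t failed.
    Returns tuples (statement_index, score, failed_cover_count).
    """
    n_statements = len(coverage[0]) if coverage else 0
    ranked = []
    for s in range(n_statements):
        # Number of tests (passed or failed) that executed statement s.
        executed = sum(row[s] for row in coverage)
        # Number of failed tests that executed statement s.
        failed_cover = sum(row[s] for row, f in zip(coverage, failed) if f)
        # Empirical estimate of P(fail | statement s executed).
        score = failed_cover / executed if executed else 0.0
        ranked.append((s, score, failed_cover))
    # Assumed grouping rule: group by failed-test count (descending),
    # then order within a group by the conditional probability.
    ranked.sort(key=lambda r: (-r[2], -r[1], r[0]))
    return ranked


if __name__ == "__main__":
    # Hypothetical spectrum: 4 tests (rows) over 4 statements (columns).
    coverage = [
        [1, 1, 0, 1],  # test 0, passed
        [1, 0, 1, 1],  # test 1, failed
        [1, 1, 1, 0],  # test 2, failed
        [1, 1, 0, 0],  # test 3, passed
    ]
    failed = [False, True, True, False]
    for stmt, score, n_failed in rank_statements(coverage, failed):
        print(f"statement {stmt}: P(fail|exec)={score:.2f}, failed tests={n_failed}")
```

In this toy spectrum, statement 2 is executed only by failing tests and so ranks first. Statements 0 and 3 have the same conditional probability (0.5), and the grouping rule separates them by placing statement 0, which both failed tests execute, ahead of statement 3.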
Related papers
- Exogenous Matching: Learning Good Proposals for Tractable Counterfactual Estimation [1.9662978733004601]
We propose an importance sampling method for tractable and efficient estimation of counterfactual expressions.
By minimizing a common upper bound of counterfactual estimators, we transform the variance minimization problem into a conditional distribution learning problem.
We validate the theoretical results through experiments under various types and settings of Structural Causal Models (SCMs) and demonstrate improved performance on counterfactual estimation tasks.
arXiv Detail & Related papers (2024-10-17T03:08:28Z) - Pretraining Data Detection for Large Language Models: A Divergence-based Calibration Method [108.56493934296687]
We introduce a divergence-based calibration method, inspired by the divergence-from-randomness concept, to calibrate token probabilities for pretraining data detection.
We have developed a Chinese-language benchmark, PatentMIA, to assess the performance of detection approaches for LLMs on Chinese text.
arXiv Detail & Related papers (2024-09-23T07:55:35Z) - DRAUC: An Instance-wise Distributionally Robust AUC Optimization Framework [133.26230331320963]
Area Under the ROC Curve (AUC) is a widely employed metric in long-tailed classification scenarios.
We propose an instance-wise surrogate loss of Distributionally Robust AUC (DRAUC) and build our optimization framework on top of it.
arXiv Detail & Related papers (2023-11-06T12:15:57Z) - Observation-Guided Diffusion Probabilistic Models [41.749374023639156]
We propose a novel diffusion-based image generation method called the observation-guided diffusion probabilistic model (OGDM).
Our approach reestablishes the training objective by integrating the guidance of the observation process with the Markov chain.
We demonstrate the effectiveness of our training algorithm using diverse inference techniques on strong diffusion model baselines.
arXiv Detail & Related papers (2023-10-06T06:29:06Z) - Benchmarking Bayesian Causal Discovery Methods for Downstream Treatment Effect Estimation [137.3520153445413]
A notable gap exists in the evaluation of causal discovery methods, where insufficient emphasis is placed on downstream inference.
We evaluate seven established baseline causal discovery methods including a newly proposed method based on GFlowNets.
The results of our study demonstrate that some of the algorithms studied are able to effectively capture a wide range of useful and diverse ATE modes.
arXiv Detail & Related papers (2023-07-11T02:58:10Z) - B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under Hidden Confounding [51.74479522965712]
We propose a meta-learner called the B-Learner, which can efficiently learn sharp bounds on the CATE function under limits on hidden confounding.
We prove its estimates are valid, sharp, efficient, and have a quasi-oracle property with respect to the constituent estimators under more general conditions than existing methods.
arXiv Detail & Related papers (2023-04-20T18:07:19Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - Test case prioritization using test case diversification and fault-proneness estimations [0.0]
We propose an approach for TCP that takes into account test case coverage data, bug history, and test case diversification.
The diversification of test cases is preserved by incorporating fault-proneness into a clustering-based scheme.
The experiments show that the proposed methods are superior to coverage-based TCP methods.
arXiv Detail & Related papers (2021-06-19T15:55:24Z) - Scalable Personalised Item Ranking through Parametric Density Estimation [53.44830012414444]
Learning from implicit feedback is challenging because of the difficult nature of the one-class problem.
Most conventional methods use a pairwise ranking approach and negative samplers to cope with the one-class problem.
We propose a learning-to-rank approach, which achieves convergence speed comparable to the pointwise counterpart.
arXiv Detail & Related papers (2021-05-11T03:38:16Z) - Selective Probabilistic Classifier Based on Hypothesis Testing [14.695979686066066]
We propose a simple yet effective method to deal with the violation of the Closed-World Assumption for a classifier.
The proposed method is a rejection option based on hypothesis testing with probabilistic networks.
It is shown that the proposed method can achieve a broader range of operation and cover a lower False Positive Ratio than the alternative.
arXiv Detail & Related papers (2021-05-09T08:55:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.