Related papers: Effective Fault Localization using Probabilistic and Grouping Approach

URL: http://arxiv.org/abs/2403.05022v1
Date: Fri, 8 Mar 2024 03:55:09 GMT
Title: Effective Fault Localization using Probabilistic and Grouping Approach
Authors: Saksham Sahai Srivastava, Arpita Dutta, Rajib Mall
Abstract summary: The aim of this paper is to use the concept of conditional probability to design an effective fault localization technique. We present a fault localization technique that derives the association between statement coverage information and test case execution results. We evaluate the effectiveness of the proposed method on eleven open-source data sets.
Score: 0.7673339435080445
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Context: Fault localization (FL) is a key activity in debugging a program. Any improvement to this activity leads to a significant improvement in total software development cost. There is an inherent link between the program spectrum and test execution results. Conditional probability in statistics captures the probability of one event occurring in relation to one or more other events. Objectives: The aim of this paper is to use the concept of conditional probability to design an effective fault localization technique. Methods: In the paper, we present a fault localization technique that derives the association between statement coverage information and test case execution results using conditional probability statistics. The strength of a statement's association with failed test case results indicates the probability that the statement contains the fault. Subsequently, we use a grouping method to refine the obtained statement ranking sequence for better fault localization. Results: We evaluated the effectiveness of the proposed method on eleven open-source data sets. Our results show that, on average, the proposed CGFL method is 24.56% more effective than other contemporary fault localization methods such as D*, Tarantula, Ochiai, Crosstab, BPNN, RBFNN, DNN, and CNN. Conclusion: We devised an effective fault localization technique by combining the conditional probabilistic method with a failed test case execution-based approach. Our experimental evaluation shows that the proposed method outperforms existing fault localization techniques.
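
Illustrative sketch (not the authors' CGFL implementation): the abstract describes scoring each statement by a conditional probability derived from coverage and test outcomes, then refining the ranking with a grouping step. The Python below shows one simple way such a scheme could look. The function name `rank_statements`, the score P(test fails | statement executed), and the three-group rule (covered by all, some, or none of the failed tests) are assumptions made for illustration; the abstract does not specify the exact formula or grouping criterion.

```python
from typing import List, Sequence, Tuple


def rank_statements(
    coverage: Sequence[Sequence[int]],
    failed: Sequence[bool],
) -> List[Tuple[int, float]]:
    """Return (statement index, score) pairs, most suspicious first.

    coverage[t][s] is 1 if test t executed statement s, else 0.
    failed[t] is True if test t failed.
    """
    n_statements = len(coverage[0])
    total_failed = sum(failed)
    keyed = []
    for s in range(n_statements):
        covering = [t for t, row in enumerate(coverage) if row[s]]
        failed_covering = sum(1 for t in covering if failed[t])
        # Estimated P(test fails | statement s executed); 0.0 if never executed.
        score = failed_covering / len(covering) if covering else 0.0
        # Hypothetical grouping rule (an assumption, not taken from the paper):
        # 0 = executed by every failed test, 1 = by some, 2 = by none.
        if total_failed and failed_covering == total_failed:
            group = 0
        elif failed_covering > 0:
            group = 1
        else:
            group = 2
        keyed.append((group, -score, s))
    # Sort by group first, then by descending score, then by statement index.
    keyed.sort()
    return [(s, -neg_score) for _, neg_score, s in keyed]


# Toy spectrum: 4 tests (rows) x 4 statements (columns); tests 1 and 2 fail.
coverage = [
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
]
failed = [False, True, True, False]
ranking = rank_statements(coverage, failed)
print(ranking)
```

In this toy spectrum, statement 2 is executed only by the two failing tests, so it ranks first. Statement 0 is executed by every test: it shares the top group because both failing tests cover it, but its lower conditional probability places it second.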

Related papers

SmartFL: Semantics Based Probabilistic Fault Localization [15.481820762877897]
Testing-based fault localization has been a research focus in software engineering in the past decades. It is crucial to model program semantics in fault localization approaches. Our key idea is: by modeling only the correctness of program values but not their full semantics, a balance could be reached between effectiveness and scalability.
arXiv Detail & Related papers (2025-03-29T21:00:51Z)
Rectifying Conformity Scores for Better Conditional Coverage [75.73184036344908]
We present a new method for generating confidence sets within the split conformal prediction framework. Our method performs a trainable transformation of any given conformity score to improve conditional coverage while ensuring exact marginal coverage.
arXiv Detail & Related papers (2025-02-22T19:54:14Z)
Exogenous Matching: Learning Good Proposals for Tractable Counterfactual Estimation [1.9662978733004601]
We propose an importance sampling method for tractable and efficient estimation of counterfactual expressions. By minimizing a common upper bound of counterfactual estimators, we transform the variance minimization problem into a conditional distribution learning problem. We validate the theoretical results through experiments under various types and settings of Structural Causal Models (SCMs) and demonstrate superior performance on counterfactual estimation tasks.
arXiv Detail & Related papers (2024-10-17T03:08:28Z)
Pretraining Data Detection for Large Language Models: A Divergence-based Calibration Method [108.56493934296687]
We introduce a divergence-based calibration method, inspired by the divergence-from-randomness concept, to calibrate token probabilities for pretraining data detection. We have developed a Chinese-language benchmark, PatentMIA, to assess the performance of detection approaches for LLMs on Chinese text.
arXiv Detail & Related papers (2024-09-23T07:55:35Z)
DRAUC: An Instance-wise Distributionally Robust AUC Optimization Framework [133.26230331320963]
Area Under the ROC Curve (AUC) is a widely employed metric in long-tailed classification scenarios. We propose an instance-wise surrogate loss of Distributionally Robust AUC (DRAUC) and build our optimization framework on top of it.
arXiv Detail & Related papers (2023-11-06T12:15:57Z)
Observation-Guided Diffusion Probabilistic Models [41.749374023639156]
We propose a novel diffusion-based image generation method called the observation-guided diffusion probabilistic model (OGDM). Our approach reestablishes the training objective by integrating the guidance of the observation process with the Markov chain. We demonstrate the effectiveness of our training algorithm using diverse inference techniques on strong diffusion model baselines.
arXiv Detail & Related papers (2023-10-06T06:29:06Z)
Benchmarking Bayesian Causal Discovery Methods for Downstream Treatment Effect Estimation [137.3520153445413]
A notable gap exists in the evaluation of causal discovery methods, where insufficient emphasis is placed on downstream inference. We evaluate seven established baseline causal discovery methods including a newly proposed method based on GFlowNets. The results of our study demonstrate that some of the algorithms studied are able to effectively capture a wide range of useful and diverse ATE modes.
arXiv Detail & Related papers (2023-07-11T02:58:10Z)
B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under Hidden Confounding [51.74479522965712]
We propose a meta-learner called the B-Learner, which can efficiently learn sharp bounds on the CATE function under limits on hidden confounding. We prove its estimates are valid, sharp, efficient, and have a quasi-oracle property with respect to the constituent estimators under more general conditions than existing methods.
arXiv Detail & Related papers (2023-04-20T18:07:19Z)
Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions. In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data. We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
Test case prioritization using test case diversification and fault-proneness estimations [0.0]
We propose an approach for test case prioritization (TCP) that takes into account test case coverage data, bug history, and test case diversification. The diversification of test cases is preserved by incorporating fault-proneness into a clustering-based scheme. The experiments show that the proposed methods are superior to coverage-based TCP methods.
arXiv Detail & Related papers (2021-06-19T15:55:24Z)
Scalable Personalised Item Ranking through Parametric Density Estimation [53.44830012414444]
Learning from implicit feedback is challenging because of the difficult nature of the one-class problem. Most conventional methods use a pairwise ranking approach and negative samplers to cope with the one-class problem. We propose a learning-to-rank approach, which achieves convergence speed comparable to the pointwise counterpart.
arXiv Detail & Related papers (2021-05-11T03:38:16Z)
Selective Probabilistic Classifier Based on Hypothesis Testing [14.695979686066066]
We propose a simple yet effective method to deal with the violation of the Closed-World Assumption for a classifier. The proposed method is a rejection option based on hypothesis testing with probabilistic networks. It is shown that the proposed method can achieve a broader range of operation and cover a lower False Positive Ratio than the alternative.
arXiv Detail & Related papers (2021-05-09T08:55:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.