Learning Test-Mutant Relationship for Accurate Fault Localisation
- URL: http://arxiv.org/abs/2306.02319v1
- Date: Sun, 4 Jun 2023 10:09:38 GMT
- Title: Learning Test-Mutant Relationship for Accurate Fault Localisation
- Authors: Jinhan Kim, Gabin An, Robert Feldt, Shin Yoo
- Abstract summary: Automated fault localisation aims to assist developers in identifying the root cause of the fault by narrowing down the space of likely fault locations.
Several Mutation Based Fault Localisation (MBFL) techniques have been proposed to automatically locate faults.
Despite their success, existing MBFL techniques suffer from the cost of performing mutation analysis after the fault is observed.
This paper proposes a new MBFL technique called SIMFL, which exploits ahead-of-time mutation analysis to localise current faults.
- Score: 16.080629795085322
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Context: Automated fault localisation aims to assist developers in the task
of identifying the root cause of the fault by narrowing down the space of
likely fault locations. By simulating variants of the faulty program, called
mutants, several Mutation Based Fault Localisation (MBFL) techniques have been
proposed to automatically locate faults. Despite their success, existing MBFL
techniques suffer from the cost of performing mutation analysis after the fault
is observed. Method: To overcome this shortcoming, we propose a new MBFL
technique named SIMFL (Statistical Inference for Mutation-based Fault
Localisation). SIMFL localises faults based on the past results of mutation
analysis that has been done on the earlier version in the project history,
allowing developers to make predictions on the location of incoming faults in a
just-in-time manner. Using several statistical inference methods, SIMFL models
the relationship between test results of the mutants and their locations, and
subsequently infers the location of the current faults. Results: The empirical
study on Defects4J dataset shows that SIMFL can localise 113 faults on the
first rank out of 224 faults, outperforming other MBFL techniques. Even when
SIMFL is trained on the predicted kill matrix, SIMFL can still localise 95
faults on the first rank out of 194 faults. Moreover, removing redundant
mutants significantly improves the localisation accuracy of SIMFL, increasing
the number of faults localised at the first rank by up to 51. Conclusion: This
paper proposes
a new MBFL technique called SIMFL, which exploits ahead-of-time mutation
analysis to localise current faults. SIMFL is not only cost-effective, as it
does not need a mutation analysis after the fault is observed, but also capable
of localising faults accurately.
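The abstract's core idea, ranking program locations by how well the historical test failures of mutants at each location match the currently failing tests, can be sketched as follows. The kill matrix, mutant locations, and Jaccard-based scoring below are illustrative assumptions for a minimal sketch, not the paper's actual statistical inference model:

```python
# Illustrative sketch in the spirit of SIMFL: use a kill matrix from past
# mutation analysis to rank locations for a new, unseen fault.
# The similarity score (Jaccard) is an assumption, not the paper's method.

def localise(kill_matrix, mutant_locations, failing_tests):
    """kill_matrix: mutant id -> set of tests that killed it (past analysis).
    mutant_locations: mutant id -> code location (e.g. file:line).
    failing_tests: set of tests failing for the current fault.
    Returns locations ranked by suspiciousness (highest first)."""
    scores = {}
    for mutant, killed_by in kill_matrix.items():
        loc = mutant_locations[mutant]
        union = killed_by | failing_tests
        inter = killed_by & failing_tests
        jaccard = len(inter) / len(union) if union else 0.0
        # A location is as suspicious as its best-matching mutant.
        scores[loc] = max(scores.get(loc, 0.0), jaccard)
    return sorted(scores, key=scores.get, reverse=True)

kill_matrix = {"m1": {"t1", "t2"}, "m2": {"t3"}, "m3": {"t1"}}
locations = {"m1": "Calc.java:10", "m2": "Calc.java:25", "m3": "Util.java:7"}
ranking = localise(kill_matrix, locations, failing_tests={"t1", "t2"})
print(ranking[0])  # Calc.java:10 -- its mutants best match the failing tests
```

Because the kill matrix comes from an earlier version, no mutation analysis is needed after the fault is observed, which is the cost advantage the abstract claims.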
Related papers
- Demystifying Faulty Code with LLM: Step-by-Step Reasoning for Explainable Fault Localization [5.7821087202452]
This study investigates the step-by-step reasoning for explainable fault localization.
We created a dataset of faulty code files, along with explanations for 600 faulty lines.
We found that for 22 out of the 30 randomly sampled cases, FuseFL generated correct explanations.
arXiv Detail & Related papers (2024-03-15T17:47:20Z)
- Large Language Models for Test-Free Fault Localization [11.080712737595174]
We propose a language model based fault localization approach that locates buggy lines of code without any test coverage information.
We fine-tune language models with 350 million, 6 billion, and 16 billion parameters on small, manually curated corpora of buggy programs.
Our empirical evaluation shows that LLMAO improves the Top-1 results over the state-of-the-art machine learning fault localization (MLFL) baselines by 2.3%-54.4%, and Top-5 results by 14.4%-35.6%.
arXiv Detail & Related papers (2023-10-03T01:26:39Z)
- Large Language Models in Fault Localisation [32.87044163543427]
This paper investigates the capability of ChatGPT-3.5 and ChatGPT-4, the two state-of-the-art LLMs, on fault localisation.
Within function-level context, ChatGPT-4 outperforms all the existing fault localisation methods.
However, when the code context of the Defects4J dataset expands to the class-level, ChatGPT-4's performance suffers a significant drop.
arXiv Detail & Related papers (2023-08-29T13:07:27Z)
- Improving Spectrum-Based Localization of Multiple Faults by Iterative Test Suite Reduction [0.30458514384586394]
We present FLITSR, a novel SBFL extension that improves the localization of a given base metric in the presence of multiple faults.
For all three spectrum types we consistently see substantial reductions of the average wasted efforts at different fault levels, of 30%-90% over the best base metric.
For the method-level real faults, FLITSR also substantially outperforms GRACE, a state-of-the-art learning-based fault localizer.
arXiv Detail & Related papers (2023-06-16T15:00:40Z)
- Understanding How Consistency Works in Federated Learning via Stage-wise Relaxed Initialization [84.42306265220274]
Federated learning (FL) is a distributed paradigm that coordinates massive local clients to collaboratively train a global model.
Previous works have implicitly studied that FL suffers from the "client-drift" problem, which is caused by the inconsistent optimum across local clients.
To alleviate the negative impact of the "client drift" and explore its substance in FL, we first design an efficient FL algorithm, FedInit.
arXiv Detail & Related papers (2023-06-09T06:55:15Z)
- Adaptive Self-supervision Algorithms for Physics-informed Neural Networks [59.822151945132525]
Physics-informed neural networks (PINNs) incorporate physical knowledge from the problem domain as a soft constraint on the loss function.
We study the impact of the location of the collocation points on the trainability of these models.
We propose a novel adaptive collocation scheme which progressively allocates more collocation points to areas where the model is making higher errors.
arXiv Detail & Related papers (2022-07-08T18:17:06Z)
- Fast and Accurate Error Simulation for CNNs against Soft Errors [64.54260986994163]
We present a framework for the reliability analysis of Convolutional Neural Networks (CNNs) via an error simulation engine.
These error models are defined based on the corruption patterns of the output of the CNN operators induced by faults.
We show that our methodology achieves about 99% accuracy of the fault effects w.r.t. SASSIFI, and a speedup ranging from 44x up to 63x w.r.t. FI, which only implements a limited set of error models.
arXiv Detail & Related papers (2022-06-04T19:45:02Z)
- Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction [49.25830718574892]
We present a new framework named Tail-to-Tail (TtT) non-autoregressive sequence prediction.
It builds on the observation that most tokens are correct and can be conveyed directly from source to target, while the error positions can be estimated and corrected.
Experimental results on standard datasets, especially on the variable-length datasets, demonstrate the effectiveness of TtT in terms of sentence-level Accuracy, Precision, Recall, and F1-Measure.
arXiv Detail & Related papers (2021-06-03T05:56:57Z)
- Bayesian Federated Learning over Wireless Networks [87.37301441859925]
Federated learning is a privacy-preserving and distributed training method using heterogeneous data sets stored at local devices.
This paper presents an efficient modified BFL algorithm called scalableBFL (SBFL).
arXiv Detail & Related papers (2020-12-31T07:32:44Z)
- Delay Minimization for Federated Learning Over Wireless Communication Networks [172.42768672943365]
The problem of delay computation for federated learning (FL) over wireless communication networks is investigated.
A bisection search algorithm is proposed to obtain the optimal solution.
Simulation results show that the proposed algorithm can reduce delay by up to 27.3% compared to conventional FL methods.
arXiv Detail & Related papers (2020-07-05T19:00:07Z)
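The bisection search mentioned in the last entry can be sketched generically: find the smallest feasible value of a monotone feasibility predicate. The predicate below is a placeholder assumption, since the paper's actual wireless FL delay model is not given here:

```python
# Generic bisection search of the kind used for delay minimisation:
# find the smallest x in [lo, hi] for which feasible(x) holds, assuming
# feasibility is monotone in x. The example predicate is an assumption,
# not the paper's delay model.

def bisect_min(feasible, lo, hi, tol=1e-6):
    """Smallest x in [lo, hi] with feasible(x) True, up to tolerance tol."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if feasible(mid):
            hi = mid          # mid is feasible; try a tighter bound
        else:
            lo = mid          # mid is infeasible; relax the bound
    return hi

# Example: minimal delay budget that still fits 5 communication rounds
# at 0.8 s per round, so the answer approaches 4.0.
delay = bisect_min(lambda d: d >= 5 * 0.8, lo=0.0, hi=10.0)
print(round(delay, 3))  # 4.0
```

Monotonicity (any budget above the optimum stays feasible) is what makes halving the interval sound, and it yields logarithmic convergence in the interval width.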
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.