Debugging Flaky Tests using Spectrum-based Fault Localization
- URL: http://arxiv.org/abs/2305.04735v1
- Date: Mon, 8 May 2023 14:40:05 GMT
- Title: Debugging Flaky Tests using Spectrum-based Fault Localization
- Authors: Martin Gruber, Gordon Fraser
- Abstract summary: Flaky tests hamper regression testing as they destroy trust and waste computational and human resources.
We introduce SFFL (Spectrum-based Flaky Fault localization), an extension of traditional coverage-based SFL.
An evaluation on 101 flaky tests taken from 48 open-source Python projects demonstrates that SFFL is effective.
- Score: 14.609208863749831
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Non-deterministically behaving (i.e., flaky) tests hamper regression testing
as they destroy trust and waste computational and human resources. Eradicating
flakiness in test suites is therefore an important goal, but automated
debugging tools are needed to support developers when trying to understand the
causes of flakiness. A popular example for an automated approach to support
regular debugging is spectrum-based fault localization (SFL), a technique that
identifies software components that are most likely the causes of failures.
While it is possible to also apply SFL for locating likely sources of flakiness
in code, unfortunately the flakiness makes SFL both imprecise and
non-deterministic. In this paper we introduce SFFL (Spectrum-based Flaky Fault
Localization), an extension of traditional coverage-based SFL that exploits our
observation that 80% of flaky tests exhibit varying coverage behavior between
different runs. By distinguishing between stable and flaky coverage, SFFL is
able to locate the sources of flakiness more precisely and keeps the
localization itself deterministic. An evaluation on 101 flaky tests taken from
48 open-source Python projects demonstrates that SFFL is effective: Of five
prominent SFL formulas, DStar, Ochiai, and Op2 yield the best overall
performance. On average, they are able to narrow down the fault's location to
3.5% of the project's code base, which is 18.7% better than traditional SFL
(for DStar). SFFL's effectiveness, however, depends on the root causes of
flakiness: The source of non-order-dependent flaky tests can be located far
more precisely than order-dependent faults.
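The abstract names three SFL suspiciousness formulas (DStar, Ochiai, Op2) and SFFL's key idea of separating stable from flaky coverage. A minimal sketch of how such scores are computed from a coverage spectrum is shown below; the toy `runs` data, function names, and the simple stable/flaky split are illustrative assumptions, not SFFL's actual implementation.

```python
# Sketch of spectrum-based suspiciousness scoring. For each code line we
# tally e_f/e_p (covered in failing/passing runs) and n_f/n_p (not covered),
# then rank lines with a suspiciousness formula.

def spectrum(runs):
    """Tally (e_f, e_p, n_f, n_p) per line from (covered_lines, passed) runs."""
    total_fail = sum(1 for _, passed in runs if not passed)
    total_pass = len(runs) - total_fail
    lines = set().union(*(cov for cov, _ in runs))
    stats = {}
    for line in lines:
        e_f = sum(1 for cov, passed in runs if line in cov and not passed)
        e_p = sum(1 for cov, passed in runs if line in cov and passed)
        stats[line] = (e_f, e_p, total_fail - e_f, total_pass - e_p)
    return stats

def ochiai(e_f, e_p, n_f, n_p):
    denom = ((e_f + n_f) * (e_f + e_p)) ** 0.5
    return e_f / denom if denom else 0.0

def dstar(e_f, e_p, n_f, n_p, star=2):
    denom = e_p + n_f
    return e_f ** star / denom if denom else float("inf")

def op2(e_f, e_p, n_f, n_p):
    return e_f - e_p / (e_p + n_p + 1)

def split_coverage(runs):
    """Illustration of the stable/flaky distinction: lines covered in every
    run are 'stable'; lines covered in only some runs are 'flaky'."""
    covs = [cov for cov, _ in runs]
    stable = set.intersection(*covs)
    return stable, set.union(*covs) - stable

# Two failing runs and one passing run of a flaky test: line 3 is covered
# only in the failing runs, so it ranks as most suspicious under Ochiai.
runs = [({1, 2, 3}, False), ({1, 3}, False), ({1, 2}, True)]
stats = spectrum(runs)
ranking = sorted(stats, key=lambda l: ochiai(*stats[l]), reverse=True)
print(ranking[0])  # line 3 ranks highest
```

Note how the varying coverage the paper observes in 80% of flaky tests is exactly what `split_coverage` exposes: lines 2 and 3 are covered in only some runs, which is the signal SFFL exploits to keep localization precise and deterministic.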
Related papers
- FuseFL: One-Shot Federated Learning through the Lens of Causality with Progressive Model Fusion [48.90879664138855]
One-shot Federated Learning (OFL) significantly reduces communication costs in FL by aggregating trained models only once.
However, the performance of advanced OFL methods lags far behind that of normal FL.
We propose a novel learning approach to endow OFL with superb performance and low communication and storage costs, termed FuseFL.
arXiv Detail & Related papers (2024-10-27T09:07:10Z)
- Leveraging Stack Traces for Spectrum-based Fault Localization in the Absence of Failing Tests [44.13331329339185]
We introduce a new approach, SBEST, that integrates stack trace data with test coverage to enhance fault localization.
Our approach shows a significant improvement, increasing Mean Average Precision (MAP) by 32.22% and Mean Reciprocal Rank (MRR) by 17.43% over traditional stack trace ranking methods.
arXiv Detail & Related papers (2024-05-01T15:15:52Z)
- A Generic Approach to Fix Test Flakiness in Real-World Projects [7.122378689356857]
FlakyDoctor is a neuro-symbolic technique that combines the power of LLMs (generalizability) and program analysis (soundness) to fix different types of test flakiness.
Compared to three alternative flakiness repair approaches, FlakyDoctor can repair 8% more ID tests than DexFix, 12% more OD flaky tests than OD, and 17% more OD flaky tests than iFixFlakies.
arXiv Detail & Related papers (2024-04-15T01:07:57Z)
- A Quantitative and Qualitative Evaluation of LLM-Based Explainable Fault Localization [12.80414941523501]
AutoFL generates an explanation of the bug along with a suggested fault location.
Experiments on 798 real-world bugs in Java and Python reveal AutoFL improves method-level acc@1 by up to 233.3% over baselines.
arXiv Detail & Related papers (2023-08-10T10:26:55Z)
- Improving Spectrum-Based Localization of Multiple Faults by Iterative Test Suite Reduction [0.30458514384586394]
We present FLITSR, a novel SBFL extension that improves the localization of a given base metric in the presence of multiple faults.
For all three spectrum types, we consistently see substantial reductions in the average wasted effort at different fault levels, of 30%-90% over the best base metric.
For the method-level real faults, FLITSR also substantially outperforms GRACE, a state-of-the-art learning-based fault localizer.
arXiv Detail & Related papers (2023-06-16T15:00:40Z)
- Revisiting Personalized Federated Learning: Robustness Against Backdoor Attacks [53.81129518924231]
We conduct the first study of backdoor attacks in the pFL framework.
We show that pFL methods with partial model-sharing can significantly boost robustness against backdoor attacks.
We propose a lightweight defense method, Simple-Tuning, which empirically improves defense performance against backdoor attacks.
arXiv Detail & Related papers (2023-02-03T11:58:14Z)
- Rethinking Normalization Methods in Federated Learning [92.25845185724424]
Federated learning (FL) is a popular distributed learning framework that can reduce privacy risks by not explicitly sharing private data.
We show that external covariate shifts will lead to the obliteration of some devices' contributions to the global model.
arXiv Detail & Related papers (2022-10-07T01:32:24Z)
- DistFL: Distribution-aware Federated Learning for Mobile Scenarios [14.638070213182655]
Federated learning (FL) has emerged as an effective solution to decentralized and privacy-preserving machine learning for mobile clients.
We propose DistFL, a novel framework to achieve automated and accurate Distribution-aware Federated Learning.
arXiv Detail & Related papers (2021-10-22T06:58:48Z)
- Improving Semi-supervised Federated Learning by Reducing the Gradient Diversity of Models [67.66144604972052]
Federated learning (FL) is a promising way to use the computing power of mobile devices while maintaining privacy of users.
We show that a critical issue that affects the test accuracy is the large gradient diversity of the models from different users.
We propose a novel grouping-based model averaging method to replace the FedAvg averaging method.
arXiv Detail & Related papers (2020-08-26T03:36:07Z)
- Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection [85.53263670166304]
One-stage detectors formulate object detection as dense classification and localization.
Recent trend for one-stage detectors is to introduce an individual prediction branch to estimate the quality of localization.
This paper delves into the representations of the above three fundamental elements: quality estimation, classification and localization.
arXiv Detail & Related papers (2020-06-08T07:24:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.