The Effects of Computational Resources on Flaky Tests
- URL: http://arxiv.org/abs/2310.12132v1
- Date: Wed, 18 Oct 2023 17:42:58 GMT
- Title: The Effects of Computational Resources on Flaky Tests
- Authors: Denini Silva, Martin Gruber, Satyajit Gokhale, Ellen Arteca, Alexi
Turcotte, Marcelo d'Amorim, Wing Lam, Stefan Winter, and Jonathan Bell
- Abstract summary: Flaky tests are tests that nondeterministically pass and fail in unchanged code.
We find that 46.5% of the flaky tests studied are Resource-Affected Flaky Tests (RAFT), indicating that a substantial proportion of flaky-test failures can be avoided by adjusting the resources available when running tests.
- Score: 9.694460778355925
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Flaky tests are tests that nondeterministically pass and fail in unchanged
code. These tests can be detrimental to developers' productivity. Particularly
when tests run in continuous integration environments, the tests may be
competing for access to limited computational resources (CPUs, memory, etc.),
and we hypothesize that resource (in)availability may be a significant factor
in the failure rate of flaky tests. We present the first assessment of the
impact that computational resources have on flaky tests, including a total of
52 projects written in Java, JavaScript and Python, and 27 different resource
configurations. Using a rigorous statistical methodology, we determine which
tests are RAFT (Resource-Affected Flaky Tests). We find that 46.5% of the flaky
tests in our dataset are RAFT, indicating that a substantial proportion of
flaky-test failures can be avoided by adjusting the resources available when
running tests. We reported RAFTs, along with configurations that avoid them, to
developers, and received interest in either fixing the RAFTs or improving the
projects' specifications so that tests would run only in configurations that
are unlikely to encounter RAFT failures. Our results also have implications
for researchers attempting to detect flaky tests, e.g., reducing the resources
available when running tests is a cost-effective approach to detecting more
flaky failures.
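As a rough illustration of how resource availability can be varied in practice, below is a minimal sketch (not the paper's methodology or tooling) that reruns a Dockerized test suite under a few CPU/memory limits and flags it as a RAFT candidate when it is stable with generous resources but fails under constrained ones. The image name, test command, limits, and rerun count are illustrative placeholders; the actual study uses 27 resource configurations and a rigorous per-test statistical analysis.

```python
import subprocess
from itertools import product

# Hypothetical placeholders: substitute a real test image and test command for your project.
IMAGE = "my-project-test-image"
TEST_CMD = ["mvn", "-q", "test"]

# Illustrative resource configurations, from generous to constrained
# (the paper evaluates 27 configurations; these CPU/memory limits are examples only).
CPUS = ["4", "1", "0.5"]
MEMORY = ["4g", "1g", "512m"]
RERUNS = 5  # rerun each configuration to expose nondeterministic outcomes


def run_once(cpus: str, memory: str) -> bool:
    """Run the test suite once under the given Docker resource limits; True means it passed."""
    cmd = ["docker", "run", "--rm",
           f"--cpus={cpus}", f"--memory={memory}",
           IMAGE, *TEST_CMD]
    return subprocess.run(cmd, capture_output=True).returncode == 0


results = {}
for cpus, memory in product(CPUS, MEMORY):
    outcomes = [run_once(cpus, memory) for _ in range(RERUNS)]
    results[(cpus, memory)] = outcomes
    print(f"cpus={cpus} mem={memory}: {sum(outcomes)}/{RERUNS} passing runs")

# RAFT candidate: stable under the most generous configuration,
# but failing at least once under some constrained configuration.
baseline = results[(CPUS[0], MEMORY[0])]
raft_candidate = all(baseline) and any(
    not all(outcomes) for key, outcomes in results.items() if key != (CPUS[0], MEMORY[0])
)
print("RAFT candidate:", raft_candidate)
```

This sketch works at suite granularity for brevity; identifying individual RAFTs would require recording per-test outcomes in each run.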
Related papers
- Do Test and Environmental Complexity Increase Flakiness? An Empirical Study of SAP HANA [47.29324864511411]
Flaky tests fail seemingly at random without changes to the code.
We study characteristics of tests and the test environment that potentially impact test flakiness.
arXiv Detail & Related papers (2024-09-16T07:52:09Z)
- Taming Timeout Flakiness: An Empirical Study of SAP HANA [47.29324864511411]
Flaky tests negatively affect regression testing because they result in test failures that are not necessarily caused by code changes.
Test timeouts are one contributing factor to such flaky test failures.
The test flakiness rate ranges from 49% to 70%, depending on the number of repeated test executions.
arXiv Detail & Related papers (2024-02-07T20:01:41Z)
- 230,439 Test Failures Later: An Empirical Evaluation of Flaky Failure Classifiers [9.45325012281881]
Flaky tests are tests that can non-deterministically pass or fail, even in the absence of code changes.
How can one quickly determine whether a test failed due to flakiness or because it detected a bug? (See the rerun-based sketch after this list.)
arXiv Detail & Related papers (2024-01-28T22:36:30Z)
- Precise Error Rates for Computationally Efficient Testing [75.63895690909241]
We revisit the question of simple-versus-simple hypothesis testing with an eye towards computational complexity.
An existing test based on linear spectral statistics achieves the best possible tradeoff curve between type I and type II error rates.
arXiv Detail & Related papers (2023-11-01T04:41:16Z)
- Do Automatic Test Generation Tools Generate Flaky Tests? [12.813573907094074]
The prevalence and nature of flaky tests produced by test generation tools remain largely unknown.
We generate tests using EvoSuite (Java) and Pynguin (Python) and execute each test 200 times.
Our results show that flakiness is at least as common in generated tests as in developer-written tests.
arXiv Detail & Related papers (2023-10-08T16:44:27Z)
- FlaPy: Mining Flaky Python Tests at Scale [14.609208863749831]
FlaPy is a framework for researchers to mine flaky tests in a given or automatically sampled set of Python projects by rerunning their test suites.
FlaPy isolates the test executions using containerization and fresh execution environments to simulate real-world CI conditions.
FlaPy supports parallelizing the test executions using SLURM, making it feasible to scan thousands of projects for test flakiness.
arXiv Detail & Related papers (2023-05-08T15:48:57Z)
- Sequential Kernelized Independence Testing [101.22966794822084]
We design sequential kernelized independence tests inspired by kernelized dependence measures.
We demonstrate the power of our approaches on both simulated and real data.
arXiv Detail & Related papers (2022-12-14T18:08:42Z)
- SUPERNOVA: Automating Test Selection and Defect Prevention in AAA Video Games Using Risk Based Testing and Machine Learning [62.997667081978825]
Testing video games is an increasingly difficult task as traditional methods fail to scale with growing software systems.
We present SUPERNOVA, a system responsible for test selection and defect prevention while also functioning as an automation hub.
The direct impact of this has been a reduction of 55% or more in testing hours for an undisclosed sports game title.
arXiv Detail & Related papers (2022-03-10T00:47:46Z)
- Noisy Adaptive Group Testing using Bayesian Sequential Experimental Design [63.48989885374238]
When the infection prevalence of a disease is low, Dorfman showed 80 years ago that testing groups of people can prove more efficient than testing people individually.
Our goal in this paper is to propose new group testing algorithms that can operate in a noisy setting.
arXiv Detail & Related papers (2020-04-26T23:41:33Z)
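Several of the related studies above rely on the same rerun-based baseline: execute a test repeatedly on unchanged code and compare outcomes (the generated-test study, for instance, runs each test 200 times). The sketch below is a minimal illustration of that baseline using pytest; the test id is a hypothetical placeholder, and this is not the classification approach evaluated in the "230,439 Test Failures Later" paper.

```python
import subprocess

# Hypothetical placeholder: replace with a real pytest node id from your project.
TEST_ID = "tests/test_example.py::test_flaky_candidate"
RERUNS = 200  # mirrors the 200 executions per test used in the generated-test study

outcomes = []
for _ in range(RERUNS):
    # Each rerun invokes pytest in a fresh process on unchanged code.
    completed = subprocess.run(["pytest", "-q", TEST_ID], capture_output=True)
    outcomes.append(completed.returncode == 0)

passes, failures = outcomes.count(True), outcomes.count(False)
if passes and failures:
    print(f"flaky: {passes} passes and {failures} failures on unchanged code")
elif failures:
    print("consistent failure: more likely a real (or at least deterministic) bug")
else:
    print("consistently passing: no flakiness observed under these conditions")
```

Rerunning is simple but expensive; the RAFT results above suggest that constraining resources during such reruns can surface more flaky failures for the same budget.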