Reduction of Test Re-runs by Prioritizing Potential Order Dependent Flaky Tests
- URL: http://arxiv.org/abs/2510.26171v1
- Date: Thu, 30 Oct 2025 06:17:30 GMT
- Title: Reduction of Test Re-runs by Prioritizing Potential Order Dependent Flaky Tests
- Authors: Hasnain Iqbal, Zerina Begum, Kazi Sakib
- Abstract summary: Flaky tests can make automated software testing unreliable due to their unpredictable behavior. A common type of flaky test is the order-dependent (OD) test. We propose a method to prioritize potential OD tests.
- Score: 0.5798758080057375
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Flaky tests can make automated software testing unreliable due to their unpredictable behavior. These tests can pass or fail across multiple runs of the same code base. However, flaky tests often do not indicate any fault, even though they can cause the continuous integration (CI) pipeline to fail. A common type of flaky test is the order-dependent (OD) test. The outcome of an OD test depends on the order in which it is run with respect to other test cases. Several studies have explored the detection and repair of OD tests. However, their methods require re-running tests multiple times, including tests that are unrelated to order dependence. Hence, prioritizing potential OD tests is necessary to reduce the re-runs. In this paper, we propose a method to prioritize potential order-dependent tests. By analyzing shared static fields in test classes, we identify tests that are more likely to be order-dependent. In our experiment on 27 project modules, our method successfully prioritized all OD tests in 23 cases, reducing test executions by an average of 65.92% and unnecessary re-runs by 72.19%. These results demonstrate that our approach significantly improves the efficiency of OD test detection by lowering execution costs.
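The abstract does not spell out how shared static fields are turned into a priority order, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a prior static analysis has already produced, for each test, the static fields it reads and writes. The function name `prioritize_by_shared_static_fields`, the `reads`/`writes` input format, the scoring rule, and the test and field names in the example are all hypothetical.

```python
from collections import defaultdict


def prioritize_by_shared_static_fields(field_access):
    """Rank tests by how many other tests share static state with them.

    field_access maps a test name to a dict with two sets, "reads" and
    "writes", holding static field names (hypothetical input format).
    Tests that share no written field with another test are dropped.
    """
    writers = defaultdict(set)
    readers = defaultdict(set)
    for test, access in field_access.items():
        for field in access.get("writes", ()):
            writers[field].add(test)
        for field in access.get("reads", ()):
            readers[field].add(test)

    scores = {}
    for test, access in field_access.items():
        partners = set()
        # A test reading a field may be affected by other tests writing it.
        for field in access.get("reads", ()):
            partners |= writers[field] - {test}
        # A test writing a field may affect other tests that read or write it.
        for field in access.get("writes", ()):
            partners |= (readers[field] | writers[field]) - {test}
        scores[test] = len(partners)

    suspects = [t for t in field_access if scores[t] > 0]
    # Highest score first; the name breaks ties so the order is deterministic.
    suspects.sort(key=lambda t: (-scores[t], t))
    return suspects


# Hypothetical test names and static fields, for illustration only.
example = {
    "CacheTest.testPut": {"reads": set(), "writes": {"Cache.INSTANCE"}},
    "CacheTest.testGet": {"reads": {"Cache.INSTANCE"}, "writes": set()},
    "CacheTest.testClear": {"reads": set(), "writes": {"Cache.INSTANCE"}},
    "ConfigTest.testDefault": {"reads": {"Config.MODE"}, "writes": set()},
    "ConfigTest.testOverride": {"reads": set(), "writes": {"Config.MODE"}},
    "MathTest.testAdd": {"reads": set(), "writes": set()},
}
ranked = prioritize_by_shared_static_fields(example)
print(ranked)
```

In this sketch only the ranked tests would be passed to a reordering-based OD detector first, and `MathTest.testAdd`, which touches no static state, would be skipped. Two tests that only read the same field do not count as partners, since reads alone cannot pollute shared state.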
Related papers
- JS-TOD: Detecting Order-Dependent Flaky Tests in Jest [5.178246622041266]
JS-TOD is a tool that can extract, reorder, and rerun Jest tests to reveal possible order-dependent test flakiness. Test order dependency is one of the leading causes of test flakiness.
arXiv Detail & Related papers (2025-08-30T11:44:14Z)
- Studying the Impact of Early Test Termination Due to Assertion Failure on Code Coverage and Spectrum-based Fault Localization [48.22524837906857]
This study is the first empirical study on early test termination due to assertion failure. We investigated 207 versions of 6 open-source projects. Our findings indicate that early test termination harms both code coverage and the effectiveness of spectrum-based fault localization.
arXiv Detail & Related papers (2025-04-06T17:14:09Z)
- Do Test and Environmental Complexity Increase Flakiness? An Empirical Study of SAP HANA [47.29324864511411]
Flaky tests fail seemingly at random without changes to the code.
We study characteristics of tests and the test environment that potentially impact test flakiness.
arXiv Detail & Related papers (2024-09-16T07:52:09Z)
- STAMP: Outlier-Aware Test-Time Adaptation with Stable Memory Replay [76.06127233986663]
Test-time adaptation (TTA) aims to address the distribution shift between the training and test data with only unlabeled data at test time.
This paper addresses the problem of performing both sample recognition and outlier rejection during inference when outliers exist.
We propose a new approach called STAble Memory rePlay (STAMP), which performs optimization over a stable memory bank instead of the risky mini-batch.
arXiv Detail & Related papers (2024-07-22T16:25:41Z)
- Taming Timeout Flakiness: An Empirical Study of SAP HANA [47.29324864511411]
Flaky tests negatively affect regression testing because they result in test failures that are not necessarily caused by code changes.
Test timeouts are one contributing factor to such flaky test failures.
Test flakiness rate ranges from 49% to 70%, depending on the number of repeated test executions.
arXiv Detail & Related papers (2024-02-07T20:01:41Z)
- Precise Error Rates for Computationally Efficient Testing [67.30044609837749]
We revisit the question of simple-versus-simple hypothesis testing with an eye towards computational complexity. An existing test based on linear spectral statistics achieves the best possible tradeoff curve between type I and type II error rates.
arXiv Detail & Related papers (2023-11-01T04:41:16Z)
- The Effects of Computational Resources on Flaky Tests [9.694460778355925]
Flaky tests are tests that nondeterministically pass and fail in unchanged code.
Resource-Affected Flaky Tests indicate that a substantial proportion of flaky-test failures can be avoided by adjusting the resources available when running tests.
arXiv Detail & Related papers (2023-10-18T17:42:58Z)
- Sequential Kernelized Independence Testing [77.237958592189]
We design sequential kernelized independence tests inspired by kernelized dependence measures. We demonstrate the power of our approaches on both simulated and real data.
arXiv Detail & Related papers (2022-12-14T18:08:42Z)
- Test2Vec: An Execution Trace Embedding for Test Case Prioritization [12.624724734296342]
Execution traces of test cases can be a good alternative to abstract their behavior for automated testing tasks.
We propose a novel embedding approach, Test2Vec, that maps test execution traces to a latent space.
Results show that our proposed TP improves best alternatives by 41.80% in terms of the median normalized rank of the first failing test case.
arXiv Detail & Related papers (2022-06-28T20:38:36Z)
- DeepOrder: Deep Learning for Test Case Prioritization in Continuous Integration Testing [6.767885381740952]
This work introduces DeepOrder, a deep learning-based model that works on the basis of regression machine learning.
DeepOrder ranks test cases based on the historical record of test executions from any number of previous test cycles.
We experimentally show that deep neural networks, as a simple regression model, can be efficiently used for test case prioritization in continuous integration testing.
arXiv Detail & Related papers (2021-10-14T15:10:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.