WEFix: Intelligent Automatic Generation of Explicit Waits for Efficient
Web End-to-End Flaky Tests
- URL: http://arxiv.org/abs/2402.09745v1
- Date: Thu, 15 Feb 2024 06:51:53 GMT
- Authors: Xinyue Liu, Zihe Song, Weike Fang, Wei Yang, Weihang Wang
- Abstract summary: We propose WEFix, a technique that can automatically generate fix code for UI-based flakiness in web e2e testing.
We evaluate the effectiveness and efficiency of WEFix against 122 web e2e flaky tests from seven popular real-world projects.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Web end-to-end (e2e) testing evaluates the workflow of a web application. It
simulates real-world user scenarios to ensure the application flows behave as
expected. However, web e2e tests are notorious for being flaky, i.e., the tests
can produce inconsistent results despite no changes to the code. One common
type of flakiness is caused by nondeterministic execution orders between the
test code and the client-side code under test. In particular, UI-based
flakiness emerges as a notably prevalent and challenging issue to fix because
the test code has limited knowledge about the client-side code execution. In
this paper, we propose WEFix, a technique that can automatically generate fix
code for UI-based flakiness in web e2e testing. The core of our approach is to
leverage browser UI changes to predict the client-side code execution and
generate proper wait oracles. We evaluate the effectiveness and efficiency of
WEFix against 122 web e2e flaky tests from seven popular real-world projects.
Our results show that WEFix dramatically reduces the runtime overhead (from
3.7$\times$ to 1.25$\times$) while achieving high correctness (98%).
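To make the idea of an explicit wait concrete, here is a minimal sketch of a condition-based wait that polls for a UI change rather than sleeping for a fixed time. It is not WEFix's implementation or generated fix code: `FakePage`, `wait_until`, `run_test_with_explicit_wait`, the `#result` selector, and all timing values are hypothetical names and numbers chosen for illustration, and the "browser" is simulated with a timer thread so the example runs without a real web driver.

```python
import threading
import time


class FakePage:
    """Hypothetical stand-in for a browser page whose UI updates asynchronously."""

    def __init__(self):
        self._lock = threading.Lock()
        self._elements = set()

    def add_element_later(self, selector, delay):
        # Simulates client-side code that renders an element after `delay` seconds.
        timer = threading.Timer(delay, self._add, args=(selector,))
        timer.daemon = True
        timer.start()

    def _add(self, selector):
        with self._lock:
            self._elements.add(selector)

    def has_element(self, selector):
        with self._lock:
            return selector in self._elements


def wait_until(condition, timeout=2.0, poll_interval=0.01):
    """Explicit wait: poll `condition` until it holds or `timeout` seconds pass.

    Returns True as soon as the condition holds, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


def run_test_with_explicit_wait():
    """Test step that waits for the expected UI change instead of a fixed sleep."""
    page = FakePage()
    page.add_element_later("#result", delay=0.05)  # assumed rendering delay
    started = time.monotonic()
    appeared = wait_until(lambda: page.has_element("#result"))
    elapsed = time.monotonic() - started
    return appeared, elapsed


if __name__ == "__main__":
    appeared, elapsed = run_test_with_explicit_wait()
    print(f"element appeared: {appeared}, waited {elapsed:.3f}s")
```

The wait returns as soon as the expected element is present, so the test pays only for the time the UI actually needs; a fixed sleep would have to be sized for the worst case and adds that cost to every run.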
Related papers
- AutoPT: How Far Are We from the End2End Automated Web Penetration Testing? (2024-11-02T13:24:30Z)
We introduce AutoPT, an automated penetration testing agent based on the principle of PSM driven by LLMs.
Our results show that AutoPT outperforms the baseline framework ReAct on the GPT-4o mini model.
- TestGenEval: A Real World Unit Test Generation and Test Completion Benchmark (2024-10-01T14:47:05Z)
TestGenEval comprises 68,647 tests from 1,210 code and test file pairs across 11 well-maintained Python repositories.
It covers initial test authoring, test suite completion, and code coverage improvements.
We evaluate several popular models, with sizes ranging from 7B to 405B parameters.
- Taming Timeout Flakiness: An Empirical Study of SAP HANA (2024-02-07T20:01:41Z)
Flaky tests negatively affect regression testing because they result in test failures that are not necessarily caused by code changes.
Test timeouts are one contributing factor to such flaky test failures.
The test flakiness rate ranges from 49% to 70%, depending on the number of repeated test executions.
- Do Automatic Test Generation Tools Generate Flaky Tests? (2023-10-08T16:44:27Z)
The prevalence and nature of flaky tests produced by test generation tools remain largely unknown.
We generate tests using EvoSuite (Java) and Pynguin (Python) and execute each test 200 times.
Our results show that flakiness is at least as common in generated tests as in developer-written tests.
- Towards Automatic Generation of Amplified Regression Test Oracles (2023-07-28T12:38:44Z)
We propose a test oracle derivation approach to amplify regression test oracles.
The approach monitors the object state during test execution and compares it to the previous version to detect any changes in relation to the SUT's intended behaviour.
- Neural Embeddings for Web Testing (2023-06-12T19:59:36Z)
Existing crawlers rely on app-specific, threshold-based algorithms to assess state equivalence.
We propose WEBEMBED, a novel abstraction function based on neural network embeddings and threshold-free classifiers.
Our evaluation on nine web apps shows that WEBEMBED outperforms state-of-the-art techniques by detecting near-duplicates more accurately.
- Time-based Repair for Asynchronous Wait Flaky Tests in Web Testing (2023-05-15T12:17:30Z)
Asynchronous waits are one of the most prevalent root causes of flaky tests in web applications.
We propose TRaf, an automated time-based repair method for asynchronous wait flaky tests.
Our analysis shows that TRaf can suggest a shorter wait time to resolve the test flakiness compared to developer-written fixes.
- MT3: Meta Test-Time Training for Self-Supervised Test-Time Adaption (2021-03-30T09:33:38Z)
An unresolved problem in Deep Learning is the ability of neural networks to cope with domain shifts at test time.
We combine meta-learning, self-supervision and test-time training to learn to adapt to unseen test distributions.
Our approach significantly improves the state-of-the-art results on the CIFAR-10-Corrupted image classification benchmark.
- Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework (2020-02-05T10:25:02Z)
A/B testing is a business strategy to compare a new product with an old one in pharmaceutical, technological, and traditional industries.
This paper introduces a reinforcement learning framework for carrying out A/B testing in online experiments.