GenIA-E2ETest: A Generative AI-Based Approach for End-to-End Test Automation
- URL: http://arxiv.org/abs/2510.01024v1
- Date: Wed, 01 Oct 2025 15:30:24 GMT
- Title: GenIA-E2ETest: A Generative AI-Based Approach for End-to-End Test Automation
- Authors: Elvis Júnior, Alan Valejo, Jorge Valverde-Rebaza, Vânia de Oliveira Neves
- Abstract summary: This paper introduces GenIA-E2ETest, which leverages generative AI to automatically generate E2E test scripts from natural language descriptions. We evaluated the approach on two web applications, assessing completeness, correctness, adaptation effort, and robustness.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Software testing is essential to ensure system quality, but it remains time-consuming and error-prone when performed manually. Although recent advances in Large Language Models (LLMs) have enabled automated test generation, most existing solutions focus on unit testing and do not address the challenges of end-to-end (E2E) testing, which validates complete application workflows from user input to final system response. This paper introduces GenIA-E2ETest, which leverages generative AI to automatically generate executable E2E test scripts from natural language descriptions. We evaluated the approach on two web applications, assessing completeness, correctness, adaptation effort, and robustness. Results were encouraging: the scripts achieved an average of 77% for both element metrics, 82% for execution precision, 85% for execution recall, required minimal manual adjustments (average manual modification rate of 10%), and showed consistent performance in typical web scenarios. Although some sensitivity to context-dependent navigation and dynamic content was observed, the findings suggest that GenIA-E2ETest is a practical and effective solution to accelerate E2E test automation from natural language, reducing manual effort and broadening access to automated testing.
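The abstract describes the core idea (natural-language scenario in, executable E2E script out) without pipeline details. A minimal sketch of that general pattern, not the paper's actual implementation: numbered natural-language steps are folded into a generation prompt, and any LLM client callable returns the script text. All names and the prompt wording are hypothetical; the model call is stubbed.

```python
# Illustrative sketch (not GenIA-E2ETest's actual pipeline): turn a
# natural-language E2E scenario into a prompt for a generative model,
# which would return an executable test script.

SCRIPT_TEMPLATE = """You are an E2E test generator.
Convert the scenario below into an executable Selenium (Python) test script.
Scenario:
{scenario}
Return only runnable Python code."""

def build_prompt(scenario_steps):
    """Join numbered natural-language steps into a single generation prompt."""
    scenario = "\n".join(f"{i}. {step}" for i, step in enumerate(scenario_steps, 1))
    return SCRIPT_TEMPLATE.format(scenario=scenario)

def generate_e2e_script(scenario_steps, llm_call):
    """llm_call is any callable prompt -> text (e.g. a wrapper around an LLM API)."""
    return llm_call(build_prompt(scenario_steps))

if __name__ == "__main__":
    steps = [
        "Open the login page",
        "Fill in username 'alice' and password 'secret'",
        "Click 'Sign in' and assert the dashboard is shown",
    ]
    # Stub model for demonstration; a real deployment would call an LLM API here.
    print(generate_e2e_script(steps, lambda prompt: "# <generated script here>"))
```

The reported 10% manual modification rate suggests the generated script is treated as a draft that a tester reviews before adding it to the suite.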
Related papers
- Agentic RAG for Software Testing with Hybrid Vector-Graph and Multi-Agent Orchestration
We present an approach to software testing automation using Agentic Retrieval-Augmented Generation (RAG) systems for Quality Engineering (QE) artifact creation. We combine autonomous AI agents with hybrid vector-graph knowledge systems to automate test plan, case, and QE metric generation.
arXiv Detail & Related papers (2025-10-12T22:25:15Z)
- TestAgent: An Adaptive and Intelligent Expert for Human Assessment
We propose TestAgent, a large language model (LLM)-powered agent designed to enhance adaptive testing through interactive engagement. TestAgent supports personalized question selection, captures test-takers' responses and anomalies, and provides precise outcomes through dynamic, conversational interactions.
arXiv Detail & Related papers (2025-06-03T16:07:54Z)
- Automated Web Application Testing: End-to-End Test Case Generation with Large Language Models and Screen Transition Graphs
This paper presents an automated system for generating test cases for two key aspects of web application testing: site navigation and form filling. For site navigation, the system employs screen transition graphs and LLMs to model navigation flows and generate test scenarios. For form filling, it uses state graphs to handle conditional forms and automates Selenium script generation.
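The screen-transition-graph idea above can be sketched concretely: pages are nodes, user actions are edges, and each simple path from the entry page to a target page becomes a candidate navigation scenario. This is a toy illustration under my own assumptions, not the cited system; the page names are hypothetical.

```python
# Sketch: derive navigation test scenarios from a screen transition graph.
# Each simple path from the entry screen to the goal screen is one scenario.

def navigation_scenarios(graph, start, goal, path=None):
    """Enumerate simple paths (lists of screens) from start to goal via DFS."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    scenarios = []
    for nxt in graph.get(start, []):
        if nxt not in path:  # skip already-visited screens to keep paths simple
            scenarios.extend(navigation_scenarios(graph, nxt, goal, path))
    return scenarios

# Hypothetical web app: keys are screens, values are screens reachable in one action.
transitions = {
    "home": ["login", "search"],
    "login": ["dashboard"],
    "search": ["results"],
    "results": ["dashboard"],
}

for scenario in navigation_scenarios(transitions, "home", "dashboard"):
    print(" -> ".join(scenario))
```

In the cited system an LLM would then flesh each path out into concrete steps (selectors, inputs, assertions); here only the graph traversal is shown.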
arXiv Detail & Related papers (2025-06-03T07:08:21Z)
- Acceptance Test Generation with Large Language Models: An Industrial Case Study
Large language model (LLM)-powered assistants are increasingly used for generating program code and unit tests. This paper explores the use of LLMs for generating executable acceptance tests for web applications through a two-step process. This two-step approach supports acceptance test-driven development, enhances tester control, and improves test quality.
arXiv Detail & Related papers (2025-04-09T19:33:38Z)
- The BrowserGym Ecosystem for Web Agent Research
The BrowserGym ecosystem addresses the growing need for efficient evaluation and benchmarking of web agents. We propose an extended BrowserGym-based ecosystem for web agent research, which unifies existing benchmarks from the literature. We conduct the first large-scale, multi-benchmark web agent experiment and compare the performance of 6 state-of-the-art LLMs across 6 popular web agent benchmarks.
arXiv Detail & Related papers (2024-12-06T23:43:59Z)
- AutoPT: How Far Are We from the End2End Automated Web Penetration Testing?
We introduce AutoPT, an automated penetration testing agent based on the principle of PSM driven by LLMs.
Our results show that AutoPT outperforms the baseline framework ReAct on the GPT-4o mini model.
arXiv Detail & Related papers (2024-11-02T13:24:30Z)
- The Future of Software Testing: AI-Powered Test Case Generation and Validation
This paper explores the transformative potential of AI in improving test case generation and validation. It focuses on AI's ability to enhance efficiency, accuracy, and scalability in testing processes. It also addresses key challenges associated with adapting AI for testing, including the need for high-quality training data.
arXiv Detail & Related papers (2024-09-09T17:12:40Z)
- Feature-Driven End-To-End Test Generation
AutoE2E is a novel approach to automate the generation of semantically meaningful, feature-driven E2E test cases for web applications. E2EBench is a new benchmark for automatically assessing the feature coverage of E2E test suites.
arXiv Detail & Related papers (2024-08-04T01:16:04Z)
- Better Practices for Domain Adaptation
Domain adaptation (DA) aims to provide frameworks for adapting models to deployment data without using labels.
An unclear validation protocol for DA has led to bad practices in the literature.
We show challenges across all three branches of domain adaptation methodology.
arXiv Detail & Related papers (2023-09-07T17:44:18Z)
- Neural Embeddings for Web Testing
Existing crawlers rely on app-specific, threshold-based, algorithms to assess state equivalence.
We propose WEBEMBED, a novel abstraction function based on neural network embeddings and threshold-free classifiers.
Our evaluation on nine web apps shows that WEBEMBED outperforms state-of-the-art techniques by detecting near-duplicates more accurately.
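To make the embedding comparison concrete: in WEBEMBED the embeddings come from a neural network and near-duplicate decisions come from a trained, threshold-free classifier; the toy sketch below shows only the underlying similarity computation on hand-made vectors, with all values invented for illustration.

```python
# Toy sketch of embedding-based page-state comparison (not WEBEMBED itself):
# two renderings of the same page should have near-identical embeddings,
# while a structurally different page should not.
import math

def cosine(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

page_a = [0.9, 0.1, 0.4]    # hypothetical embedding of a product page
page_b = [0.88, 0.12, 0.41]  # the same page after a minor dynamic change
page_c = [0.1, 0.95, 0.2]    # a structurally different page

print(cosine(page_a, page_b))  # close to 1.0 -> likely near-duplicates
print(cosine(page_a, page_c))  # noticeably lower -> distinct states
```

Replacing a hand-tuned similarity cutoff with a learned classifier over such embedding distances is what makes the approach threshold-free.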
arXiv Detail & Related papers (2023-06-12T19:59:36Z)
- Listen, Adapt, Better WER: Source-free Single-utterance Test-time Adaptation for Automatic Speech Recognition
Test-time Adaptation aims to adapt the model trained on source domains to yield better predictions for test samples.
Single-Utterance Test-time Adaptation (SUTA) is, to the best of our knowledge, the first TTA study in the speech area.
arXiv Detail & Related papers (2022-03-27T06:38:39Z)
- SUPERNOVA: Automating Test Selection and Defect Prevention in AAA Video Games Using Risk Based Testing and Machine Learning
Testing video games is an increasingly difficult task as traditional methods fail to scale with growing software systems.
We present SUPERNOVA, a system responsible for test selection and defect prevention while also functioning as an automation hub.
The direct impact of this has been a reduction of 55% or more in testing hours for an undisclosed sports game title.
arXiv Detail & Related papers (2022-03-10T00:47:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.