Automated Web Application Testing: End-to-End Test Case Generation with Large Language Models and Screen Transition Graphs
- URL: http://arxiv.org/abs/2506.02529v1
- Date: Tue, 03 Jun 2025 07:08:21 GMT
- Title: Automated Web Application Testing: End-to-End Test Case Generation with Large Language Models and Screen Transition Graphs
- Authors: Nguyen-Khang Le, Quan Minh Bui, Minh Ngoc Nguyen, Hiep Nguyen, Trung Vo, Son T. Luu, Shoshin Nomura, Minh Le Nguyen,
- Abstract summary: This paper presents an automated system for generating test cases for two key aspects of web application testing: site navigation and form filling.<n>For site navigation, the system employs screen transition graphs and LLMs to model navigation flows and generate test scenarios.<n>For form filling, it uses state graphs to handle conditional forms and automates Selenium script generation.
- Score: 0.5965410190046627
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Web applications are critical to modern software ecosystems, yet ensuring their reliability remains challenging due to the complexity and dynamic nature of web interfaces. Recent advances in large language models (LLMs) have shown promise in automating complex tasks, but limitations persist in handling dynamic navigation flows and complex form interactions. This paper presents an automated system for generating test cases for two key aspects of web application testing: site navigation and form filling. For site navigation, the system employs screen transition graphs and LLMs to model navigation flows and generate test scenarios. For form filling, it uses state graphs to handle conditional forms and automates Selenium script generation. Key contributions include: (1) a novel integration of graph structures and LLMs for site navigation testing, (2) a state graph-based approach for automating form-filling test cases, and (3) a comprehensive dataset for evaluating form-interaction testing. Experimental results demonstrate the system's effectiveness in improving test coverage and robustness, advancing the state of web application testing.
Related papers
- MobileGUI-RL: Advancing Mobile GUI Agent through Reinforcement Learning in Online Environment [63.62778707277929]
MobileGUI-RL is a scalable framework that trains GUI agent in online environment.<n>It synthesizes a curriculum of learnable tasks through self-exploration and filtering.<n>It adapts GRPO to GUI navigation with trajectory-aware advantages and composite rewards.
arXiv Detail & Related papers (2025-07-08T07:07:53Z) - FormFactory: An Interactive Benchmarking Suite for Multimodal Form-Filling Agents [36.11725924594441]
Current online form filling tools are largely rule-based and lack generalizable, generative capabilities.<n>We propose FormFactory, an interactive benchmarking suite comprising a web-based interface, backend evaluation module, and dataset.<n>Our benchmark covers diverse real-world scenarios, incorporates various field formats, and simulates high-fidelity form interactions.
arXiv Detail & Related papers (2025-06-02T10:34:57Z) - GUI Testing Arena: A Unified Benchmark for Advancing Autonomous GUI Testing Agent [24.97846085313314]
We propose a formalized and comprehensive environment to evaluate the entire process of automated GUI Testing.<n>We divide the testing process into three key subtasks: test intention generation, test task execution, and GUI defect detection.<n>It evaluates the performance of different models using three data types: real mobile applications, mobile applications with artificially injected defects, and synthetic data.
arXiv Detail & Related papers (2024-12-24T13:41:47Z) - AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials [53.376263056033046]
Existing approaches rely on expensive human annotation, making them unsustainable at scale.<n>We propose AgentTrek, a scalable data synthesis pipeline that generates web agent trajectories by leveraging publicly available tutorials.<n>Our fully automated approach significantly reduces data collection costs, achieving a cost of just $0.55 per high-quality trajectory without human annotators.
arXiv Detail & Related papers (2024-12-12T18:59:27Z) - Leveraging Large Vision Language Model For Better Automatic Web GUI Testing [7.480576630392405]
This paper proposes VETL, the first LVLM-driven endtoend web testing technique.
With LVLM's scene understanding capabilities, VETL can generate valid and meaningful text inputs focusing on the local context.
The selection of associated GUI elements is formulated as a visual question-answering problem, allowing LVLM to capture the logical connection between the input box and the relevant element.
arXiv Detail & Related papers (2024-10-16T01:37:58Z) - Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? [73.81908518992161]
We introduce Spider2-V, the first multimodal agent benchmark focusing on professional data science and engineering.
Spider2-V features real-world tasks in authentic computer environments and incorporating 20 enterprise-level professional applications.
These tasks evaluate the ability of a multimodal agent to perform data-related tasks by writing code and managing the GUI in enterprise data software systems.
arXiv Detail & Related papers (2024-07-15T17:54:37Z) - Automating REST API Postman Test Cases Using LLM [0.0]
This research paper is dedicated to the exploration and implementation of an automated approach to generate test cases using Large Language Models.
The methodology integrates the use of Open AI to enhance the efficiency and effectiveness of test case generation.
The model that is developed during the research is trained using manually collected postman test cases or instances for various Rest APIs.
arXiv Detail & Related papers (2024-04-16T15:53:41Z) - Design2Code: Benchmarking Multimodal Code Generation for Automated Front-End Engineering [74.99736967448423]
We construct Design2Code - the first real-world benchmark for this task.<n>We manually curate 484 diverse real-world webpages as test cases and develop a set of automatic evaluation metrics.<n>Our fine-grained break-down metrics indicate that models mostly lag in recalling visual elements from the input webpages and generating correct layout designs.
arXiv Detail & Related papers (2024-03-05T17:56:27Z) - Semantic Constraint Inference for Web Form Test Generation [6.0759036120654315]
We introduce an innovative approach, called FormNexus, for automated web form test generation.
FormNexus emphasizes deriving semantic insights from individual form elements and relations among them.
We show that FormNexus combined with GPT-4 achieves 89% coverage in form submission states.
arXiv Detail & Related papers (2024-02-01T19:10:05Z) - Fast-Slow Test-Time Adaptation for Online Vision-and-Language Navigation [67.18144414660681]
We propose a Fast-Slow Test-Time Adaptation (FSTTA) approach for online Vision-and-Language Navigation (VLN)
Our method obtains impressive performance gains on four popular benchmarks.
arXiv Detail & Related papers (2023-11-22T07:47:39Z) - Neural Embeddings for Web Testing [49.66745368789056]
Existing crawlers rely on app-specific, threshold-based, algorithms to assess state equivalence.
We propose WEBEMBED, a novel abstraction function based on neural network embeddings and threshold-free classifiers.
Our evaluation on nine web apps shows that WEBEMBED outperforms state-of-the-art techniques by detecting near-duplicates more accurately.
arXiv Detail & Related papers (2023-06-12T19:59:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.