A Tool for Test Case Scenarios Generation Using Large Language Models
- URL: http://arxiv.org/abs/2406.07021v1
- Date: Tue, 11 Jun 2024 07:26:13 GMT
- Title: A Tool for Test Case Scenarios Generation Using Large Language Models
- Authors: Abdul Malik Sami, Zeeshan Rasheed, Muhammad Waseem, Zheying Zhang, Tomas Herda, Pekka Abrahamsson
- Abstract summary: This article centers on generating user requirements as epics and high-level user stories.
It introduces a web-based software tool that employs an LLM-based agent and prompt engineering to automate the generation of test case scenarios.
- Score: 3.9422957660677476
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) are widely used in Software Engineering (SE) for various tasks, including generating code, designing and documenting software, adding code comments, reviewing code, and writing test scripts. However, creating test scripts or automating test cases demands test suite documentation that comprehensively covers functional requirements. Such documentation must enable thorough testing within a constrained scope and timeframe, particularly as requirements and user demands evolve. This article centers on generating user requirements as epics and high-level user stories and crafting test case scenarios based on these stories. It introduces a web-based software tool that employs an LLM-based agent and prompt engineering to automate the generation of test case scenarios against user requirements.
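To illustrate the kind of prompt-engineering workflow the abstract describes, the sketch below generates test case scenarios from a single high-level user story. It is a minimal sketch assuming the OpenAI Python SDK's chat-completions interface; the model name, prompt wording, and function names are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: prompt-engineered generation of test case scenarios
# from one high-level user story. Prompt wording, model name, and helper
# names are illustrative assumptions, not the authors' tool.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM_PROMPT = (
    "You are a software test analyst. Given a user story, produce numbered "
    "test case scenarios covering the functional requirement, including "
    "positive, negative, and boundary cases. For each scenario, give a "
    "title, preconditions, steps, and the expected result."
)

def generate_test_scenarios(user_story: str, model: str = "gpt-4o") -> str:
    """Return plain-text test case scenarios for one user story."""
    response = client.chat.completions.create(
        model=model,
        temperature=0.2,  # low temperature keeps the scenario list repeatable
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"User story:\n{user_story}"},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    story = (
        "As a registered customer, I want to reset my password via email "
        "so that I can regain access to my account."
    )
    print(generate_test_scenarios(story))
```

In practice, the generated scenarios would be reviewed by a tester and traced back to the originating epic or user story before being turned into executable test scripts.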
Related papers
- CLOVER: A Test Case Generation Benchmark with Coverage, Long-Context, and Verification [71.34070740261072]
This paper presents a benchmark, CLOVER, to evaluate models' capabilities in generating and completing test cases.
The benchmark is containerized for code execution across tasks, and we will release the code, data, and construction methodologies.
arXiv Detail & Related papers (2025-02-12T21:42:56Z)
- System Test Case Design from Requirements Specifications: Insights and Challenges of Using ChatGPT [1.9282110216621835]
This paper explores the effectiveness of using Large Language Models (LLMs) to generate test case designs from Software Requirements Specification (SRS) documents.
About 87 percent of the generated test cases were valid, with the remaining 13 percent either not applicable or redundant.
arXiv Detail & Related papers (2024-12-04T20:12:27Z)
- Commit0: Library Generation from Scratch [77.38414688148006]
Commit0 is a benchmark that challenges AI agents to write libraries from scratch.
Agents are provided with a specification document outlining the library's API as well as a suite of interactive unit tests.
Commit0 also offers an interactive environment where models receive static analysis and execution feedback on the code they generate.
arXiv Detail & Related papers (2024-12-02T18:11:30Z)
- CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Through Corpus Retrieval and Augmentation [51.2289822267563]
We propose Corpus Retrieval and Augmentation for Fine-Tuning (CRAFT), a method for generating synthetic datasets.
We use large-scale public web-crawled corpora and similarity-based document retrieval to find other relevant human-written documents.
We demonstrate that CRAFT can efficiently generate large-scale task-specific training datasets for four diverse tasks.
arXiv Detail & Related papers (2024-09-03T17:54:40Z)
- Automatic benchmarking of large multimodal models via iterative experiment programming [71.78089106671581]
We present APEx, the first framework for automatic benchmarking of LMMs.
Given a research question expressed in natural language, APEx leverages a large language model (LLM) and a library of pre-specified tools to generate a set of experiments for the model at hand and progressively compiles a scientific report.
The report drives the testing procedure: based on the current status of the investigation, APEx chooses which experiments to perform and whether the results are sufficient to draw conclusions.
arXiv Detail & Related papers (2024-06-18T06:43:46Z)
- Automated Control Logic Test Case Generation using Large Language Models [13.273872261029608]
We propose a novel approach for the automatic generation of PLC test cases that queries a Large Language Model (LLM).
Experiments with ten open-source function blocks from the OSCAT automation library showed that the approach is fast, easy to use, and can yield test cases with high statement coverage for low-to-medium complex programs.
arXiv Detail & Related papers (2024-05-03T06:09:21Z)
- Generating Test Scenarios from NL Requirements using Retrieval-Augmented LLMs: An Industrial Study [5.179738379203527]
This paper presents an automated approach (RAGTAG) for test scenario generation using Retrieval-Augmented Generation (RAG) with Large Language Models (LLMs).
We evaluate RAGTAG on two industrial projects from Austrian Post with bilingual requirements in German and English.
arXiv Detail & Related papers (2024-04-19T10:27:40Z)
- Are We Testing or Being Tested? Exploring the Practical Applications of Large Language Models in Software Testing [0.0]
A Large Language Model (LLM) is a cutting-edge artificial intelligence model that generates coherent content.
LLMs can play a pivotal role in software development, including software testing.
This study explores the practical application of LLMs in software testing within an industrial setting.
arXiv Detail & Related papers (2023-12-08T06:30:37Z)
- LLM for Test Script Generation and Migration: Challenges, Capabilities, and Opportunities [8.504639288314063]
Test script generation is a vital component of software testing, enabling efficient and reliable automation of repetitive test tasks.
Existing generation approaches often encounter limitations, such as difficulties in accurately capturing and reproducing test scripts across diverse devices, platforms, and applications.
This paper investigates the application of large language models (LLM) in the domain of mobile application test script generation.
arXiv Detail & Related papers (2023-09-24T07:58:57Z)
- FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios [87.12753459582116]
A wider range of tasks now faces an increasing risk of containing factual errors when handled by generative models.
We propose FacTool, a task and domain agnostic framework for detecting factual errors of texts generated by large language models.
arXiv Detail & Related papers (2023-07-25T14:20:51Z)
- Realistic simulation of users for IT systems in cyber ranges [63.20765930558542]
We instrument each machine by means of an external agent to generate user activity.
This agent combines deterministic and deep-learning-based methods to adapt to different environments.
We also propose conditional text generation models to facilitate the creation of conversations and documents.
arXiv Detail & Related papers (2021-11-23T10:53:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.