Related papers: GenAI-based test case generation and execution in SDV platform

URL: http://arxiv.org/abs/2509.05112v1
Date: Fri, 05 Sep 2025 13:50:26 GMT
Title: GenAI-based test case generation and execution in SDV platform
Authors: Denesa Zyberaj, Lukasz Mazur, Nenad Petrovic, Pankhuri Verma, Pascal Hirmer, Dirk Slama, Xiangwei Cheng, Alois Knoll
Abstract summary: This paper introduces a GenAI-driven approach for automated test case generation. We leverage Large Language Models and Vision-Language Models to translate natural language requirements and system diagrams into structured Gherkin test cases. The methodology integrates Vehicle Signal Specification modeling to standardize vehicle signal definitions.
Score: 21.748869011323134
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper introduces a GenAI-driven approach for automated test case generation, leveraging Large Language Models and Vision-Language Models to translate natural language requirements and system diagrams into structured Gherkin test cases. The methodology integrates Vehicle Signal Specification modeling to standardize vehicle signal definitions, improve compatibility across automotive subsystems, and streamline integration with third-party testing tools. Generated test cases are executed within the digital.auto playground, an open and vendor-neutral environment designed to facilitate rapid validation of software-defined vehicle functionalities. We evaluate our approach using the Child Presence Detection System use case, demonstrating substantial reductions in manual test specification effort and rapid execution of generated tests. Despite significant automation, the generation of test cases and test scripts still requires manual intervention due to current limitations in the GenAI pipeline and constraints of the digital.auto platform.

Related papers

Req2Road: A GenAI Pipeline for SDV Test Artifact Generation and On-Vehicle Execution [24.305511228249486]
Large Language Models and Vision-Language Models are used to extract signals and behavioral logic. The pipeline uses retrieval-augmented generation to preselect candidate VSS signals before mapping. This paper is a feasibility and architectural demonstration of an end-to-end requirements-to-test pipeline for SDV subsystems.
arXiv Detail & Related papers (2026-02-17T14:03:35Z)
GenAI for Automotive Software Development: From Requirements to Wheels [3.2821049498759094]
This paper introduces a GenAI-empowered approach to automated development of automotive software. The process starts with requirements as input, while the main generated outputs are test scenario code for a simulation environment. Our approach aims at shorter compliance and re-engineering cycles, as well as reduced development and testing time for ADAS-related capabilities.
arXiv Detail & Related papers (2025-07-24T09:17:13Z)
AutoTestForge: A Multidimensional Automated Testing Framework for Natural Language Processing Models [11.958545255487735]
We introduce AutoTestForge, an automated and multidimensional testing framework for NLP models. Within AutoTestForge, through the utilization of Large Language Models (LLMs) to automatically generate test templates and instantiate them, manual involvement is significantly reduced. The framework also extends the test suite across three dimensions: taxonomy, fairness, and robustness, offering a comprehensive evaluation of the capabilities of NLP models.
arXiv Detail & Related papers (2025-03-07T02:44:17Z)
Context-Aware Testing: A New Paradigm for Model Testing with Large Language Models [49.06068319380296]
We introduce context-aware testing (CAT) which uses context as an inductive bias to guide the search for meaningful model failures. We instantiate the first CAT system, SMART Testing, which employs large language models to hypothesize relevant and likely failures.
arXiv Detail & Related papers (2024-10-31T15:06:16Z)
Automatic Generation of Behavioral Test Cases For Natural Language Processing Using Clustering and Prompting [6.938766764201549]
This paper introduces an automated approach to develop test cases by exploiting the power of large language models and statistical techniques. We analyze the behavioral test profiles across four different classification algorithms and discuss the limitations and strengths of those models.
arXiv Detail & Related papers (2024-07-31T21:12:21Z)
Automated Text Scoring in the Age of Generative AI for the GPU-poor [49.1574468325115]
We analyze the performance and efficiency of open-source, small-scale generative language models for automated text scoring. Results show that GLMs can be fine-tuned to achieve adequate, though not state-of-the-art, performance.
arXiv Detail & Related papers (2024-07-02T01:17:01Z)
AutoSurvey: Large Language Models Can Automatically Write Surveys [77.0458309675818]
This paper introduces AutoSurvey, a speedy and well-organized methodology for automating the creation of comprehensive literature surveys. Traditional survey paper creation faces challenges due to the vast volume and complexity of information. Our contributions include a comprehensive solution to the survey problem, a reliable evaluation method, and experimental validation demonstrating AutoSurvey's effectiveness.
arXiv Detail & Related papers (2024-06-10T12:56:06Z)
AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving [68.73885845181242]
We propose an Automatic Data Engine (AIDE) that automatically identifies issues, efficiently curates data, improves the model through auto-labeling, and verifies the model through generation of diverse scenarios. We further establish a benchmark for open-world detection on AV datasets to comprehensively evaluate various learning paradigms, demonstrating our method's superior performance at a reduced cost.
arXiv Detail & Related papers (2024-03-26T04:27:56Z)
Towards General Error Diagnosis via Behavioral Testing in Machine Translation [48.108393938462974]
This paper proposes a new framework for conducting behavioral testing of machine translation (MT) systems. The core idea of BTPGBT is to employ a novel bilingual translation pair generation approach. Experimental results on various MT systems demonstrate that BTPGBT could provide comprehensive and accurate behavioral testing results.
arXiv Detail & Related papers (2023-10-20T09:06:41Z)
SilGAN: Generating driving maneuvers for scenario-based software-in-the-loop testing [0.0]
SilGAN is a deep generative model that eases specification, stimulus generation, and automation of automotive software-in-the-loop testing. The model is trained using data recorded from vehicles in the field.
arXiv Detail & Related papers (2021-07-05T07:17:49Z)
AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts [46.03503882865222]
AutoPrompt is an automated method to create prompts for a diverse set of tasks based on a gradient-guided search. We show that masked language models (MLMs) have an inherent capability to perform sentiment analysis and natural language inference without additional parameters or finetuning.
arXiv Detail & Related papers (2020-10-29T22:54:00Z)
A Novel Anomaly Detection Algorithm for Hybrid Production Systems based on Deep Learning and Timed Automata [73.38551379469533]
DAD:DeepAnomalyDetection is a new approach for automatic model learning and anomaly detection in hybrid production systems. It combines deep learning and timed automata to create a behavioral model from observations. The algorithm has been applied to a few data sets, including two from real systems, and has shown promising results.
arXiv Detail & Related papers (2020-10-29T08:27:43Z)

This list is automatically generated from the titles and abstracts of the papers on this site.