Fine-grained Testing for Autonomous Driving Software: a Study on Autoware with LLM-driven Unit Testing
- URL: http://arxiv.org/abs/2501.09866v1
- Date: Thu, 16 Jan 2025 22:36:00 GMT
- Title: Fine-grained Testing for Autonomous Driving Software: a Study on Autoware with LLM-driven Unit Testing
- Authors: Wenhan Wang, Xuan Xie, Yuheng Huang, Renzhi Wang, An Ran Chen, Lei Ma
- Abstract summary: We present the first study on testing, specifically unit testing, for autonomous driving systems (ADS) source code. We analyze both human-written test cases and those generated by large language models (LLMs). We propose AwTest-LLM, a novel approach to enhance test coverage and improve test case pass rates across Autoware packages.
- Score: 12.067489008051208
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Testing autonomous driving systems (ADS) is critical to ensuring their reliability and safety. Existing ADS testing work focuses on designing scenarios to evaluate system-level behaviors, while fine-grained testing of ADS source code has received comparatively little attention. To address this gap, we present the first study on testing, specifically unit testing, for ADS source code. Our study focuses on an industrial ADS framework, Autoware. We analyze both human-written test cases and those generated by large language models (LLMs). Our findings reveal that human-written test cases in Autoware exhibit limited test coverage, and significant challenges remain in applying LLM-generated tests for Autoware unit testing. To overcome these challenges, we propose AwTest-LLM, a novel approach to enhance test coverage and improve test case pass rates across Autoware packages.
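Neither the abstract nor this summary gives implementation details for AwTest-LLM, so the sketch below only illustrates the general LLM-driven unit-test-generation loop such a study works with: prompt the model with the C++ unit under test, request a GoogleTest case (Autoware is C++ and its packages commonly use GoogleTest), and filter unbuildable output with a compile check, since low build/pass rates are among the challenges the study reports. `query_llm`, the prompt wording, and the file names are assumptions of this sketch, not the paper's method.

```python
import subprocess


def query_llm(prompt: str) -> str:
    """Placeholder for any LLM completion API; wire up a real client here."""
    raise NotImplementedError


def generate_unit_test(cpp_source: str, function_name: str) -> str:
    """Ask the model for a GoogleTest case targeting one C++ function."""
    prompt = (
        "Here is a C++ source file:\n\n"
        f"{cpp_source}\n\n"
        f"Write a self-contained GoogleTest unit test for `{function_name}`. "
        "Include all needed #include directives and cover at least one "
        "boundary condition. Reply with only the C++ test code."
    )
    return query_llm(prompt)


def compiles(test_code: str, include_dir: str) -> bool:
    """Crude build check: many raw LLM-generated tests fail to compile,
    which is one of the challenges the study reports for Autoware."""
    with open("generated_test.cpp", "w") as f:
        f.write(test_code)
    result = subprocess.run(
        ["g++", "-std=c++17", f"-I{include_dir}", "-c", "generated_test.cpp"],
        capture_output=True,
    )
    return result.returncode == 0
```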
Related papers
- Requirements-Driven Automated Software Testing: A Systematic Review [13.67495800498868]
This study synthesizes the current state of REDAST research, highlights trends, and proposes future directions.
This systematic literature review (SLR) explores the landscape of REDAST by analyzing requirements input, transformation techniques, test outcomes, evaluation methods, and existing limitations.
arXiv Detail & Related papers (2025-02-25T23:13:09Z) - Adaptive Testing for LLM-Based Applications: A Diversity-based Approach [15.33985438101206]
We show that diversity-based testing techniques, such as Adaptive Random Testing (ART), can be effectively applied to the testing of prompt templates.
Our results, obtained using various implementations that explore several string-based distances, confirm that our approach enables the discovery of failures with reduced testing budgets.
arXiv Detail & Related papers (2025-01-23T08:53:12Z)
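The paper names Adaptive Random Testing with string-based distances but this summary does not specify the variant; below is a minimal sketch of the classic fixed-size-candidate-set form of ART over strings, assuming Levenshtein as one of the explored distances. `generate_input` stands for any random instantiation of a prompt template and is an assumption of the sketch.

```python
import random


def levenshtein(a: str, b: str) -> int:
    """Standard dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def art_select(executed: list[str], generate_input, k: int = 10) -> str:
    """Fixed-size-candidate-set ART: draw k random candidates and keep the
    one farthest (by minimum distance) from every already-executed input,
    spreading tests across the input space."""
    candidates = [generate_input() for _ in range(k)]
    if not executed:
        return random.choice(candidates)
    return max(candidates,
               key=lambda c: min(levenshtein(c, e) for e in executed))
```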
- DriveTester: A Unified Platform for Simulation-Based Autonomous Driving Testing [24.222344794923558]
DriveTester is a unified simulation-based testing platform built on Apollo. It provides a consistent and reliable environment, integrates a lightweight traffic simulator, and incorporates various state-of-the-art ADS testing techniques.
arXiv Detail & Related papers (2024-12-17T08:24:05Z) - Automated Soap Opera Testing Directed by LLMs and Scenario Knowledge: Feasibility, Challenges, and Road Ahead [43.15092098658384]
Exploratory testing (ET) harnesses testers' knowledge, creativity, and experience to create varied tests that uncover unexpected bugs from the end-user's perspective.
We explore the feasibility, challenges, and road ahead of automated scenario-based ET (a.k.a. soap opera testing).
arXiv Detail & Related papers (2024-12-11T17:57:23Z) - AutoPT: How Far Are We from the End2End Automated Web Penetration Testing? [54.65079443902714]
We introduce AutoPT, an automated penetration testing agent based on the PSM principle and driven by LLMs.
Our results show that AutoPT outperforms the baseline framework ReAct on the GPT-4o mini model.
arXiv Detail & Related papers (2024-11-02T13:24:30Z)
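The summary does not define the PSM's states or transitions, so everything below is illustrative: a generic loop in which the LLM may only choose among transitions a hand-written state machine allows, a tighter action space than the free-form ReAct baseline the paper compares against. The phase names, transition table, and `query_llm` placeholder are all invented.

```python
def query_llm(prompt: str) -> str:
    raise NotImplementedError  # placeholder for any chat-completion API


# Invented phases and transitions; AutoPT's actual PSM is defined in the paper.
TRANSITIONS = {
    "recon": ["scan"],
    "scan": ["exploit", "recon"],
    "exploit": ["escalate", "scan"],
    "escalate": ["report"],
    "report": [],
}


def run_agent(target: str, max_steps: int = 20) -> list[str]:
    """Let the LLM pick the next phase, but only among transitions the
    state machine permits; invalid choices fall back to a legal move."""
    state, log = "recon", []
    for _ in range(max_steps):
        options = TRANSITIONS[state]
        if not options:
            break
        answer = query_llm(
            f"Target: {target}. Current phase: {state}. "
            f"Choose the next phase from {options}. Reply with one word."
        ).strip()
        nxt = answer if answer in options else options[0]  # reject bad moves
        log.append(f"{state} -> {nxt}")
        state = nxt
    return log
```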
- Context-Aware Testing: A New Paradigm for Model Testing with Large Language Models [49.06068319380296]
We introduce context-aware testing (CAT), which uses context as an inductive bias to guide the search for meaningful model failures.
We instantiate the first CAT system, SMART Testing, which employs large language models to hypothesize relevant and likely failures.
arXiv Detail & Related papers (2024-10-31T15:06:16Z)
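How SMART Testing encodes context and hypotheses is not described in this summary; the sketch below is one hypothetical reading: an LLM turns free-text context into candidate failure slices, and each slice's accuracy is compared against the global figure. The JSON schema, pandas-style `df.query` slices, and scikit-learn-style `model.predict` interface are all assumptions.

```python
import json


def query_llm(prompt: str) -> str:
    raise NotImplementedError  # placeholder for any LLM API


def hypothesize_failures(context: str, n: int = 5) -> list[dict]:
    """Turn free-text task/data context (the inductive bias) into candidate
    failure slices. The response schema is an assumption of this sketch."""
    prompt = (
        f"Task and data context:\n{context}\n\n"
        f"List {n} plausible failure modes for a model trained on this data. "
        'Answer as a JSON list of {"slice": <pandas query string>, '
        '"reason": <one sentence>} objects.'
    )
    return json.loads(query_llm(prompt))


def evaluate_slices(df, model, hypotheses) -> None:
    """Flag hypothesized slices (DataFrame rows matching each query)
    whose accuracy falls well below the global figure."""
    overall = (model.predict(df.drop(columns=["y"])) == df["y"]).mean()
    for h in hypotheses:
        part = df.query(h["slice"])
        if part.empty:
            continue
        acc = (model.predict(part.drop(columns=["y"])) == part["y"]).mean()
        if acc < overall - 0.10:
            print(f'{h["slice"]}: {acc:.2f} vs {overall:.2f} ({h["reason"]})')
```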
- A System for Automated Unit Test Generation Using Large Language Models and Assessment of Generated Test Suites [1.4563527353943984]
Large Language Models (LLMs) have been applied to various aspects of software development.
We present AgoneTest: an automated system for generating test suites for Java projects.
arXiv Detail & Related papers (2024-08-14T23:02:16Z) - TestART: Improving LLM-based Unit Testing via Co-evolution of Automated Generation and Repair Iteration [7.833381226332574]
Large language models (LLMs) have demonstrated remarkable capabilities in generating unit test cases. We propose TestART, a novel unit test generation method. TestART improves LLM-based unit testing via co-evolution of automated generation and repair iteration.
arXiv Detail & Related papers (2024-08-06T10:52:41Z)
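A minimal sketch of the co-evolution idea named in the TestART summary: diagnostics from each failed run are fed back so the next attempt patches concrete errors instead of regenerating blindly. The test framework (pytest, chosen purely to keep the loop short), prompts, and helper names are assumptions, not TestART's actual pipeline.

```python
import subprocess


def query_llm(prompt: str) -> str:
    raise NotImplementedError  # placeholder for any LLM API


def run_tests(test_file: str) -> tuple[bool, str]:
    """Execute one test file and capture its diagnostics."""
    r = subprocess.run(
        ["pytest", test_file, "-x", "-q"], capture_output=True, text=True
    )
    return r.returncode == 0, r.stdout + r.stderr


def generate_and_repair(source: str, test_file: str, rounds: int = 4) -> bool:
    """Alternate generation and repair until the suite passes or the
    iteration budget runs out."""
    prompt = f"Write a pytest test suite for this code:\n{source}"
    for _ in range(rounds):
        test_code = query_llm(prompt)
        with open(test_file, "w") as f:
            f.write(test_code)
        ok, output = run_tests(test_file)
        if ok:
            return True
        prompt = (
            f"This test file failed:\n{test_code}\n"
            f"Error output:\n{output}\nReturn a corrected version."
        )
    return False
```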
- Test Oracle Automation in the era of LLMs [52.69509240442899]
Large Language Models (LLMs) have demonstrated remarkable proficiency in tackling diverse software testing tasks.
This paper aims to enable discussions on the potential of using LLMs for test oracle automation, along with the challenges that may emerge during the generation of various types of oracles.
arXiv Detail & Related papers (2024-05-21T13:19:10Z) - Evaluating the Impact of Flaky Simulators on Testing Autonomous Driving Systems [2.291478393584594]
We investigate test flakiness in simulation-based testing of Autonomous Driving Systems (ADS).
We show that test flakiness in ADS is a common occurrence and can significantly impact the test results obtained by randomized algorithms.
Our machine learning (ML) classifiers effectively identify flaky ADS tests using only a single test run.
arXiv Detail & Related papers (2023-11-30T18:08:02Z)
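The paper's feature set and model are not given in this summary; the scikit-learn stand-in below only shows the shape of "classify flakiness from a single run": summarize one simulation run as a feature vector and train a classifier on labelled flaky/stable tests. The four features named in the comment and the random data are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Each row summarizes ONE simulation run of one test; the four features
# (e.g. min time-to-collision, mean speed, lateral deviation, wall time)
# and the random values below are placeholders for real run data.
rng = np.random.default_rng(0)
X = rng.random((200, 4))
y = rng.integers(0, 2, 200)  # 1 = test labelled flaky across repeated runs

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())  # ~0.5 on noise, by design
```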
- Towards Automatic Generation of Amplified Regression Test Oracles [44.45138073080198]
We propose a test oracle derivation approach to amplify regression test oracles.
The approach monitors the object state during test execution and compares it to the previous version to detect any changes relative to the intended behaviour of the system under test (SUT).
arXiv Detail & Related papers (2023-07-28T12:38:44Z)
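A minimal sketch of the monitor-and-compare idea, under strong simplifying assumptions: object state is approximated by `vars()` serialized to JSON, and the same scripted scenario is replayed against factories for the old and new versions. Every divergence either exposes a regression or can be promoted into a new, stronger assertion. All names are invented for illustration.

```python
import json


def snapshot(obj) -> str:
    """Approximate the observable state via vars(); the paper monitors
    object state during execution more thoroughly than this."""
    return json.dumps(vars(obj), sort_keys=True, default=repr)


def amplified_oracle(make_old, make_new, scenario) -> list[str]:
    """Replay one scripted scenario (a list of callables) against the
    previous and the current version and report state divergences."""
    old, new = make_old(), make_new()
    diffs = []
    for step, action in enumerate(scenario):
        action(old)
        action(new)
        if snapshot(old) != snapshot(new):
            diffs.append(f"state diverged after step {step} ({action.__name__})")
    return diffs
```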
- SUPERNOVA: Automating Test Selection and Defect Prevention in AAA Video Games Using Risk Based Testing and Machine Learning [62.997667081978825]
Testing video games is an increasingly difficult task as traditional methods fail to scale with growing software systems.
We present SUPERNOVA, a system responsible for test selection and defect prevention while also functioning as an automation hub.
The direct impact has been a reduction of 55% or more in testing hours for an undisclosed sports game title.
arXiv Detail & Related papers (2022-03-10T00:47:46Z)
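SUPERNOVA's risk model is not described here beyond "risk based testing and machine learning", so the toy ranker below shows only the selection shape: score each test by its overlap with the change set, weighted by historical failure rate, and keep the top of the budget. The `TestRecord` fields and the scoring formula are assumptions.

```python
from dataclasses import dataclass


@dataclass
class TestRecord:
    name: str
    failure_rate: float       # share of recent runs that failed
    covered_files: set[str]   # files this test exercises


def select_tests(
    tests: list[TestRecord], changed_files: set[str], budget: int
) -> list[str]:
    """Rank tests by change-set overlap weighted by historical failure
    rate, then keep the top `budget` names."""

    def risk(t: TestRecord) -> float:
        overlap = len(t.covered_files & changed_files) / max(len(changed_files), 1)
        return overlap * (0.5 + t.failure_rate)  # keep churn hits above zero

    return [t.name for t in sorted(tests, key=risk, reverse=True)[:budget]]
```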