TARGET: Automated Scenario Generation from Traffic Rules for Testing Autonomous Vehicles via Validated LLM-Guided Knowledge Extraction
- URL: http://arxiv.org/abs/2305.06018v4
- Date: Thu, 15 May 2025 22:10:56 GMT
- Title: TARGET: Automated Scenario Generation from Traffic Rules for Testing Autonomous Vehicles via Validated LLM-Guided Knowledge Extraction
- Authors: Yao Deng, Jiaohong Yao, Zhi Tu, Xi Zheng, Mengshi Zhang, Tianyi Zhang,
- Abstract summary: TARGET is an end-to-end framework that automatically generates test scenarios from traffic rules. We leverage a Large Language Model (LLM) to extract knowledge from traffic rules. TARGET synthesizes executable scripts to render scenarios in simulation.
- Score: 8.029974249105443
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent incidents with autonomous vehicles highlight the need for rigorous testing to ensure safety and robustness. Constructing test scenarios for autonomous driving systems (ADSs), however, is labor-intensive. We propose TARGET, an end-to-end framework that automatically generates test scenarios from traffic rules. To address complexity, we leverage a Large Language Model (LLM) to extract knowledge from traffic rules. To mitigate hallucinations caused by large context during input processing, we introduce a domain-specific language (DSL) designed to be syntactically simple and compositional. This design allows the LLM to learn and generate test scenarios in a modular manner while enabling syntactic and semantic validation for each component. Based on these validated representations, TARGET synthesizes executable scripts to render scenarios in simulation. In an evaluation of seven ADSs with 284 scenarios derived from 54 traffic rules, TARGET uncovered 610 rule violations, collisions, and other issues. For each violation, TARGET generates scenario recordings and detailed logs, aiding root cause analysis. Two identified issues were confirmed by ADS developers: one linked to an existing bug report and the other to limited ADS functionality.
Related papers
- Multi-modal Traffic Scenario Generation for Autonomous Driving System Testing [10.518062593457351]
TrafficComposer is a multi-modal traffic scenario construction approach for autonomous driving systems (ADS) testing. It generates the corresponding traffic scenario in a simulator, such as CARLA or LGSVL. On a benchmark of 120 traffic scenarios, TrafficComposer achieves 97.0% accuracy, outperforming the best-performing baseline by 7.3%.
arXiv Detail & Related papers (2025-05-20T20:12:08Z) - On Simulation-Guided LLM-based Code Generation for Safe Autonomous Driving Software [0.577182115743694]
An Automated Driving System (ADS) is a safety-critical software system responsible for the interpretation of the vehicle's environment. Development of ADS requires rigorous processes to verify, validate, assess, and qualify the code before it can be deployed in the vehicle. This study developed and evaluated a prototype for automatic code generation and assessment.
arXiv Detail & Related papers (2025-04-02T21:35:11Z) - Text2Scenario: Text-Driven Scenario Generation for Autonomous Driving Test [15.601818101020996]
Text2Scenario is a framework that autonomously generates simulation test scenarios that closely align with user specifications.
The result is an efficient and precise evaluation of diverse AD stacks, free of the labor-intensive need for manual scenario configuration.
arXiv Detail & Related papers (2025-03-04T07:20:25Z) - SafeAuto: Knowledge-Enhanced Safe Autonomous Driving with Multimodal Foundation Models [63.71984266104757]
We propose SafeAuto, a framework that enhances MLLM-based autonomous driving by incorporating both unstructured and structured knowledge. To explicitly integrate safety knowledge, we develop a reasoning component that translates traffic rules into first-order logic. Our Multimodal Retrieval-Augmented Generation model leverages video, control signals, and environmental attributes to learn from past driving experiences.
arXiv Detail & Related papers (2025-02-28T21:53:47Z) - From Accidents to Insights: Leveraging Multimodal Data for Scenario-Driven ADS Testing [3.984220091774453]
This paper introduces TRACE, a scenario-based ADS Test case Generation framework for Critical Scenarios.
By leveraging multimodal data to extract challenging scenarios from real-world car crash reports, TRACE constructs numerous critical test cases with less data.
User feedback reveals that TRACE demonstrates superior scenario reconstruction accuracy, with 77.5% of the scenarios being rated as 'mostly' or 'totally' consistent.
arXiv Detail & Related papers (2025-02-04T05:21:29Z) - Black-Box Adversarial Attack on Vision Language Models for Autonomous Driving [65.61999354218628]
We take the first step toward designing black-box adversarial attacks specifically targeting vision-language models (VLMs) in autonomous driving systems.
We propose Cascading Adversarial Disruption (CAD), which targets low-level reasoning breakdown by generating and injecting semantics.
We present Risky Scene Induction, which addresses dynamic adaptation by leveraging a surrogate VLM to understand and construct high-level risky scenarios.
arXiv Detail & Related papers (2025-01-23T11:10:02Z) - LMM-enhanced Safety-Critical Scenario Generation for Autonomous Driving System Testing From Non-Accident Traffic Videos [22.638869562921133]
It is paramount to generate a diverse range of safety-critical test scenarios for autonomous driving systems.
Some accident-free real-world scenarios can not only lead to misbehaviors in ADSs but also be leveraged for the generation of ADS violations.
It is of significant importance to discover safety violations of ADSs from routine traffic scenarios.
arXiv Detail & Related papers (2024-06-16T09:05:56Z) - Get my drift? Catching LLM Task Drift with Activation Deltas [55.75645403965326]
Task drift allows attackers to exfiltrate data or influence the LLM's output for other users. We show that a simple linear classifier can detect drift with near-perfect ROC AUC on an out-of-distribution test set. We observe that this approach generalizes surprisingly well to unseen task domains, such as prompt injections, jailbreaks, and malicious instructions.
arXiv Detail & Related papers (2024-06-02T16:53:21Z) - REDriver: Runtime Enforcement for Autonomous Vehicles [6.97499033700151]
We propose REDriver, a general and modular approach to runtime enforcement of autonomous driving systems.
REDriver monitors the planned trajectory of the ADS based on a quantitative semantics of STL.
It uses a gradient-driven algorithm to repair the trajectory when a violation of the specification is likely.
arXiv Detail & Related papers (2024-01-04T13:08:38Z) - SAFE-SIM: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries [94.84458417662407]
We introduce SAFE-SIM, a controllable closed-loop safety-critical simulation framework.
Our approach yields two distinct advantages: 1) generating realistic long-tail safety-critical scenarios that closely reflect real-world conditions, and 2) providing controllable adversarial behavior for more comprehensive and interactive evaluations.
We validate our framework empirically using the nuScenes and nuPlan datasets across multiple planners, demonstrating improvements in both realism and controllability.
arXiv Detail & Related papers (2023-12-31T04:14:43Z) - DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving [69.82743399946371]
DriveMLM is a framework that can perform closed-loop autonomous driving in realistic simulators.
We employ a multi-modal LLM (MLLM) to model the behavior planning module of a modular AD system.
This model can be plugged into existing AD systems such as Apollo for closed-loop driving.
arXiv Detail & Related papers (2023-12-14T18:59:05Z) - Attacking Motion Planners Using Adversarial Perception Errors [5.423900036420565]
We show that it is possible to construct planner inputs that score very highly on various perception quality metrics but still lead to planning failures.
We demonstrate the effectiveness of this algorithm by finding attacks for two different black-box planners in several urban and highway driving scenarios.
arXiv Detail & Related papers (2023-11-21T16:51:33Z) - DARTH: Holistic Test-time Adaptation for Multiple Object Tracking [87.72019733473562]
Multiple object tracking (MOT) is a fundamental component of perception systems for autonomous driving.
Despite the urgency of safety in driving systems, no solution to the MOT adaptation problem to domain shift in test-time conditions has ever been proposed.
We introduce DARTH, a holistic test-time adaptation framework for MOT.
arXiv Detail & Related papers (2023-10-03T10:10:42Z) - DriveGPT4: Interpretable End-to-end Autonomous Driving via Large Language Model [84.29836263441136]
This study introduces DriveGPT4, a novel interpretable end-to-end autonomous driving system based on multimodal large language models (MLLMs).
DriveGPT4 facilitates the interpretation of vehicle actions, offers pertinent reasoning, and effectively addresses a diverse range of questions posed by users.
arXiv Detail & Related papers (2023-10-02T17:59:52Z) - Clustering-based Criticality Analysis for Testing of Automated Driving Systems [0.18416014644193066]
This paper focuses on the goal of reducing the scenario set by clustering concrete scenarios from a single logical scenario.
By employing clustering techniques, redundant and uninteresting scenarios can be identified and eliminated, resulting in a representative scenario set.
arXiv Detail & Related papers (2023-06-22T08:36:20Z) - DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving [76.29141888408265]
We propose a large-scale dataset containing diverse accident scenarios that frequently occur in real-world driving.
The proposed DeepAccident dataset includes 57K annotated frames and 285K annotated samples, approximately 7 times more than the large-scale nuScenes dataset.
arXiv Detail & Related papers (2023-04-03T17:37:00Z) - An Application of Scenario Exploration to Find New Scenarios for the Development and Testing of Automated Driving Systems in Urban Scenarios [2.480533141352916]
This work aims to find relevant, interesting, or critical parameter sets within logical scenarios by utilizing Bayes optimization and Gaussian processes.
A list of ideas that this work leads to, and that should be investigated further, is presented.
arXiv Detail & Related papers (2022-05-17T09:47:32Z) - ADC: Adversarial attacks against object Detection that evade Context consistency checks [55.8459119462263]
We show that even context consistency checks can be brittle to properly crafted adversarial examples.
We propose an adaptive framework to generate examples that subvert such defenses.
Our results suggest that how to robustly model context and check its consistency is still an open problem.
arXiv Detail & Related papers (2021-10-24T00:25:09Z) - Neural Network Guided Evolutionary Fuzzing for Finding Traffic Violations of Autonomous Vehicles [15.702721819948623]
Existing testing methods are inadequate for checking the end-to-end behaviors of autonomous vehicles.
We propose a new fuzz testing technique, called AutoFuzz, which can leverage widely-used AV simulators' API grammars.
AutoFuzz efficiently finds hundreds of realistic traffic violations resembling real-world crashes.
arXiv Detail & Related papers (2021-09-13T17:05:43Z) - Generating and Characterizing Scenarios for Safety Testing of Autonomous Vehicles [86.9067793493874]
We propose efficient mechanisms to characterize and generate testing scenarios using a state-of-the-art driving simulator.
We use our method to characterize real driving data from the Next Generation Simulation (NGSIM) project.
We rank the scenarios by defining metrics based on the complexity of avoiding accidents and provide insights into how the AV could have minimized the probability of incurring an accident.
arXiv Detail & Related papers (2021-03-12T17:00:23Z) - Detecting 32 Pedestrian Attributes for Autonomous Vehicles [103.87351701138554]
In this paper, we address the problem of jointly detecting pedestrians and recognizing 32 pedestrian attributes.
We introduce a Multi-Task Learning (MTL) model relying on a composite field framework, which achieves both goals in an efficient way.
We show competitive detection and attribute recognition results, as well as a more stable MTL training.
arXiv Detail & Related papers (2020-12-04T15:10:12Z) - Pass-Fail Criteria for Scenario-Based Testing of Automated Driving Systems [0.0]
This paper sets out a framework for assessing an automated driving system's behavioural safety in normal operation.
Risk-based rules cannot give a pass/fail decision from a single test case.
This considers statistical performance across many individual tests.
arXiv Detail & Related papers (2020-05-19T13:13:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.