Related papers: An LLM-driven Scenario Generation Pipeline Using an Extended Scenic DSL for Autonomous Driving Safety Validation

An LLM-driven Scenario Generation Pipeline Using an Extended Scenic DSL for Autonomous Driving Safety Validation

URL: http://arxiv.org/abs/2602.20644v1
Date: Tue, 24 Feb 2026 07:44:26 GMT
Title: An LLM-driven Scenario Generation Pipeline Using an Extended Scenic DSL for Autonomous Driving Safety Validation
Authors: Fida Khandaker Safa, Yupeng Jiang, Xi Zheng,
Abstract summary: Real-world crash reports are valuable for scenario-based testing of autonomous driving systems.<n>Current methods cannot effectively translate this multimodal data into precise, executable simulation scenarios.<n>We propose a scalable and verifiable pipeline that uses a large language model and a probabilistic intermediate representation.
Score: 4.602386383455713
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Real-world crash reports, which combine textual summaries and sketches, are valuable for scenario-based testing of autonomous driving systems (ADS). However, current methods cannot effectively translate this multimodal data into precise, executable simulation scenarios, hindering the scalability of ADS safety validation. In this work, we propose a scalable and verifiable pipeline that uses a large language model (GPT-4o mini) and a probabilistic intermediate representation (an Extended Scenic domain-specific language) to automatically extract semantic scenario configurations from crash reports and generate corresponding simulation-ready scenarios. Unlike earlier approaches such as ScenicNL and LCTGen (which generate scenarios directly from text) or TARGET (which uses deterministic mappings from traffic rules), our method introduces an intermediate Scenic DSL layer to separate high-level semantic understanding from low-level scenario rendering, reducing errors and capturing real-world variability. We evaluated the pipeline on cases from the NHTSA CIREN database. The results show high accuracy in knowledge extraction: 100% correctness for environmental and road network attributes, and 97% and 98% for oracle and actor trajectories, respectively, compared to human-derived ground truth. We executed the generated scenarios in the CARLA simulator using the Autoware driving stack, and they consistently triggered the intended traffic-rule violations (such as opposite-lane crossing and red-light running) across 2,000 scenario variations. These findings demonstrate that the proposed pipeline provides a legally grounded, scalable, and verifiable approach to ADS safety validation.

Related papers

Model-Based Policy Adaptation for Closed-Loop End-to-End Autonomous Driving [54.46325690390831]
We propose Model-based Policy Adaptation (MPA), a general framework that enhances the robustness and safety of pretrained E2E driving agents during deployment.<n>MPA first generates diverse counterfactual trajectories using a geometry-consistent simulation engine.<n>MPA trains a diffusion-based policy adapter to refine the base policy's predictions and a multi-step Q value model to evaluate long-term outcomes.
arXiv Detail & Related papers (2025-11-26T17:01:41Z)
LANGTRAJ: Diffusion Model and Dataset for Language-Conditioned Trajectory Simulation [102.1527101235251]
LangTraj is a language-conditioned scene-diffusion model that simulates the joint behavior of all agents in traffic scenarios.<n>By conditioning on natural language inputs, LangTraj provides flexible and intuitive control over interactive behaviors.<n>LangTraj demonstrates strong performance in realism, language controllability, and language-conditioned safety-critical simulation.
arXiv Detail & Related papers (2025-04-15T17:14:06Z)
From Words to Collisions: LLM-Guided Evaluation and Adversarial Generation of Safety-Critical Driving Scenarios [6.681744368557208]
Large Language Models (LLMs) and structured scenario parsing and prompt engineering are used to generate safety-critical driving scenarios.<n>We validate our approach using a 2D simulation framework and multiple pre-trained LLMs.<n>We conclude that an LLM equipped with domain-informed prompting techniques can effectively evaluate and generate safety-critical driving scenarios.
arXiv Detail & Related papers (2025-02-04T09:19:13Z)
Generating Out-Of-Distribution Scenarios Using Language Models [58.47597351184034]
Large Language Models (LLMs) have shown promise in autonomous driving. This paper introduces a framework for generating diverse Out-Of-Distribution (OOD) driving scenarios. We evaluate our framework through extensive simulations and introduce a new "OOD-ness" metric.
arXiv Detail & Related papers (2024-11-25T16:38:17Z)
ReGentS: Real-World Safety-Critical Driving Scenario Generation Made Stable [88.08120417169971]
Machine learning based autonomous driving systems often face challenges with safety-critical scenarios that are rare in real-world data. This work explores generating safety-critical driving scenarios by modifying complex real-world regular scenarios through trajectory optimization. Our approach addresses unrealistic diverging trajectories and unavoidable collision scenarios that are not useful for training robust planner.
arXiv Detail & Related papers (2024-09-12T08:26:33Z)
Querying Labeled Time Series Data with Scenario Programs [0.0]
We propose a formal definition of what constitutes a match between a real-world labeled time series data item and a simulated scenario. We present a definition and algorithm for matching scalable beyond the autonomous vehicles domain.
arXiv Detail & Related papers (2024-06-25T15:15:27Z)
ChatScene: Knowledge-Enabled Safety-Critical Scenario Generation for Autonomous Vehicles [17.396416459648755]
ChatScene is a Large Language Model (LLM)-based agent that generates safety-critical scenarios for autonomous vehicles. A key part of our agent is a comprehensive knowledge retrieval component, which efficiently translates specific textual descriptions into corresponding domain-specific code snippets.
arXiv Detail & Related papers (2024-05-22T23:21:15Z)
Causality-based Transfer of Driving Scenarios to Unseen Intersections [0.0]
In scenario-based testing automated functions are evaluated in a set of pre-defined scenarios. To create realistic scenarios, parameters and parameter dependencies have to be fitted utilizing real-world data. This paper proposes a methodology to systematically analyze relations between parameters of scenarios.
arXiv Detail & Related papers (2024-04-02T15:38:18Z)
Generating Useful Accident-Prone Driving Scenarios via a Learned Traffic Prior [135.78858513845233]
STRIVE is a method to automatically generate challenging scenarios that cause a given planner to produce undesirable behavior, like collisions. To maintain scenario plausibility, the key idea is to leverage a learned model of traffic motion in the form of a graph-based conditional VAE. A subsequent optimization is used to find a "solution" to the scenario, ensuring it is useful to improve the given planner.
arXiv Detail & Related papers (2021-12-09T18:03:27Z)
Generating and Characterizing Scenarios for Safety Testing of Autonomous Vehicles [86.9067793493874]
We propose efficient mechanisms to characterize and generate testing scenarios using a state-of-the-art driving simulator. We use our method to characterize real driving data from the Next Generation Simulation (NGSIM) project. We rank the scenarios by defining metrics based on the complexity of avoiding accidents and provide insights into how the AV could have minimized the probability of incurring an accident.
arXiv Detail & Related papers (2021-03-12T17:00:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.