Related papers: Multi-modal Traffic Scenario Generation for Autonomous Driving System Testing

URL: http://arxiv.org/abs/2505.14881v2
Date: Sat, 14 Jun 2025 22:53:15 GMT
Title: Multi-modal Traffic Scenario Generation for Autonomous Driving System Testing
Authors: Zhi Tu, Liangkun Niu, Wei Fan, Tianyi Zhang
Abstract summary: TrafficComposer is a multi-modal traffic scenario construction approach for autonomous driving systems (ADS) testing. It generates the corresponding traffic scenario in a simulator, such as CARLA and LGSVL. On a benchmark of 120 traffic scenarios, TrafficComposer achieves 97.0% accuracy, outperforming the best-performing baseline by 7.3%.
Score: 10.518062593457351
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Autonomous driving systems (ADS) require extensive testing and validation before deployment. However, it is tedious and time-consuming to construct traffic scenarios for ADS testing. In this paper, we propose TrafficComposer, a multi-modal traffic scenario construction approach for ADS testing. TrafficComposer takes as input a natural language (NL) description of a desired traffic scenario and a complementary traffic scene image. Then, it generates the corresponding traffic scenario in a simulator, such as CARLA and LGSVL. Specifically, TrafficComposer integrates high-level dynamic information about the traffic scenario from the NL description and intricate details about the surrounding vehicles, pedestrians, and the road network from the image. The information from the two modalities is complementary and helps generate high-quality traffic scenarios for ADS testing. On a benchmark of 120 traffic scenarios, TrafficComposer achieves 97.0% accuracy, outperforming the best-performing baseline by 7.3%. Both direct testing and fuzz testing experiments on six ADSs demonstrate the bug detection capabilities of the traffic scenarios generated by TrafficComposer. These scenarios can directly discover 37 bugs and, when used as initial seeds, help two fuzzing methods find 33% to 124% more bugs.

Related papers

TrafficLLM: Enhancing Large Language Models for Network Traffic Analysis with Generic Traffic Representation [14.470174593447702]
Large language models (LLMs) have shown promising performance in various domains. TrafficLLM introduces a dual-stage fine-tuning framework to learn generic traffic representation from raw traffic data. It achieves F1-scores of 0.9875 and 0.9483, with up to 80.12% and 33.92% better performance than existing detection and generation methods.
arXiv Detail & Related papers (2025-04-05T16:18:33Z)
Towards Intelligent Transportation with Pedestrians and Vehicles In-the-Loop: A Surveillance Video-Assisted Federated Digital Twin Framework [62.47416496137193]
We propose a surveillance video assisted federated digital twin (SV-FDT) framework to empower ITSs with pedestrians and vehicles in-the-loop. The architecture consists of three layers: (i) the end layer, which collects traffic surveillance videos from multiple sources; (ii) the edge layer, responsible for semantic segmentation-based visual understanding, twin agent-based interaction modeling, and local digital twin system (LDTS) creation in local regions; and (iii) the cloud layer, which integrates LDTSs across different regions to construct a global DT model in real time.
arXiv Detail & Related papers (2025-03-06T07:36:06Z)
Leveraging Multimodal-LLMs Assisted by Instance Segmentation for Intelligent Traffic Monitoring [6.648291808015463]
This research leverages the LLaVA visual grounding multimodal large language model (LLM) for traffic monitoring tasks on the real-time Quanser Interactive Lab simulation platform. Cameras placed at multiple urban locations collect real-time images from the simulation, which are fed into the LLaVA model with queries for analysis. The system achieves 84.3% accuracy in recognizing vehicle locations and 76.4% in determining steering direction, outperforming traditional models.
arXiv Detail & Related papers (2025-02-16T23:03:26Z)
From Accidents to Insights: Leveraging Multimodal Data for Scenario-Driven ADS Testing [3.984220091774453]
This paper introduces TRACE, a scenario-based ADS Test case Generation framework for Critical Scenarios. By leveraging multimodal data to extract challenging scenarios from real-world car crash reports, TRACE constructs numerous critical test cases with less data. User feedback reveals that TRACE demonstrates superior scenario reconstruction accuracy, with 77.5% of the scenarios being rated as 'mostly' or 'totally' consistent.
arXiv Detail & Related papers (2025-02-04T05:21:29Z)
TrafficGPT: Towards Multi-Scale Traffic Analysis and Generation with Spatial-Temporal Agent Framework [3.947797359736224]
We have designed a multi-scale traffic generation system, TrafficGPT, using three AI agents to process multi-scale traffic data. TrafficGPT consists of three essential AI agents: 1) a text-to-demand agent to interact with users and extract prediction tasks through texts; 2) a traffic prediction agent that leverages multi-scale traffic data to generate temporal features and similarity; and 3) a suggestion and visualization agent that uses the prediction results to generate suggestions and visualizations.
arXiv Detail & Related papers (2024-05-08T07:48:40Z)
iPLAN: Intent-Aware Planning in Heterogeneous Traffic via Distributed Multi-Agent Reinforcement Learning [57.24340061741223]
We introduce a distributed multi-agent reinforcement learning (MARL) algorithm that can predict trajectories and intents in dense and heterogeneous traffic scenarios. Our approach for intent-aware planning, iPLAN, allows agents to infer nearby drivers' intents solely from their local observations.
arXiv Detail & Related papers (2023-06-09T20:12:02Z)
TARGET: Automated Scenario Generation from Traffic Rules for Testing Autonomous Vehicles via Validated LLM-Guided Knowledge Extraction [8.029974249105443]
TARGET is an end-to-end framework that automatically generates test scenarios from traffic rules. We leverage a Large Language Model (LLM) to extract knowledge from traffic rules. TARGET synthesizes executable scripts to render scenarios in simulation.
arXiv Detail & Related papers (2023-05-10T10:04:08Z)
OpenLane-V2: A Topology Reasoning Benchmark for Unified 3D HD Mapping [84.65114565766596]
We present OpenLane-V2, the first dataset on topology reasoning for traffic scene structure. OpenLane-V2 consists of 2,000 annotated road scenes that describe traffic elements and their correlation to the lanes. We evaluate various state-of-the-art methods, and present their quantitative and qualitative results on OpenLane-V2 to indicate future avenues for investigating topology reasoning in traffic scenes.
arXiv Detail & Related papers (2023-04-20T16:31:22Z)
DeepAccident: A Motion and Accident Prediction Benchmark for V2X Autonomous Driving [76.29141888408265]
We propose a large-scale dataset containing diverse accident scenarios that frequently occur in real-world driving. The proposed DeepAccident dataset includes 57K annotated frames and 285K annotated samples, approximately 7 times more than the large-scale nuScenes dataset.
arXiv Detail & Related papers (2023-04-03T17:37:00Z)
Traffic Scene Parsing through the TSP6K Dataset [109.69836680564616]
We introduce a specialized traffic monitoring dataset, termed TSP6K, with high-quality pixel-level and instance-level annotations. The dataset captures more crowded traffic scenes with several times more traffic participants than the existing driving scenes. We propose a detail refining decoder for scene parsing, which recovers the details of different semantic regions in traffic scenes.
arXiv Detail & Related papers (2023-03-06T02:05:14Z)
End-to-End Intersection Handling using Multi-Agent Deep Reinforcement Learning [63.56464608571663]
Navigating through intersections is one of the main challenging tasks for an autonomous vehicle. In this work, we focus on the implementation of a system able to navigate through intersections where only traffic signs are provided. We propose a multi-agent system using a continuous, model-free Deep Reinforcement Learning algorithm used to train a neural network for predicting both the acceleration and the steering angle at each time step.
arXiv Detail & Related papers (2021-04-28T07:54:40Z)

This list is automatically generated from the titles and abstracts of the papers on this site.