Behavioral Safety Assessment towards Large-scale Deployment of Autonomous Vehicles
- URL: http://arxiv.org/abs/2505.16214v2
- Date: Fri, 30 May 2025 04:11:11 GMT
- Title: Behavioral Safety Assessment towards Large-scale Deployment of Autonomous Vehicles
- Authors: Henry X. Liu, Xintao Yan, Haowei Sun, Tinghan Wang, Zhijie Qiao, Haojie Zhu, Shengyin Shen, Shuo Feng, Greg Stevens, Greg McGuire,
- Abstract summary: We propose a paradigm shift toward behavioral safety for autonomous vehicles (AVs). We introduce a third-party AV safety assessment framework comprising two complementary evaluation components: Driver Licensing Test and Driving Intelligence Test. We validated our proposed framework using Autoware.Universe, an open-source Level 4 AV, tested both in simulated environments and on the physical test track at the University of Michigan's Mcity Testing Facility.
- Score: 6.846750893175613
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous vehicles (AVs) have significantly advanced in real-world deployment in recent years, yet safety continues to be a critical barrier to widespread adoption. Traditional functional safety approaches, which primarily verify the reliability, robustness, and adequacy of AV hardware and software systems from a vehicle-centric perspective, do not sufficiently address the AV's broader interactions and behavioral impact on the surrounding traffic environment. To overcome this limitation, we propose a paradigm shift toward behavioral safety, a comprehensive approach focused on evaluating AV responses and interactions within the traffic environment. To systematically assess behavioral safety, we introduce a third-party AV safety assessment framework comprising two complementary evaluation components: the Driver Licensing Test and the Driving Intelligence Test. The Driver Licensing Test evaluates the AV's reactive behaviors under controlled scenarios, ensuring basic behavioral competency. In contrast, the Driving Intelligence Test assesses the AV's interactive behaviors within naturalistic traffic conditions, quantifying the frequency of safety-critical events to deliver statistically meaningful safety metrics before large-scale deployment. We validated our proposed framework using Autoware.Universe, an open-source Level 4 AV, tested both in simulated environments and on the physical test track at the University of Michigan's Mcity Testing Facility. The results indicate that Autoware.Universe passed 6 out of 14 scenarios and exhibited a crash rate of 3.01e-3 crashes per mile, approximately 1,000 times higher than the average human driver crash rate. During the tests, we also uncovered several unknown unsafe scenarios for Autoware.Universe. These findings underscore the necessity of behavioral safety evaluations for improving AV safety performance prior to widespread public deployment.
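As a quick sanity check on the reported figures, the sketch below converts the stated crash rate into miles between crashes and back-computes the implied human-driver baseline; the human figure is only inferred from the stated ~1,000x ratio, and the variable names are chosen here for illustration.

```python
# Rough sanity check of the reported figures (the human baseline is inferred, not reported).
av_crash_rate = 3.01e-3           # crashes per mile, as reported for Autoware.Universe
human_ratio = 1_000               # "approximately 1,000 times higher" than human drivers

miles_per_av_crash = 1 / av_crash_rate              # ~332 miles between crashes
implied_human_rate = av_crash_rate / human_ratio    # ~3.0e-6 crashes per mile (inferred)
miles_per_human_crash = 1 / implied_human_rate      # ~332,000 miles between crashes (inferred)

print(f"AV under test: one crash every {miles_per_av_crash:,.0f} miles")
print(f"Implied human baseline: one crash every {miles_per_human_crash:,.0f} miles")
```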
Related papers
- Test Automation for Interactive Scenarios via Promptable Traffic Simulation [48.240394447516664]
We introduce an automated method to generate realistic and safety-critical human behaviors for AV planner evaluation in interactive scenarios. We parameterize complex human behaviors using low-dimensional goal positions, which are then fed into a promptable traffic simulator, ProSim. To automate test generation, we introduce a prompt generation module that explores the goal domain and efficiently identifies safety-critical behaviors using Bayesian optimization.
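As a rough sketch of that prompt-generation loop, one can run Bayesian optimization over a low-dimensional goal position and score each rollout by a criticality measure; the `rollout_criticality` function, the search bounds, and the use of scikit-optimize are assumptions for illustration, not ProSim's actual interface.

```python
from skopt import gp_minimize  # Bayesian optimization over the low-dimensional goal domain

def rollout_criticality(goal_xy):
    """Placeholder: prompt the traffic simulator with this goal position for the
    background agent, roll out the AV planner, and return a criticality score
    (e.g., negative minimum time-to-collision). A toy surface stands in here."""
    x, y = goal_xy
    return -((x - 3.0) ** 2 + (y + 1.5) ** 2)

# Maximize criticality by minimizing its negative over the assumed goal bounds.
result = gp_minimize(
    func=lambda g: -rollout_criticality(g),
    dimensions=[(-10.0, 10.0), (-10.0, 10.0)],  # assumed (x, y) goal-position bounds
    n_calls=30,
    random_state=0,
)
print("Most safety-critical goal position found:", result.x)
```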
arXiv Detail & Related papers (2025-06-01T22:29:32Z)
- SafeAgent: Safeguarding LLM Agents via an Automated Risk Simulator [77.86600052899156]
Large Language Model (LLM)-based agents are increasingly deployed in real-world applications. We propose AutoSafe, the first framework that systematically enhances agent safety through fully automated synthetic data generation. We show that AutoSafe boosts safety scores by 45% on average and achieves a 28.91% improvement on real-world tasks.
arXiv Detail & Related papers (2025-05-23T10:56:06Z)
- Impact Analysis of Inference Time Attack of Perception Sensors on Autonomous Vehicles [11.693109854958479]
We propose an impact analysis based on inference time attacks for autonomous vehicles. We demonstrate in a simulation system that such inference time attacks can also threaten the safety of both the ego vehicle and other traffic participants.
arXiv Detail & Related papers (2025-05-05T23:00:27Z)
- Generating Critical Scenarios for Testing Automated Driving Systems [5.975915967339764]
AVASTRA is a Reinforcement Learning-based approach to generate realistic critical scenarios for testing Autonomous Driving Systems. Results show AVASTRA's ability to outperform the state-of-the-art approach by generating 30% to 115% more collision scenarios.
arXiv Detail & Related papers (2024-12-03T16:59:30Z)
- Work-in-Progress: Crash Course: Can (Under Attack) Autonomous Driving Beat Human Drivers? [60.51287814584477]
This paper evaluates the inherent risks in autonomous driving by examining the current landscape of AVs.
We develop specific claims highlighting the delicate balance between the advantages of AVs and potential security challenges in real-world scenarios.
arXiv Detail & Related papers (2024-05-14T09:42:21Z)
- A novel framework for adaptive stress testing of autonomous vehicles in highways [3.2112502548606825]
We propose a novel framework to explore corner cases that can result in safety concerns in a highway traffic scenario.
We develop a new reward function for DRL to guide the AST in identifying crash scenarios based on the collision probability estimate.
The proposed framework is further integrated with a new driving model enabling us to create more realistic traffic scenarios.
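The abstract does not spell out the reward function mentioned above, so the following is only a sketch of what a collision-probability-guided AST reward could look like, with all weights and function names assumed.

```python
def ast_step_reward(collision_prob, disturbance_log_prob, crashed, done):
    """Illustrative adaptive-stress-testing reward (assumed form, not the paper's):
    reward disturbances that raise the estimated collision probability while keeping
    them plausible under the driving model, plus terminal crash/no-crash terms."""
    reward = collision_prob                    # dense guidance toward near-collision states
    reward += 0.1 * disturbance_log_prob       # plausibility term (assumed weight)
    if done:
        reward += 100.0 if crashed else -10.0  # terminal bonus for a crash, penalty otherwise
    return reward
```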
arXiv Detail & Related papers (2024-02-19T04:02:40Z)
- A Safety-Adapted Loss for Pedestrian Detection in Automated Driving [13.676179470606844]
In safety-critical domains, errors by the object detector may endanger pedestrians and other vulnerable road users.
We propose a safety-aware loss variation that leverages the estimated per-pedestrian criticality scores during training.
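A minimal PyTorch-style sketch of that idea, assuming a per-pedestrian criticality score in [0, 1] is available at training time; this is an interpretation of the abstract, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def safety_adapted_loss(pred_logits, targets, criticality):
    """Weight each pedestrian's classification loss by its estimated criticality so that
    errors on safety-critical pedestrians contribute more to training.
    pred_logits: (N, C) detector class scores; targets: (N,) labels;
    criticality: (N,) scores in [0, 1] (the 1 + criticality weighting is an assumption)."""
    per_sample = F.cross_entropy(pred_logits, targets, reduction="none")
    weights = 1.0 + criticality
    return (weights * per_sample).mean()

# Toy usage with random tensors, just to show the expected shapes.
loss = safety_adapted_loss(torch.randn(8, 2), torch.randint(0, 2, (8,)), torch.rand(8))
```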
arXiv Detail & Related papers (2024-02-05T13:16:38Z)
- CAT: Closed-loop Adversarial Training for Safe End-to-End Driving [54.60865656161679]
Closed-loop Adversarial Training (CAT) is a framework for safe end-to-end driving in autonomous vehicles.
CAT aims to continuously improve the safety of driving agents by training the agent on safety-critical scenarios.
CAT can effectively generate adversarial scenarios countering the agent being trained.
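Schematically, the closed loop amounts to alternating between an adversarial scenario generator and agent training; the sketch below shows only that alternation, with all function names assumed rather than taken from the paper.

```python
def closed_loop_adversarial_training(agent, scenario_generator, rounds=10):
    """Assumed schematic of closed-loop adversarial training: repeatedly search for
    scenarios that are safety-critical against the current agent, then train on them."""
    for _ in range(rounds):
        hard_scenarios = scenario_generator.attack(agent)   # adversary step (assumed API)
        agent.train_on(hard_scenarios)                      # learner step (assumed API)
    return agent
```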
arXiv Detail & Related papers (2023-10-19T02:49:31Z)
- ASSERT: Automated Safety Scenario Red Teaming for Evaluating the Robustness of Large Language Models [65.79770974145983]
ASSERT, Automated Safety Scenario Red Teaming, consists of three methods -- semantically aligned augmentation, target bootstrapping, and adversarial knowledge injection.
We partition our prompts into four safety domains for a fine-grained analysis of how the domain affects model performance.
We find statistically significant performance differences of up to 11% in absolute classification accuracy among semantically related scenarios and error rates of up to 19% absolute error in zero-shot adversarial settings.
arXiv Detail & Related papers (2023-10-14T17:10:28Z)
- A Counterfactual Safety Margin Perspective on the Scoring of Autonomous Vehicles' Riskiness [52.27309191283943]
This paper presents a data-driven framework for assessing the risk of different AVs' behaviors.
We propose the notion of counterfactual safety margin, which represents the minimum deviation from nominal behavior that could cause a collision.
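Read literally, that definition can be written as a smallest-perturbation problem; the formalization below is only an interpretation of the sentence above, with notation chosen here rather than taken from the paper.

```latex
% Interpreted formalization (notation assumed, not the paper's):
% \tau_{\mathrm{nom}} is the AV's nominal behavior and \delta a deviation applied to it.
\[
  m(\tau_{\mathrm{nom}}) \;=\; \min_{\delta} \ \lVert \delta \rVert
  \quad \text{subject to} \quad
  \mathrm{collision}\!\left(\tau_{\mathrm{nom}} \oplus \delta\right) = 1 ,
\]
% i.e., the minimum deviation from nominal behavior that already produces a collision;
% a larger margin m indicates safer behavior.
```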
arXiv Detail & Related papers (2023-08-02T09:48:08Z)