Related papers: Efficient statistical validation with edge cases to evaluate Highly Automated Vehicles

URL: http://arxiv.org/abs/2003.01886v1
Date: Wed, 4 Mar 2020 04:35:22 GMT
Title: Efficient statistical validation with edge cases to evaluate Highly Automated Vehicles
Authors: Dhanoop Karunakaran, Stewart Worrall, Eduardo Nebot
Abstract summary: The widescale deployment of Autonomous Vehicles seems to be imminent despite many safety challenges that are yet to be resolved. Existing standards focus on deterministic processes where the validation requires only a set of test cases that cover the requirements. This paper presents a new approach to compute the statistical characteristics of a system's behaviour by biasing automatically generated test cases towards the worst case scenarios.
Score: 6.198523595657983
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The widescale deployment of Autonomous Vehicles (AV) seems to be imminent despite many safety challenges that are yet to be resolved. It is well known that there are no universally agreed Verification and Validation (VV) methodologies to guarantee absolute safety, which is crucial for the acceptance of this technology. Existing standards focus on deterministic processes where the validation requires only a set of test cases that cover the requirements. Modern autonomous vehicles will undoubtedly include machine learning and probabilistic techniques that require a much more comprehensive testing regime due to the non-deterministic nature of the operating design domain. A rigorous statistical validation process is an essential component required to address this challenge. Most research in this area focuses on evaluating system performance in large-scale real-world data gathering exercises (number of miles travelled), or randomised test scenarios in simulation. This paper presents a new approach to compute the statistical characteristics of a system's behaviour by biasing automatically generated test cases towards the worst case scenarios, identifying potential unsafe edge cases. We use reinforcement learning (RL) to learn the behaviours of simulated actors that cause unsafe behaviour measured by the well-established RSS safety metric. We demonstrate that by using the method we can more efficiently validate a system using a smaller number of test cases by focusing the simulation towards the worst case scenario, generating edge cases that correspond to unsafe situations.

Related papers

Towards provable probabilistic safety for scalable embodied AI systems [79.31011047593492]
Embodied AI systems are increasingly prevalent across various applications. Ensuring their safety in complex operating environments remains a major challenge. This Perspective offers a pathway toward safer, large-scale adoption of embodied AI systems in safety-critical applications.
arXiv Detail & Related papers (2025-06-05T15:46:25Z)
On the Need for a Statistical Foundation in Scenario-Based Testing of Autonomous Vehicles [4.342427756164555]
This paper argues that a rigorous statistical foundation is essential to address these challenges and enable rigorous safety assurance. By drawing parallels between AV testing and established software testing methods, we identify shared research gaps and reusable solutions. Our analysis reveals that neither scenario-based nor mile-based testing universally outperforms the other.
arXiv Detail & Related papers (2025-05-04T22:06:23Z)
A Domain-Agnostic Scalable AI Safety Ensuring Framework [8.086635708001166]
Current approaches to AI safety typically address domain-specific safety conditions. We propose a novel AI safety framework that ensures AI systems comply with any user-defined constraint. We demonstrate our framework's effectiveness through experiments in diverse domains.
arXiv Detail & Related papers (2025-04-29T16:38:35Z)
Automatically Adaptive Conformal Risk Control [49.95190019041905]
We propose a methodology for achieving approximate conditional control of statistical risks by adapting to the difficulty of test samples. Our framework goes beyond traditional conditional risk control based on user-provided conditioning events to the algorithmic, data-driven determination of appropriate function classes for conditioning.
arXiv Detail & Related papers (2024-06-25T08:29:32Z)
ASSERT: Automated Safety Scenario Red Teaming for Evaluating the Robustness of Large Language Models [65.79770974145983]
ASSERT, Automated Safety Scenario Red Teaming, consists of three methods -- semantically aligned augmentation, target bootstrapping, and adversarial knowledge injection. We partition our prompts into four safety domains for a fine-grained analysis of how the domain affects model performance. We find statistically significant performance differences of up to 11% in absolute classification accuracy among semantically related scenarios and error rates of up to 19% absolute error in zero-shot adversarial settings.
arXiv Detail & Related papers (2023-10-14T17:10:28Z)
Bayesian Safety Validation for Failure Probability Estimation of Black-Box Systems [34.61865848439637]
Estimating the probability of failure is an important step in the certification of safety-critical systems. This work frames the problem of black-box safety validation as a Bayesian optimization problem. The algorithm is designed to search for failures, compute the most-likely failure, and estimate the failure probability over an operating domain.
arXiv Detail & Related papers (2023-05-03T22:22:48Z)
Adaptive Failure Search Using Critical States from Domain Experts [9.93890332477992]
Failure search may be done through logging substantial vehicle miles in either simulation or real world testing. AST is one such method that poses the problem of failure search as a Markov decision process. We show that the incorporation of critical states into the AST framework generates failure scenarios with increased safety violations.
arXiv Detail & Related papers (2023-04-01T18:14:41Z)
Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers. We then present the pointwise feasibility conditions of the resulting safety controller. We use these conditions to devise an event-triggered online data collection strategy.
arXiv Detail & Related papers (2022-08-23T05:02:09Z)
Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning [72.97229770329214]
We introduce a general approach for seeking high-dimensional non-linear optimization problems in which maintaining safety during learning is crucial. Our approach, called LBSGD, is based on applying a logarithmic barrier approximation with a carefully chosen step size. We demonstrate the effectiveness of our approach on minimizing violation in policy tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z)
Generating and Characterizing Scenarios for Safety Testing of Autonomous Vehicles [86.9067793493874]
We propose efficient mechanisms to characterize and generate testing scenarios using a state-of-the-art driving simulator. We use our method to characterize real driving data from the Next Generation Simulation (NGSIM) project. We rank the scenarios by defining metrics based on the complexity of avoiding accidents and provide insights into how the AV could have minimized the probability of incurring an accident.
arXiv Detail & Related papers (2021-03-12T17:00:23Z)
Efficient falsification approach for autonomous vehicle validation using a parameter optimisation technique based on reinforcement learning [6.198523595657983]
The widescale deployment of Autonomous Vehicles (AV) appears to be imminent despite many safety challenges that are yet to be resolved. The uncertainties in the behaviour of the traffic participants and the dynamic world cause reactions in advanced autonomous systems. This paper presents an efficient falsification method to evaluate the System Under Test.
arXiv Detail & Related papers (2020-11-16T02:56:13Z)
Evaluating the Safety of Deep Reinforcement Learning Models using Semi-Formal Verification [81.32981236437395]
We present a semi-formal verification approach for decision-making tasks based on interval analysis. Our method obtains comparable results over standard benchmarks with respect to formal verifiers. Our approach allows efficient evaluation of safety properties for decision-making models in practical applications.
arXiv Detail & Related papers (2020-10-19T11:18:06Z)
Multimodal Safety-Critical Scenarios Generation for Decision-Making Algorithms Evaluation [23.43175124406634]
Existing neural network-based autonomous systems are shown to be vulnerable to adversarial attacks. We propose a flow-based multimodal safety-critical scenario generator for evaluating decision-making algorithms. We evaluate six Reinforcement Learning algorithms with our generated traffic scenarios and provide empirical conclusions about their robustness.
arXiv Detail & Related papers (2020-09-16T15:16:43Z)
Neural Bridge Sampling for Evaluating Safety-Critical Autonomous Systems [34.945482759378734]
We employ a probabilistic approach to safety evaluation in simulation, where we are concerned with computing the probability of dangerous events. We develop a novel rare-event simulation method that combines exploration, exploitation, and optimization techniques to find failure modes and estimate their rate of occurrence.
arXiv Detail & Related papers (2020-08-24T17:46:27Z)

This list is automatically generated from the titles and abstracts of the papers on this site.