Assessing Safety-Critical Systems from Operational Testing: A Study on
Autonomous Vehicles
- URL: http://arxiv.org/abs/2008.09510v1
- Date: Wed, 19 Aug 2020 19:50:56 GMT
- Title: Assessing Safety-Critical Systems from Operational Testing: A Study on
Autonomous Vehicles
- Authors: Xingyu Zhao, Kizito Salako, Lorenzo Strigini, Valentin Robu, David
Flynn
- Abstract summary: Demonstrating high reliability and safety for safety-critical systems (SCSs) remains a hard problem.
We use Autonomous Vehicles (AVs) as a current example to revisit the problem of demonstrating high reliability.
- Score: 3.629865579485447
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Context: Demonstrating high reliability and safety for safety-critical
systems (SCSs) remains a hard problem. Diverse evidence needs to be combined in
a rigorous way: in particular, results of operational testing with other
evidence from design and verification. Growing use of machine learning in SCSs,
by precluding most established methods for gaining assurance, makes operational
testing even more important for supporting safety and reliability claims.
Objective: We use Autonomous Vehicles (AVs) as a current example to revisit the
problem of demonstrating high reliability. AVs are making their debut on public
roads: methods for assessing whether an AV is safe enough are urgently needed.
We demonstrate how to answer 5 questions that would arise in assessing an AV
type, starting with those proposed by a highly-cited study. Method: We apply
new theorems extending Conservative Bayesian Inference (CBI), which exploit the
rigour of Bayesian methods while reducing the risk of involuntary misuse
associated with now-common applications of Bayesian inference; we define
additional conditions needed for applying these methods to AVs. Results: Prior
knowledge can bring substantial advantages if the AV design allows strong
expectations of safety before road testing. We also show how naive attempts at
conservative assessment may lead to over-optimism instead; why extrapolating
the trend of disengagements is not suitable for safety claims; and how to use knowledge
that an AV has moved to a less stressful environment. Conclusion: While some
reliability targets will remain too high to be practically verifiable, CBI
removes a major source of doubt: it allows use of prior knowledge without
inducing dangerously optimistic biases. For certain ranges of required
reliability and prior beliefs, CBI thus supports feasible, sound arguments.
Useful conservative claims can be derived from limited prior knowledge.
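
As a rough illustration of the kind of conservative reasoning CBI formalises, the sketch below computes a worst-case posterior confidence bound under simplifying assumptions: each mile is treated as an independent Bernoulli trial, and prior knowledge is expressed only as a confidence theta that the probability of a fatal event per mile (pfm) is below some engineering bound epsilon. The function name and parameters are illustrative, not the paper's notation, and the paper's theorems are more general.

```python
# A rough sketch (not the paper's exact theorems) of a Conservative Bayesian
# Inference (CBI) style bound. Prior knowledge is only the constraint
# P(pfm <= epsilon) >= theta on the probability of a fatal event per mile.
# We ask: over all priors satisfying that constraint, what is the smallest
# possible posterior confidence P(pfm <= p_req | n failure-free miles),
# treating each mile as an independent Bernoulli trial?

def conservative_posterior_confidence(theta, epsilon, p_req, n_miles):
    """Worst-case posterior confidence in the claim pfm <= p_req.

    theta   -- prior confidence that pfm <= epsilon (requires epsilon <= p_req)
    epsilon -- engineering bound on the probability of a fatal event per mile
    p_req   -- required bound on pfm for the safety claim
    n_miles -- number of failure-free miles observed
    """
    assert 0.0 < epsilon <= p_req < 1.0 and 0.0 < theta < 1.0
    # The minimising prior concentrates mass theta at pfm = epsilon and the
    # remaining mass (1 - theta) just above p_req, so the bound reduces to a
    # ratio of the two corresponding likelihood terms.
    good = theta * (1.0 - epsilon) ** n_miles
    bad = (1.0 - theta) * (1.0 - p_req) ** n_miles
    return good / (good + bad)


if __name__ == "__main__":
    # Hypothetical numbers: 90% prior confidence that pfm <= 1e-9, claiming
    # pfm <= 1e-7 after ten million failure-free miles.
    print(conservative_posterior_confidence(0.9, 1e-9, 1e-7, 10_000_000))
```

With these illustrative numbers, ten million failure-free miles lift the worst-case confidence in the claim pfm <= 1e-7 from the prior 0.9 to roughly 0.96, showing how strong prior expectations of safety make road-testing evidence go further.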
Related papers
- Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress? [59.96471873997733]
We propose an empirical foundation for developing more meaningful safety metrics and define AI safety in a machine learning research context.
We aim to provide a more rigorous framework for AI safety research, advancing the science of safety evaluations and clarifying the path towards measurable progress.
arXiv Detail & Related papers (2024-07-31T17:59:24Z)
- FREA: Feasibility-Guided Generation of Safety-Critical Scenarios with Reasonable Adversariality [13.240598841087841]
We introduce FREA, a novel safety-critical scenarios generation method that incorporates the Largest Feasible Region (LFR) of AV as guidance.
Experiments illustrate that FREA can effectively generate safety-critical scenarios, yielding considerable near-miss events.
arXiv Detail & Related papers (2024-06-05T06:26:15Z)
- Revisiting Confidence Estimation: Towards Reliable Failure Prediction [53.79160907725975]
We find a general, widely existing but actually-neglected phenomenon that most confidence estimation methods are harmful for detecting misclassification errors.
We propose to enlarge the confidence gap by finding flat minima, which yields state-of-the-art failure prediction performance.
arXiv Detail & Related papers (2024-03-05T11:44:14Z)
- The Art of Defending: A Systematic Evaluation and Analysis of LLM Defense Strategies on Safety and Over-Defensiveness [56.174255970895466]
Large Language Models (LLMs) play an increasingly pivotal role in natural language processing applications.
This paper presents Safety and Over-Defensiveness Evaluation (SODE) benchmark.
arXiv Detail & Related papers (2023-12-30T17:37:06Z)
- Planning Reliability Assurance Tests for Autonomous Vehicles [5.590179847470922]
One important application of AI technology is the development of autonomous vehicles (AVs).
To plan for an assurance test, one needs to determine how many AVs need to be tested for how many miles, and the standard for passing the test; a classical zero-failure version of this calculation is sketched after this list.
This paper develops statistical methods for planning AV reliability assurance tests based on recurrent events data.
arXiv Detail & Related papers (2023-11-30T20:48:20Z)
- A Counterfactual Safety Margin Perspective on the Scoring of Autonomous Vehicles' Riskiness [52.27309191283943]
This paper presents a data-driven framework for assessing the risk of different AVs' behaviors.
We propose the notion of counterfactual safety margin, which represents the minimum deviation from nominal behavior that could cause a collision.
arXiv Detail & Related papers (2023-08-02T09:48:08Z)
- Provable Safe Reinforcement Learning with Binary Feedback [62.257383728544006]
We consider the problem of provably safe RL when given access to an offline oracle providing binary feedback on the safety of state-action pairs.
We provide a novel meta algorithm, SABRE, which can be applied to any MDP setting given access to a blackbox PAC RL algorithm for that setting.
arXiv Detail & Related papers (2022-10-26T05:37:51Z)
- Demonstrating Software Reliability using Possibly Correlated Tests: Insights from a Conservative Bayesian Approach [2.152298082788376]
We formalise informal notions of "doubting" that the executions are independent.
We develop techniques that reveal the extent to which independence assumptions can undermine conservatism in assessments.
arXiv Detail & Related papers (2022-08-16T20:27:47Z)
- Bootstrapping confidence in future safety based on past safe operation [0.0]
We present an approach to claiming confidence in a low enough probability of causing accidents, based on the early phases of operation.
This formalises the common approach of operating a system on a limited basis in the hope that mishap-free operation will confirm one's confidence in its safety.
arXiv Detail & Related papers (2021-10-20T18:36:23Z)
- Conservative Safety Critics for Exploration [120.73241848565449]
We study the problem of safe exploration in reinforcement learning (RL).
We learn a conservative safety estimate of environment states through a critic.
We show that the proposed approach can achieve competitive task performance while incurring significantly lower catastrophic failure rates.
arXiv Detail & Related papers (2020-10-27T17:54:25Z)
- A Comparison of Uncertainty Estimation Approaches in Deep Learning Components for Autonomous Vehicle Applications [0.0]
A key factor for ensuring safety in Autonomous Vehicles (AVs) is to avoid abnormal behaviors under undesirable and unpredicted circumstances.
Different methods for uncertainty quantification have recently been proposed to measure the inevitable sources of error in data and models.
However, these methods require a higher computational load, a higher memory footprint, and introduce extra latency, which can be prohibitive in safety-critical applications.
arXiv Detail & Related papers (2020-06-26T18:55:10Z)
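
The test-planning question raised in the "Planning Reliability Assurance Tests" entry above can be illustrated with a classical zero-failure demonstration calculation. This is a textbook sketch, not the recurrent-events method developed in that paper; the fleet-size parameter is a hypothetical convenience for the example.

```python
import math

# Textbook zero-failure demonstration plan (an illustrative assumption, not the
# recurrent-events method of the paper above). Fatal events are modelled as a
# Poisson process with rate lam per mile; observing zero events over T miles
# supports the claim lam <= lam_req with confidence C when
# exp(-lam_req * T) <= 1 - C, i.e. T >= -ln(1 - C) / lam_req.

def miles_required(lam_req, confidence):
    """Failure-free miles needed to demonstrate a rate <= lam_req at the given confidence."""
    return -math.log(1.0 - confidence) / lam_req


def miles_per_vehicle(lam_req, confidence, fleet_size):
    """Split the required mileage evenly across a hypothetical test fleet."""
    return miles_required(lam_req, confidence) / fleet_size


if __name__ == "__main__":
    # e.g. demonstrating fewer than one fatality per 1e8 miles with 95% confidence
    total = miles_required(1e-8, 0.95)
    print(f"total failure-free miles needed: {total:.3e}")       # about 3.0e8
    print(f"miles per vehicle (fleet of 100): {miles_per_vehicle(1e-8, 0.95, 100):.3e}")
```

With these example numbers, demonstrating a fatality rate below 1e-8 per mile at 95% confidence requires roughly 3e8 failure-free miles, which is consistent with the abstract's observation that some reliability targets remain too high to be practically verifiable by testing alone.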
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.