Interpreting Safety Outcomes: Waymo's Performance Evaluation in the
Context of a Broader Determination of Safety Readiness
- URL: http://arxiv.org/abs/2306.14923v1
- Date: Fri, 23 Jun 2023 14:26:40 GMT
- Authors: Francesca M. Favaro, Trent Victor, Henning Hohnhold, Scott Schnelle
- Abstract summary: This paper highlights the need for a diversified approach to safety determination that complements the analysis of observed safety outcomes with other estimation techniques.
Our discussion highlights: the presentation of a "credibility paradox" within the comparison between ADS crash data and human-derived baselines, the recognition of continuous confidence growth through in-use monitoring, and the need to supplement any aggregate statistical analysis with appropriate event-level reasoning.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper frames recent publications from Waymo within the broader context
of the safety readiness determination for an Automated Driving System (ADS).
Starting from a brief overview of safety performance outcomes reported by Waymo
(i.e., contact events experienced during fully autonomous operations), this
paper highlights the need for a diversified approach to safety determination
that complements the analysis of observed safety outcomes with other estimation
techniques. Our discussion highlights: the presentation of a "credibility
paradox" within the comparison between ADS crash data and human-derived
baselines; the recognition of continuous confidence growth through in-use
monitoring; and the need to supplement any aggregate statistical analysis with
appropriate event-level reasoning.
Related papers
- FREA: Feasibility-Guided Generation of Safety-Critical Scenarios with Reasonable Adversariality [13.240598841087841]
We introduce FREA, a novel method for generating safety-critical scenarios.
It incorporates the Largest Feasible Region (LFR) of the AV as guidance to ensure the reasonableness of the adversarial scenarios.
Experiments show that FREA can effectively generate safety-critical scenarios, yielding considerable near-miss events.
arXiv Detail & Related papers (2024-06-05T06:26:15Z)
- The Art of Defending: A Systematic Evaluation and Analysis of LLM Defense Strategies on Safety and Over-Defensiveness [56.174255970895466]
Large Language Models (LLMs) play an increasingly pivotal role in natural language processing applications.
This paper presents the Safety and Over-Defensiveness Evaluation (SODE) benchmark.
arXiv Detail & Related papers (2023-12-30T17:37:06Z)
- Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis [63.532413807686524]
This paper addresses the problem of maintaining safety during training in Reinforcement Learning (RL).
We propose a new architecture that handles the trade-off between efficient progress and safety during exploration.
arXiv Detail & Related papers (2023-12-18T16:09:43Z)
- ASSERT: Automated Safety Scenario Red Teaming for Evaluating the Robustness of Large Language Models [65.79770974145983]
ASSERT, Automated Safety Scenario Red Teaming, consists of three methods -- semantically aligned augmentation, target bootstrapping, and adversarial knowledge injection.
We partition our prompts into four safety domains for a fine-grained analysis of how the domain affects model performance.
We find statistically significant performance differences of up to 11% in absolute classification accuracy among semantically related scenarios and error rates of up to 19% absolute error in zero-shot adversarial settings.
arXiv Detail & Related papers (2023-10-14T17:10:28Z)
- A Counterfactual Safety Margin Perspective on the Scoring of Autonomous Vehicles' Riskiness [52.27309191283943]
This paper presents a data-driven framework for assessing the risk of different AVs' behaviors.
We propose the notion of counterfactual safety margin, which represents the minimum deviation from nominal behavior that could cause a collision.
arXiv Detail & Related papers (2023-08-02T09:48:08Z)
- Towards Safer Generative Language Models: A Survey on Safety Risks, Evaluations, and Improvements [76.80453043969209]
This survey presents a framework for safety research pertaining to large models.
We begin by introducing safety issues of wide concern, then delve into safety evaluation methods for large models.
We explore the strategies for enhancing large model safety from training to deployment.
arXiv Detail & Related papers (2023-02-18T09:32:55Z)
- Safety Analysis of Autonomous Driving Systems Based on Model Learning [16.38592243376647]
We present a practical verification method for safety analysis of the autonomous driving system (ADS).
The main idea is to build a surrogate model that quantitatively depicts the behaviour of an ADS in the specified traffic scenario.
We demonstrate the utility of the proposed approach by evaluating safety properties on a state-of-the-art ADS from the literature.
arXiv Detail & Related papers (2022-11-23T06:52:40Z)
- Architectural patterns for handling runtime uncertainty of data-driven models in safety-critical perception [1.7616042687330642]
We present additional architectural patterns for handling uncertainty estimation.
We evaluate the four patterns qualitatively and quantitatively with respect to safety and performance gains.
We conclude that the consideration of context information of the driving situation makes it possible to accept more or less uncertainty depending on the inherent risk of the situation.
arXiv Detail & Related papers (2022-06-14T13:31:36Z)
- Towards the Unification and Data-Driven Synthesis of Autonomous Vehicle Safety Concepts [31.13851159912757]
We advocate for the use of Hamilton-Jacobi (HJ) reachability as a unifying mathematical framework for comparing existing safety concepts.
We show that existing predominant safety concepts can be embedded in the HJ reachability framework, thereby enabling a common language for comparing and contrasting modeling assumptions.
arXiv Detail & Related papers (2021-07-30T03:16:48Z)
- SAMBA: Safe Model-Based & Active Reinforcement Learning [59.01424351231993]
SAMBA is a framework for safe reinforcement learning that combines aspects from probabilistic modelling, information theory, and statistics.
We evaluate our algorithm on a variety of safe dynamical system benchmarks involving both low and high-dimensional state representations.
We provide intuition as to the effectiveness of the framework by a detailed analysis of our active metrics and safety constraints.
arXiv Detail & Related papers (2020-06-12T10:40:46Z)