Positive Trust Balance for Self-Driving Car Deployment
- URL: http://arxiv.org/abs/2009.05801v1
- Date: Sat, 12 Sep 2020 14:23:47 GMT
- Title: Positive Trust Balance for Self-Driving Car Deployment
- Authors: Philip Koopman, Michael Wagner
- Abstract summary: The decision about when self-driving cars are ready to deploy is likely to be made with insufficient lagging metric data.
A Positive Trust Balance approach can help with making a responsible deployment decision.
- Score: 3.106768467227812
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The crucial decision about when self-driving cars are ready to deploy is
likely to be made with insufficient lagging metric data to provide high
confidence in an acceptable safety outcome. A Positive Trust Balance approach
can help with making a responsible deployment decision despite this
uncertainty. With this approach, a reasonable initial expectation of safety is
based on a combination of a practicable amount of testing, engineering rigor,
safety culture, and a strong commitment to use post-deployment operational
feedback to further reduce uncertainty. This can enable faster deployment than
would be required by more traditional safety approaches by reducing the
confidence necessary at time of deployment in exchange for a more stringent
requirement for Safety Performance Indicator (SPI) field feedback in the
context of a strong safety culture.
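The post-deployment feedback loop described in the abstract can be sketched as a simple Safety Performance Indicator (SPI) monitor: field data is compared against pre-deployment expectations, and any exceedance triggers a review. This is a minimal illustrative sketch, not the paper's method; the SPI names, thresholds, and per-mile normalization are all assumptions.

```python
# Illustrative sketch of post-deployment SPI monitoring.
# SPI names, thresholds, and units are hypothetical examples.

from dataclasses import dataclass

@dataclass
class SPI:
    name: str
    threshold: float  # maximum acceptable rate (events per 1,000 miles)

def evaluate_spis(spis, field_counts, miles_driven):
    """Return SPIs whose observed field rate exceeds the threshold.

    A violation signals that the initial expectation of safety
    needs revisiting before continued deployment.
    """
    violations = []
    for spi in spis:
        rate = field_counts.get(spi.name, 0) / (miles_driven / 1000.0)
        if rate > spi.threshold:
            violations.append((spi.name, rate))
    return violations

spis = [SPI("hard_braking", 0.5), SPI("lane_excursion", 0.1)]
counts = {"hard_braking": 12, "lane_excursion": 1}
print(evaluate_spis(spis, counts, miles_driven=10_000))  # -> [('hard_braking', 1.2)]
```

The key design point is that the monitor encodes the "more stringent requirement for SPI field feedback": thresholds are fixed before deployment, and field operation either confirms or refutes the initial safety expectation.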
Related papers
- Safety through Permissibility: Shield Construction for Fast and Safe Reinforcement Learning [57.84059344739159]
"Shielding" is a popular technique to enforce safety in Reinforcement Learning (RL).
We propose a new permissibility-based framework to deal with safety and shield construction.
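The shielding idea summarized above amounts to intercepting the agent's action and substituting a safe fallback whenever the action is impermissible in the current state. A minimal sketch, assuming a toy permissibility predicate (the framework in the paper is more elaborate):

```python
# Illustrative sketch of an RL action shield. The permissibility
# predicate and the 1-D environment are hypothetical examples.

def shield(state, action, permissible, fallback_action):
    """Return the agent's action if permissible in this state, else the fallback."""
    return action if permissible(state, action) else fallback_action

# Toy example: on a 1-D track with positions 0..9, stepping off
# either edge is impermissible.
def permissible(state, action):
    return 0 <= state + action <= 9

print(shield(9, +1, permissible, fallback_action=0))  # unsafe -> 0
print(shield(5, +1, permissible, fallback_action=0))  # safe   -> 1
```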
arXiv Detail & Related papers (2024-05-29T18:00:21Z) - Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis [63.532413807686524]
This paper addresses the problem of maintaining safety during training in Reinforcement Learning (RL)
We propose a new architecture that handles the trade-off between efficient progress and safety during exploration.
arXiv Detail & Related papers (2023-12-18T16:09:43Z) - Searching for Optimal Runtime Assurance via Reachability and Reinforcement Learning [2.422636931175853]
A runtime assurance system (RTA) for a given plant enables the exercise of an untrusted or experimental controller while assuring safety with a backup controller.
Existing RTA design strategies are well-known to be overly conservative and, in principle, can lead to safety violations.
In this paper, we formulate the optimal RTA design problem and present a new approach for solving it.
arXiv Detail & Related papers (2023-10-06T14:45:57Z) - Safety Margins for Reinforcement Learning [74.13100479426424]
We show how to leverage proxy criticality metrics to generate safety margins.
We evaluate our approach on learned policies from APE-X and A3C within an Atari environment.
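The idea of turning a proxy criticality metric into a safety margin can be sketched as the slack between the current criticality value and the level at which intervention is assumed to be needed. This is an illustrative sketch only; the metric, threshold, and warning level are assumptions, not the paper's definitions.

```python
# Illustrative sketch of safety margins from a proxy criticality
# metric. Threshold and warning level are hypothetical values.

def safety_margin(criticality, threshold):
    """Slack between current criticality and the intervention level; 0 means none."""
    return max(0.0, threshold - criticality)

def needs_attention(criticalities, threshold=0.8, warn_margin=0.2):
    """Indices of states whose margin has shrunk below the warning level."""
    return [i for i, c in enumerate(criticalities)
            if safety_margin(c, threshold) < warn_margin]

print(needs_attention([0.1, 0.65, 0.9]))  # -> [1, 2]
```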
arXiv Detail & Related papers (2023-07-25T16:49:54Z) - Did You Mean...? Confidence-based Trade-offs in Semantic Parsing [52.28988386710333]
We show how a calibrated model can help balance common trade-offs in task-oriented parsing.
We then examine how confidence scores can help optimize the trade-off between usability and safety.
arXiv Detail & Related papers (2023-03-29T17:07:26Z) - Optimal Transport Perturbations for Safe Reinforcement Learning with Robustness Guarantees [14.107064796593225]
We introduce a safe reinforcement learning framework that incorporates robustness through the use of an optimal transport cost uncertainty set.
In experiments on continuous control tasks with safety constraints, our approach demonstrates robust performance while significantly improving safety at deployment time.
arXiv Detail & Related papers (2023-01-31T02:39:52Z) - ISAACS: Iterative Soft Adversarial Actor-Critic for Safety [0.9217021281095907]
This work introduces a novel approach enabling scalable synthesis of robust safety-preserving controllers for robotic systems.
A safety-seeking fallback policy is co-trained with an adversarial "disturbance" agent that aims to invoke the worst-case realization of model error.
While the learned control policy does not intrinsically guarantee safety, it is used to construct a real-time safety filter.
arXiv Detail & Related papers (2022-12-06T18:53:34Z) - Safe Reinforcement Learning via Confidence-Based Filters [78.39359694273575]
We develop a control-theoretic approach for certifying state safety constraints for nominal policies learned via standard reinforcement learning techniques.
We provide formal safety guarantees, and empirically demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2022-07-04T11:43:23Z) - PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures [65.36234499099294]
We propose a new data augmentation strategy utilizing the natural structural complexity of pictures such as fractals.
arXiv Detail & Related papers (2021-12-09T18:59:31Z) - Bootstrapping confidence in future safety based on past safe operation [0.0]
We show an approach to gaining confidence that the probability of causing accidents is low enough during the early phases of operation.
This formalises the common approach of operating a system on a limited basis in the hope that mishap-free operation will confirm one's confidence in its safety.
arXiv Detail & Related papers (2021-10-20T18:36:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.