Safety Shielding under Delayed Observation
- URL: http://arxiv.org/abs/2307.02164v1
- Date: Wed, 5 Jul 2023 10:06:10 GMT
- Title: Safety Shielding under Delayed Observation
- Authors: Filip Cano Córdoba, Alexander Palmisano, Martin Fränzle, Roderick
Bloem, Bettina Könighofer
- Abstract summary: Shields are correct-by-construction enforcers that guarantee safe execution.
Shields should pick safe corrective actions in a way that is likely to minimize future interference.
We present the first integration of shields in a realistic driving simulator.
- Score: 59.86192283565134
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Agents operating in physical environments must be able to handle
delays in their input and output signals, since data transmission, sensing,
and actuation are not instantaneous. Shields are
correct-by-construction runtime enforcers that guarantee safe execution by
correcting any action that may cause a violation of a formal safety
specification. Besides providing safety guarantees, shields should interfere
minimally with the agent: they should pick safe corrective actions in a way
that is likely to minimize future interference.
Current shielding approaches do not consider possible delays in the input
signals in their safety analyses. In this paper, we address this issue. We
propose synthesis algorithms to compute delay-resilient shields that
guarantee safety under worst-case assumptions on the delays of the input
signals. We also introduce novel heuristics for deciding between multiple
corrective actions, designed to minimize future shield interference caused by
delays. As a further contribution, we present the first integration of shields
in a realistic driving simulator. We implemented our delayed shields in the
driving simulator CARLA. We shield potentially unsafe autonomous
driving agents in different safety-critical scenarios and show the effect of
delays on the safety analysis.
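To make the idea concrete, the following is a minimal sketch (not the paper's algorithm): under a worst-case reading of the delay, the shield tracks every state consistent with the stale observation and the actions issued since, and lets an action through only if it is safe from all of them. The toy world, action set, and lookahead horizon are invented for illustration.

```python
# Minimal sketch of a delay-resilient shield on a toy 1-D world:
# the agent moves on integer positions and must stay inside [0, 10].
SAFE = set(range(0, 11))
ACTIONS = (-1, 0, 1)  # move left, stay, move right

def succ(state, action):
    """Deterministic transition; a nondeterministic model would
    return a set of successors here."""
    return state + action

def possible_now(obs, pending):
    """States consistent with an observation that is len(pending)
    steps old, given the actions issued since it was taken."""
    states = {obs}
    for a in pending:
        states = {succ(s, a) for s in states}
    return states

def winning(state, horizon=2):
    """Worst-case safety check: some action sequence of length
    `horizon` keeps every visited state inside SAFE."""
    if state not in SAFE:
        return False
    if horizon == 0:
        return True
    return any(winning(succ(state, a), horizon - 1) for a in ACTIONS)

def shield(obs, pending, proposed):
    """Pass `proposed` through only if it is safe from EVERY state
    the system might currently be in; otherwise substitute a safe
    action (interference-minimizing tie-breaking omitted)."""
    now = possible_now(obs, pending)
    def ok(a):
        return all(winning(succ(s, a)) for s in now)
    if ok(proposed):
        return proposed
    for a in ACTIONS:
        if ok(a):
            return a
    raise RuntimeError("no safe action available")
```

For example, if the true state is 10 but the shield only sees a two-step-old observation of 9 plus the two actions issued since, `shield(9, (1, 0), 1)` rejects the proposed move right (which could leave [0, 10]) and returns `-1` instead.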
Related papers
- Synthesizing Efficient and Permissive Programmatic Runtime Shields for Neural Policies [7.831197018945118]
We propose a novel framework that synthesizes lightweight and permissive programmatic runtime shields for neural policies.
Aegis achieves this by formulating the seeking of a runtime shield as a sketch-based program synthesis problem.
Compared to the current state-of-the-art, Aegis's shields exhibit a 2.1× reduction in time overhead and a 4.4× reduction in memory usage.
arXiv Detail & Related papers (2024-10-08T02:44:55Z)
- Realizable Continuous-Space Shields for Safe Reinforcement Learning [13.728961635717134]
Deep Reinforcement Learning (DRL) remains vulnerable to occasional catastrophic failures without additional safeguards.
One effective solution is to use a shield that validates and adjusts the agent's actions to ensure compliance with a provided set of safety specifications.
We propose the first shielding approach to automatically guarantee the realizability of safety requirements for continuous state and action spaces.
arXiv Detail & Related papers (2024-10-02T21:08:11Z)
- Safety Margins for Reinforcement Learning [53.10194953873209]
We show how to leverage proxy criticality metrics to generate safety margins.
We evaluate our approach on learned policies from APE-X and A3C within an Atari environment.
arXiv Detail & Related papers (2023-07-25T16:49:54Z)
- Approximate Shielding of Atari Agents for Safe Exploration [83.55437924143615]
We propose a principled algorithm for safe exploration based on the concept of shielding.
We present preliminary results that show our approximate shielding algorithm effectively reduces the rate of safety violations.
arXiv Detail & Related papers (2023-04-21T16:19:54Z)
- Online Shielding for Reinforcement Learning [59.86192283565134]
We propose an approach for online safety shielding of RL agents.
During runtime, the shield analyses the safety of each available action.
Based on this probability and a given threshold, the shield decides whether to block an action from the agent.
arXiv Detail & Related papers (2022-12-04T16:00:29Z)
- Sample-Efficient Safety Assurances using Conformal Prediction [57.92013073974406]
Early warning systems can provide alerts when an unsafe situation is imminent.
To reliably improve safety, these warning systems should have a provable false negative rate.
We present a framework that combines a statistical inference technique known as conformal prediction with a simulator of robot/environment dynamics.
arXiv Detail & Related papers (2021-09-28T23:00:30Z)
- It's Time to Play Safe: Shield Synthesis for Timed Systems [53.796331564067835]
We show how to synthesize timed shields from timed safety properties given as timed automata.
A timed shield enforces the safety of a running system while interfering with the system as little as possible.
arXiv Detail & Related papers (2020-06-30T11:21:42Z)
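The conformal-prediction entry in the list above rests on a standard split-conformal calibration step that can be sketched in a few lines. The "danger score" framing, the function name, and the alarm rule below are a generic construction under an exchangeability assumption, not code from that paper.

```python
import math

def alarm_threshold(cal_scores, alpha=0.1):
    """Calibrate a warning threshold on a scalar 'danger score' so
    that an unsafe situation fails to raise an alarm (a false
    negative) with probability at most alpha.

    cal_scores: danger scores recorded in calibration runs that
    actually ended unsafely. Assumes calibration and future scores
    are exchangeable (split conformal prediction)."""
    n = len(cal_scores)
    # Finite-sample-corrected lower quantile: a fresh unsafe score
    # falls strictly below the k-th smallest calibration score with
    # probability at most alpha.
    k = n + 1 - math.ceil((n + 1) * (1 - alpha))
    if k < 1:
        raise ValueError("too few calibration scores for this alpha")
    return sorted(cal_scores)[k - 1]

# Runtime alarm rule: warn whenever the current danger score is at
# or above the calibrated threshold.
```

With 100 calibration scores 1..100 and alpha = 0.1, the threshold is the 10th smallest score, so a fresh unsafe run falls below it (no alarm) with probability at most roughly 10%.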
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.