Safety Shielding under Delayed Observation
- URL: http://arxiv.org/abs/2307.02164v1
- Date: Wed, 5 Jul 2023 10:06:10 GMT
- Title: Safety Shielding under Delayed Observation
- Authors: Filip Cano Córdoba, Alexander Palmisano, Martin Fränzle, Roderick
Bloem, Bettina Könighofer
- Abstract summary: Shields are correct-by-construction enforcers that guarantee safe execution.
Shields should pick safe corrective actions in a way that is likely to minimize future interference.
We present the first integration of shields in a realistic driving simulator.
- Score: 59.86192283565134
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Agents operating in physical environments must be able to handle
delays in their input and output signals, since data transmission, sensing,
and actuation are not instantaneous. Shields are
correct-by-construction runtime enforcers that guarantee safe execution by
correcting any action that may cause a violation of a formal safety
specification. Besides providing safety guarantees, shields should interfere
minimally with the agent: they should pick safe corrective actions in a way
that is likely to minimize future interference.
Current shielding approaches do not consider possible delays in the input
signals in their safety analyses. In this paper, we address this issue. We
propose synthesis algorithms to compute delay-resilient shields that
guarantee safety under worst-case assumptions on the delays of the input
signals. We also introduce novel heuristics for deciding between multiple
corrective actions, designed to minimize future shield interference caused by
delays. As a further contribution, we present the first integration of shields
in a realistic driving simulator. We implemented our delayed shields in the
driving simulator CARLA. We shield potentially unsafe autonomous
driving agents in different safety-critical scenarios and show the effect of
delays on the safety analysis.
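To make the idea concrete, the following is a minimal sketch (not the paper's algorithm): under a worst-case reading of the delay, the shield tracks every state consistent with the stale observation and the actions issued since, and lets an action through only if it is safe from all of them. The toy world, action set, and lookahead horizon are invented for illustration.

```python
# Minimal sketch of a delay-resilient shield on a toy 1-D world:
# the agent moves on integer positions and must stay inside [0, 10].
SAFE = set(range(0, 11))
ACTIONS = (-1, 0, 1)  # move left, stay, move right

def succ(state, action):
    """Deterministic transition; a nondeterministic model would
    return a set of successors here."""
    return state + action

def possible_now(obs, pending):
    """States consistent with an observation that is len(pending)
    steps old, given the actions issued since it was taken."""
    states = {obs}
    for a in pending:
        states = {succ(s, a) for s in states}
    return states

def winning(state, horizon=2):
    """Worst-case safety check: some action sequence of length
    `horizon` keeps every visited state inside SAFE."""
    if state not in SAFE:
        return False
    if horizon == 0:
        return True
    return any(winning(succ(state, a), horizon - 1) for a in ACTIONS)

def shield(obs, pending, proposed):
    """Pass `proposed` through only if it is safe from EVERY state
    the system might currently be in; otherwise substitute a safe
    action (interference-minimizing tie-breaking omitted)."""
    now = possible_now(obs, pending)
    def ok(a):
        return all(winning(succ(s, a)) for s in now)
    if ok(proposed):
        return proposed
    for a in ACTIONS:
        if ok(a):
            return a
    raise RuntimeError("no safe action available")
```

For example, if the true state is 10 but the shield only sees a two-step-old observation of 9 plus the two actions issued since, `shield(9, (1, 0), 1)` rejects the proposed move right (which could leave [0, 10]) and returns `-1` instead.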
Related papers
- Synthesizing Efficient and Permissive Programmatic Runtime Shields for Neural Policies [7.831197018945118]
We propose a novel framework that synthesizes lightweight and permissive programmatic runtime shields for neural policies.
Aegis achieves this by formulating the seeking of a runtime shield as a sketch-based program synthesis problem.
Compared to the current state-of-the-art, Aegis's shields exhibit a 2.1× reduction in time overhead and a 4.4× reduction in memory usage.
arXiv Detail & Related papers (2024-10-08T02:44:55Z)
- Realizable Continuous-Space Shields for Safe Reinforcement Learning [13.728961635717134]
Deep Reinforcement Learning (DRL) remains vulnerable to occasional catastrophic failures without additional safeguards.
One effective solution is to use a shield that validates and adjusts the agent's actions to ensure compliance with a provided set of safety specifications.
We propose the first shielding approach to automatically guarantee the realizability of safety requirements for continuous state and action spaces.
arXiv Detail & Related papers (2024-10-02T21:08:11Z)
- Safety Margins for Reinforcement Learning [53.10194953873209]
We show how to leverage proxy criticality metrics to generate safety margins.
We evaluate our approach on learned policies from APE-X and A3C within an Atari environment.
arXiv Detail & Related papers (2023-07-25T16:49:54Z)
- Approximate Shielding of Atari Agents for Safe Exploration [83.55437924143615]
We propose a principled algorithm for safe exploration based on the concept of shielding.
We present preliminary results that show our approximate shielding algorithm effectively reduces the rate of safety violations.
arXiv Detail & Related papers (2023-04-21T16:19:54Z)
- Online Shielding for Reinforcement Learning [59.86192283565134]
We propose an approach for online safety shielding of RL agents.
During runtime, the shield analyses the safety of each available action.
Based on this probability and a given threshold, the shield decides whether to block an action from the agent.
arXiv Detail & Related papers (2022-12-04T16:00:29Z)
- Sample-Efficient Safety Assurances using Conformal Prediction [57.92013073974406]
Early warning systems can provide alerts when an unsafe situation is imminent.
To reliably improve safety, these warning systems should have a provable false negative rate.
We present a framework that combines a statistical inference technique known as conformal prediction with a simulator of robot/environment dynamics.
arXiv Detail & Related papers (2021-09-28T23:00:30Z)
- It's Time to Play Safe: Shield Synthesis for Timed Systems [53.796331564067835]
We show how to synthesize timed shields from timed safety properties given as timed automata.
A timed shield enforces the safety of a running system while interfering with the system as little as possible.
arXiv Detail & Related papers (2020-06-30T11:21:42Z)
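The conformal-prediction entry in the list above rests on a standard split-conformal calibration step that can be sketched in a few lines. The "danger score" framing, the function name, and the alarm rule below are a generic construction under an exchangeability assumption, not code from that paper.

```python
import math

def alarm_threshold(cal_scores, alpha=0.1):
    """Calibrate a warning threshold on a scalar 'danger score' so
    that an unsafe situation fails to raise an alarm (a false
    negative) with probability at most alpha.

    cal_scores: danger scores recorded in calibration runs that
    actually ended unsafely. Assumes calibration and future scores
    are exchangeable (split conformal prediction)."""
    n = len(cal_scores)
    # Finite-sample-corrected lower quantile: a fresh unsafe score
    # falls strictly below the k-th smallest calibration score with
    # probability at most alpha.
    k = n + 1 - math.ceil((n + 1) * (1 - alpha))
    if k < 1:
        raise ValueError("too few calibration scores for this alpha")
    return sorted(cal_scores)[k - 1]

# Runtime alarm rule: warn whenever the current danger score is at
# or above the calibrated threshold.
```

With 100 calibration scores 1..100 and alpha = 0.1, the threshold is the 10th smallest score, so a fresh unsafe run falls below it (no alarm) with probability at most roughly 10%.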
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.