It's Time to Play Safe: Shield Synthesis for Timed Systems
- URL: http://arxiv.org/abs/2006.16688v1
- Date: Tue, 30 Jun 2020 11:21:42 GMT
- Title: It's Time to Play Safe: Shield Synthesis for Timed Systems
- Authors: Roderick Bloem, Peter Gjøl Jensen, Bettina Könighofer, Kim
Guldstrand Larsen, Florian Lorber and Alexander Palmisano
- Abstract summary: We show how to synthesize timed shields from timed safety properties given as timed automata.
A timed shield enforces the safety of a running system while interfering with the system as little as possible.
- Score: 53.796331564067835
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Erroneous behaviour in safety-critical real-time systems may have serious
consequences. In this paper, we show how to synthesize timed shields from timed
safety properties given as timed automata. A timed shield enforces the safety
of a running system while interfering with the system as little as possible. We
present timed post-shields and timed pre-shields. A timed pre-shield is placed
before the system and provides a set of safe outputs. This set restricts the
choices of the system. A timed post-shield is implemented after the system. It
monitors the system and corrects the system's output only if necessary. We
further extend the timed post-shield construction to provide a guarantee on the
recovery phase, i.e., the time between a specification violation and the point
at which full control can be handed back to the system. In our experimental
results, we use timed post-shields to ensure safety in a reinforcement
learning setting for controlling a platoon of cars, during both the learning and
execution phases, and study the effect.
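The post-shield idea described above can be illustrated with a minimal sketch. This is a hypothetical interface, not the paper's timed-automata construction: the names `post_shield`, `is_safe`, and `safe_fallback` are illustrative assumptions.

```python
# Minimal sketch of a post-shield wrapper (hypothetical interface): the
# shield monitors the system's output and corrects it only if necessary,
# interfering with the system as little as possible.

def post_shield(system_action, is_safe, safe_fallback):
    """Pass the system's action through unchanged if it is safe;
    otherwise correct it to a known-safe fallback action."""
    if is_safe(system_action):
        return system_action  # no interference
    return safe_fallback      # minimal correction

# Toy example: actions are speeds, anything above 30 is unsafe.
print(post_shield(25, lambda a: a <= 30, 0))  # → 25 (passed through)
print(post_shield(40, lambda a: a <= 30, 0))  # → 0 (corrected)
```

A timed pre-shield would instead sit before the system and hand it the whole set of currently safe actions, restricting its choices up front.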
Related papers
- Compositional Shielding and Reinforcement Learning for Multi-Agent Systems [1.124958340749622]
Deep reinforcement learning has emerged as a powerful tool for obtaining high-performance policies.
One promising paradigm to guarantee safety is a shield, which shields a policy from making unsafe actions.
In this work, we propose a novel approach for multi-agent shielding.
arXiv Detail & Related papers (2024-10-14T12:52:48Z)
- Synthesizing Efficient and Permissive Programmatic Runtime Shields for Neural Policies [7.831197018945118]
We propose a novel framework that synthesizes lightweight and permissive programmatic runtime shields for neural policies.
Aegis achieves this by formulating the seeking of a runtime shield as a sketch-based program synthesis problem.
Compared to the current state-of-the-art, Aegis's shields exhibit a 2.1$\times$ reduction in time overhead and a 4.4$\times$ reduction in memory usage.
arXiv Detail & Related papers (2024-10-08T02:44:55Z)
- Shielded Reinforcement Learning for Hybrid Systems [1.0485739694839669]
Reinforcement learning has been leveraged to construct near-optimal controllers, but their behavior is not guaranteed to be safe.
One way of imposing safety to a learned controller is to use a shield, which is correct by design.
We propose the construction of a shield using the so-called barbaric method, where an approximate finite representation of an underlying partition-based two-player safety game is extracted.
arXiv Detail & Related papers (2023-08-28T09:04:52Z)
- Safety Shielding under Delayed Observation [59.86192283565134]
Shields are correct-by-construction enforcers that guarantee safe execution.
Shields should pick safe corrective actions in a way that minimizes the likelihood of future interferences.
We present the first integration of shields in a realistic driving simulator.
arXiv Detail & Related papers (2023-07-05T10:06:10Z)
- Approximate Shielding of Atari Agents for Safe Exploration [83.55437924143615]
We propose a principled algorithm for safe exploration based on the concept of shielding.
We present preliminary results that show our approximate shielding algorithm effectively reduces the rate of safety violations.
arXiv Detail & Related papers (2023-04-21T16:19:54Z)
- Forecasting Particle Accelerator Interruptions Using Logistic LASSO Regression [62.997667081978825]
Unforeseen particle accelerator interruptions, also known as interlocks, lead to abrupt operational changes despite being necessary safety measures.
We propose a simple yet powerful binary classification model aiming to forecast such interruptions.
The model is formulated as logistic regression penalized by the least absolute shrinkage and selection operator (LASSO).
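An L1-penalized (LASSO) logistic regression of this kind can be sketched with scikit-learn. The synthetic data, feature count, and regularization strength below are illustrative assumptions, not the paper's setup.

```python
# Sketch of LASSO-penalized logistic regression for binary forecasting,
# using scikit-learn on synthetic data (illustrative, not the paper's data).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                  # 10 candidate features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)   # only 2 features matter

# penalty="l1" is the LASSO term; smaller C means stronger shrinkage.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
clf.fit(X, y)

# The L1 penalty drives irrelevant coefficients to exactly zero,
# which is what makes the model useful for feature selection.
print((clf.coef_ != 0).sum(), "non-zero coefficients out of 10")
```

The sparsity induced by the L1 penalty is the point: it selects a small subset of features that drive the interruption forecast.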
arXiv Detail & Related papers (2023-03-15T23:11:30Z)
- Online Shielding for Reinforcement Learning [59.86192283565134]
We propose an approach for online safety shielding of RL agents.
During runtime, the shield analyses the safety of each available action and estimates the probability that it leads to a safety violation.
Based on this probability and a given threshold, the shield decides whether to block an action from the agent.
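The threshold-based blocking decision can be sketched in a few lines. The function name, the hard-coded risk estimates, and the threshold value are illustrative assumptions; in the paper the probabilities come from an online safety analysis, not a lookup table.

```python
# Hypothetical sketch of online shielding: block any action whose estimated
# probability of leading to a safety violation exceeds a given threshold.

def shield_actions(actions, violation_prob, threshold=0.1):
    """Return the subset of actions the shield allows at runtime."""
    return [a for a in actions if violation_prob(a) <= threshold]

# Toy example with hard-coded per-action risk estimates.
risk = {"accelerate": 0.4, "keep": 0.05, "brake": 0.01}
allowed = shield_actions(list(risk), risk.get, threshold=0.1)
print(allowed)  # → ['keep', 'brake']; accelerate is blocked
```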
arXiv Detail & Related papers (2022-12-04T16:00:29Z)
- Sample-Efficient Safety Assurances using Conformal Prediction [57.92013073974406]
Early warning systems can provide alerts when an unsafe situation is imminent.
To reliably improve safety, these warning systems should have a provable false negative rate.
We present a framework that combines a statistical inference technique known as conformal prediction with a simulator of robot/environment dynamics.
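The core of split conformal prediction is a calibrated alarm threshold with a finite-sample guarantee on the miss rate. The following is a generic sketch on synthetic scores, not the paper's robot/environment setup.

```python
# Minimal sketch of split conformal prediction for a warning threshold
# (generic illustration on synthetic nonconformity scores).
import numpy as np

def conformal_threshold(calib_scores, alpha=0.1):
    """Return a threshold such that a fresh exchangeable score exceeds
    it with probability at most alpha (finite-sample, distribution-free)."""
    n = len(calib_scores)
    # The (n+1)-corrected quantile level yields the coverage guarantee.
    q = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(calib_scores, min(q, 1.0))

rng = np.random.default_rng(1)
calib = rng.normal(size=1000)               # calibration scores
tau = conformal_threshold(calib, alpha=0.1)
test = rng.normal(size=1000)
print("alarm rate:", (test > tau).mean())   # ≈ 0.1 by construction
```

The guarantee is distribution-free: it holds for any score distribution as long as calibration and test scores are exchangeable, which is what makes the framework attractive for safety warnings.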
arXiv Detail & Related papers (2021-09-28T23:00:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.