Safe Reinforcement Learning in Black-Box Environments via Adaptive Shielding
- URL: http://arxiv.org/abs/2405.18180v1
- Date: Tue, 28 May 2024 13:47:21 GMT
- Title: Safe Reinforcement Learning in Black-Box Environments via Adaptive Shielding
- Authors: Daniel Bethell, Simos Gerasimou, Radu Calinescu, Calum Imrie,
- Abstract summary: Training RL agents in unknown, black-box environments poses an even greater safety risk when prior knowledge of the domain/task is unavailable.
We introduce ADVICE (Adaptive Shielding with a Contrastive Autoencoder), a novel post-shielding technique that distinguishes safe and unsafe features of state-action pairs during training.
- Score: 5.5929450570003185
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Empowering safe exploration of reinforcement learning (RL) agents during training is a critical impediment towards deploying RL agents in many real-world scenarios. Training RL agents in unknown, black-box environments poses an even greater safety risk when prior knowledge of the domain/task is unavailable. We introduce ADVICE (Adaptive Shielding with a Contrastive Autoencoder), a novel post-shielding technique that distinguishes safe and unsafe features of state-action pairs during training, thus protecting the RL agent from executing actions that yield potentially hazardous outcomes. Our comprehensive experimental evaluation against state-of-the-art safe RL exploration techniques demonstrates how ADVICE can significantly reduce safety violations during training while maintaining a competitive outcome reward.
Related papers
- ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning [48.536695794883826]
We present ActSafe, a novel model-based RL algorithm for safe and efficient exploration.
We show that ActSafe guarantees safety during learning while also obtaining a near-optimal policy in finite time.
In addition, we propose a practical variant of ActSafe that builds on latest model-based RL advancements.
arXiv Detail & Related papers (2024-10-12T10:46:02Z) - Safety through Permissibility: Shield Construction for Fast and Safe Reinforcement Learning [57.84059344739159]
"Shielding" is a popular technique to enforce safety inReinforcement Learning (RL)
We propose a new permissibility-based framework to deal with safety and shield construction.
arXiv Detail & Related papers (2024-05-29T18:00:21Z) - Safeguarded Progress in Reinforcement Learning: Safe Bayesian
Exploration for Control Policy Synthesis [63.532413807686524]
This paper addresses the problem of maintaining safety during training in Reinforcement Learning (RL)
We propose a new architecture that handles the trade-off between efficient progress and safety during exploration.
arXiv Detail & Related papers (2023-12-18T16:09:43Z) - Safety Correction from Baseline: Towards the Risk-aware Policy in
Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z) - Provable Safe Reinforcement Learning with Binary Feedback [62.257383728544006]
We consider the problem of provable safe RL when given access to an offline oracle providing binary feedback on the safety of state, action pairs.
We provide a novel meta algorithm, SABRE, which can be applied to any MDP setting given access to a blackbox PAC RL algorithm for that setting.
arXiv Detail & Related papers (2022-10-26T05:37:51Z) - Safe Reinforcement Learning with Contrastive Risk Prediction [35.80144544954927]
We propose a risk preventive training method for safe RL, which learns a statistical contrastive classifier to predict the probability of a state-action pair leading to unsafe states.
Based on the predicted risk probabilities, we can collect risk preventive trajectories and reshape the reward function with risk penalties to induce safe RL policies.
The results show the proposed approach has comparable performance with the state-of-the-art model-based methods and outperforms conventional model-free safe RL approaches.
arXiv Detail & Related papers (2022-09-10T18:54:38Z) - On the Robustness of Safe Reinforcement Learning under Observational
Perturbations [27.88525130218356]
We show that baseline adversarial attack techniques for standard RL tasks are not always effective for safe RL.
One interesting and counter-intuitive finding is that the maximum reward attack is strong, as it can both induce unsafe behaviors and make the attack stealthy by maintaining the reward.
This work sheds light on the inherited connection between observational robustness and safety in RL and provides a pioneer work for future safe RL studies.
arXiv Detail & Related papers (2022-05-29T15:25:03Z) - DESTA: A Framework for Safe Reinforcement Learning with Markov Games of
Intervention [17.017957942831938]
Current approaches for tackling safe learning in reinforcement learning (RL) lead to a trade-off between safe exploration and fulfilling the task.
We introduce a new two-player framework for safe RL called Distributive Exploration Safety Training Algorithm (DESTA)
Our approach uses a new two-player framework for safe RL called Distributive Exploration Safety Training Algorithm (DESTA)
arXiv Detail & Related papers (2021-10-27T14:35:00Z) - Conservative Safety Critics for Exploration [120.73241848565449]
We study the problem of safe exploration in reinforcement learning (RL)
We learn a conservative safety estimate of environment states through a critic.
We show that the proposed approach can achieve competitive task performance while incurring significantly lower catastrophic failure rates.
arXiv Detail & Related papers (2020-10-27T17:54:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.