Safe Reinforcement Learning with Contrastive Risk Prediction
- URL: http://arxiv.org/abs/2209.09648v1
- Date: Sat, 10 Sep 2022 18:54:38 GMT
- Title: Safe Reinforcement Learning with Contrastive Risk Prediction
- Authors: Hanping Zhang, Yuhong Guo
- Abstract summary: We propose a risk preventive training method for safe RL, which learns a statistical contrastive classifier to predict the probability of a state-action pair leading to unsafe states.
Based on the predicted risk probabilities, we can collect risk preventive trajectories and reshape the reward function with risk penalties to induce safe RL policies.
The results show the proposed approach has comparable performance with the state-of-the-art model-based methods and outperforms conventional model-free safe RL approaches.
- Score: 35.80144544954927
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As safety violations can lead to severe consequences in real-world robotic
applications, the increasing deployment of Reinforcement Learning (RL) in
robotic domains has propelled the study of safe exploration for reinforcement
learning (safe RL). In this work, we propose a risk preventive training method
for safe RL, which learns a statistical contrastive classifier to predict the
probability of a state-action pair leading to unsafe states. Based on the
predicted risk probabilities, we can collect risk preventive trajectories and
reshape the reward function with risk penalties to induce safe RL policies. We
conduct experiments in robotic simulation environments. The results show the
proposed approach has comparable performance with the state-of-the-art
model-based methods and outperforms conventional model-free safe RL approaches.
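To make the described approach concrete, below is a minimal, hypothetical sketch of the two ingredients named in the abstract: a classifier that contrasts state-action pairs leading to unsafe states against safe ones to estimate a risk probability, and a reward reshaped with a risk penalty. The class and function names, network architecture, and penalty coefficient are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class RiskClassifier(nn.Module):
    """Estimates P(unsafe | state, action) with a small MLP (illustrative architecture)."""
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, action):
        # Concatenate state and action, output a risk probability in [0, 1].
        return torch.sigmoid(self.net(torch.cat([state, action], dim=-1)))

def train_risk_classifier(model, safe_states, safe_actions,
                          unsafe_states, unsafe_actions, epochs=100, lr=1e-3):
    """Contrast pairs that led to unsafe states (label 1) against safe pairs (label 0)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    bce = nn.BCELoss()
    states = torch.cat([safe_states, unsafe_states])
    actions = torch.cat([safe_actions, unsafe_actions])
    labels = torch.cat([torch.zeros(len(safe_states), 1),
                        torch.ones(len(unsafe_states), 1)])
    for _ in range(epochs):
        opt.zero_grad()
        loss = bce(model(states, actions), labels)
        loss.backward()
        opt.step()
    return model

def reshaped_reward(env_reward, risk_prob, penalty_coef=10.0):
    """Penalize the environment reward by the predicted risk of the taken action.
    penalty_coef is an assumed hyperparameter, not a value from the paper."""
    return env_reward - penalty_coef * risk_prob
```

In a training loop, one would query the classifier for each candidate action to steer rollouts toward risk preventive trajectories, and use the penalized reward when updating the policy.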
Related papers
- ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning [48.536695794883826]
We present ActSafe, a novel model-based RL algorithm for safe and efficient exploration.
We show that ActSafe guarantees safety during learning while also obtaining a near-optimal policy in finite time.
In addition, we propose a practical variant of ActSafe that builds on latest model-based RL advancements.
arXiv Detail & Related papers (2024-10-12T10:46:02Z)
- Safe Reinforcement Learning in Black-Box Environments via Adaptive Shielding [5.5929450570003185]
Training RL agents in unknown, black-box environments poses an even greater safety risk when prior knowledge of the domain/task is unavailable.
We introduce ADVICE (Adaptive Shielding with a Contrastive Autoencoder), a novel post-shielding technique that distinguishes safe and unsafe features of state-action pairs during training.
arXiv Detail & Related papers (2024-05-28T13:47:21Z)
- Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis [63.532413807686524]
This paper addresses the problem of maintaining safety during training in Reinforcement Learning (RL).
We propose a new architecture that handles the trade-off between efficient progress and safety during exploration.
arXiv Detail & Related papers (2023-12-18T16:09:43Z)
- Probabilistic Counterexample Guidance for Safer Reinforcement Learning (Extended Version) [1.279257604152629]
Safe exploration aims at addressing the limitations of Reinforcement Learning (RL) in safety-critical scenarios.
Several methods exist to incorporate external knowledge or to use sensor data to limit the exploration of unsafe states.
In this paper, we target the problem of safe exploration by guiding the training with counterexamples of the safety requirement.
arXiv Detail & Related papers (2023-07-10T22:28:33Z)
- Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z)
- Safe Model-Based Reinforcement Learning with an Uncertainty-Aware Reachability Certificate [6.581362609037603]
We build a safe reinforcement learning framework to resolve constraints required by the DRC and its corresponding shield policy.
We also devise a line search method to maintain safety and reach higher returns simultaneously while leveraging the shield policy.
arXiv Detail & Related papers (2022-10-14T06:16:53Z)
- Conservative Safety Critics for Exploration [120.73241848565449]
We study the problem of safe exploration in reinforcement learning (RL).
We learn a conservative safety estimate of environment states through a critic.
We show that the proposed approach can achieve competitive task performance while incurring significantly lower catastrophic failure rates.
arXiv Detail & Related papers (2020-10-27T17:54:25Z)
- Cautious Reinforcement Learning with Logical Constraints [78.96597639789279]
An adaptive safe padding forces Reinforcement Learning (RL) to synthesise optimal control policies while ensuring safety during the learning process.
Theoretical guarantees are available on the optimality of the synthesised policies and on the convergence of the learning algorithm.
arXiv Detail & Related papers (2020-02-26T00:01:08Z)
- Safe reinforcement learning for probabilistic reachability and safety specifications: A Lyapunov-based approach [2.741266294612776]
We propose a model-free safety specification method that learns the maximal probability of safe operation.
Our approach constructs a Lyapunov function with respect to a safe policy to restrain each policy improvement stage.
It yields a sequence of safe policies that determine the range of safe operation, called the safe set.
arXiv Detail & Related papers (2020-02-24T09:20:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.