Related papers: On the Robustness of Safe Reinforcement Learning under Observational Perturbations

On the Robustness of Safe Reinforcement Learning under Observational Perturbations

URL: http://arxiv.org/abs/2205.14691v1
Date: Sun, 29 May 2022 15:25:03 GMT
Title: On the Robustness of Safe Reinforcement Learning under Observational Perturbations
Authors: Zuxin Liu, Zijian Guo, Zhepeng Cen, Huan Zhang, Jie Tan, Bo Li, Ding Zhao
Abstract summary: We show that baseline adversarial attack techniques for standard RL tasks are not always effective for safe RL. One interesting and counter-intuitive finding is that the maximum reward attack is strong, as it can both induce unsafe behaviors and make the attack stealthy by maintaining the reward. This work sheds light on the inherited connection between observational robustness and safety in RL and provides a pioneer work for future safe RL studies.
Score: 27.88525130218356
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Safe reinforcement learning (RL) trains a policy to maximize the task reward while satisfying safety constraints. While prior works focus on the performance optimality, we find that the optimal solutions of many safe RL problems are not robust and safe against carefully designed observational perturbations. We formally analyze the unique properties of designing effective state adversarial attackers in the safe RL setting. We show that baseline adversarial attack techniques for standard RL tasks are not always effective for safe RL and proposed two new approaches - one maximizes the cost and the other maximizes the reward. One interesting and counter-intuitive finding is that the maximum reward attack is strong, as it can both induce unsafe behaviors and make the attack stealthy by maintaining the reward. We further propose a more effective adversarial training framework for safe RL and evaluate it via comprehensive experiments. This work sheds light on the inherited connection between observational robustness and safety in RL and provides a pioneer work for future safe RL studies.

Related papers

Probabilistic Shielding for Safe Reinforcement Learning [51.35559820893218]
In real-life scenarios, a Reinforcement Learning (RL) agent must often also behave in a safe manner, including at training time. We present a new, scalable method, which enjoys strict formal guarantees for Safe RL. We show that our approach provides a strict formal safety guarantee that the agent stays safe at training and test time.
arXiv Detail & Related papers (2025-03-09T17:54:33Z)
ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning [48.536695794883826]
We present ActSafe, a novel model-based RL algorithm for safe and efficient exploration. We show that ActSafe guarantees safety during learning while also obtaining a near-optimal policy in finite time. In addition, we propose a practical variant of ActSafe that builds on latest model-based RL advancements.
arXiv Detail & Related papers (2024-10-12T10:46:02Z)
Safe Reinforcement Learning in Black-Box Environments via Adaptive Shielding [5.5929450570003185]
Training RL agents in unknown, black-box environments poses an even greater safety risk when prior knowledge of the domain/task is unavailable. We introduce ADVICE (Adaptive Shielding with a Contrastive Autoencoder), a novel post-shielding technique that distinguishes safe and unsafe features of state-action pairs during training.
arXiv Detail & Related papers (2024-05-28T13:47:21Z)
Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation [26.244121960815907]
Managing the trade-off between reward and safety during exploration presents a significant challenge. In this study, we aim to address this conflicting relation by leveraging the theory of gradient manipulation. Experimental results demonstrate that our algorithms outperform several state-of-the-art baselines in terms of balancing reward and safety optimization.
arXiv Detail & Related papers (2024-05-02T19:07:14Z)
Safe Reinforcement Learning with Dual Robustness [10.455148541147796]
Reinforcement learning (RL) agents are vulnerable to adversarial disturbances. We propose a systematic framework to unify safe RL and robust RL. We also design a deep RL algorithm for practical implementation, called dually robust actor-critic (DRAC)
arXiv Detail & Related papers (2023-09-13T09:34:21Z)
Approximate Model-Based Shielding for Safe Reinforcement Learning [83.55437924143615]
We propose a principled look-ahead shielding algorithm for verifying the performance of learned RL policies. Our algorithm differs from other shielding approaches in that it does not require prior knowledge of the safety-relevant dynamics of the system. We demonstrate superior performance to other safety-aware approaches on a set of Atari games with state-dependent safety-labels.
arXiv Detail & Related papers (2023-07-27T15:19:45Z)
Safe and Sample-efficient Reinforcement Learning for Clustered Dynamic Environments [4.111899441919165]
This study proposes a safe and sample-efficient reinforcement learning (RL) framework to address two major challenges. We use the safe set algorithm (SSA) to monitor and modify the nominal controls, and evaluate SSA+RL in a clustered dynamic environment. Our framework can achieve better safety performance compare to other safe RL methods during training and solve the task with substantially fewer episodes.
arXiv Detail & Related papers (2023-03-24T20:29:17Z)
A Multiplicative Value Function for Safe and Efficient Reinforcement Learning [131.96501469927733]
We propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic. The safety critic predicts the probability of constraint violation and discounts the reward critic that only estimates constraint-free returns. We evaluate our method in four safety-focused environments, including classical RL benchmarks augmented with safety constraints and robot navigation tasks with images and raw Lidar scans as observations.
arXiv Detail & Related papers (2023-03-07T18:29:15Z)
Provable Safe Reinforcement Learning with Binary Feedback [62.257383728544006]
We consider the problem of provable safe RL when given access to an offline oracle providing binary feedback on the safety of state, action pairs. We provide a novel meta algorithm, SABRE, which can be applied to any MDP setting given access to a blackbox PAC RL algorithm for that setting.
arXiv Detail & Related papers (2022-10-26T05:37:51Z)
Provably Safe Reinforcement Learning: Conceptual Analysis, Survey, and Benchmarking [12.719948223824483]
reinforcement learning (RL) algorithms are crucial to unlock their potential for many real-world tasks. However, vanilla RL and most safe RL approaches do not guarantee safety. We introduce a categorization of existing provably safe RL methods, present the conceptual foundations for both continuous and discrete action spaces, and empirically benchmark existing methods. We provide practical guidance on selecting provably safe RL approaches depending on the safety specification, RL algorithm, and type of action space.
arXiv Detail & Related papers (2022-05-13T16:34:36Z)
Robust Reinforcement Learning on State Observations with Learned Optimal Adversary [86.0846119254031]
We study the robustness of reinforcement learning with adversarially perturbed state observations. With a fixed agent policy, we demonstrate that an optimal adversary to perturb state observations can be found. For DRL settings, this leads to a novel empirical adversarial attack to RL agents via a learned adversary that is much stronger than previous ones.
arXiv Detail & Related papers (2021-01-21T05:38:52Z)
Conservative Safety Critics for Exploration [120.73241848565449]
We study the problem of safe exploration in reinforcement learning (RL) We learn a conservative safety estimate of environment states through a critic. We show that the proposed approach can achieve competitive task performance while incurring significantly lower catastrophic failure rates.
arXiv Detail & Related papers (2020-10-27T17:54:25Z)

This list is automatically generated from the titles and abstracts of the papers in this site.