Related papers: A Survey of Constraint Formulations in Safe Reinforcement Learning

URL: http://arxiv.org/abs/2402.02025v2
Date: Wed, 8 May 2024 00:59:16 GMT
Title: A Survey of Constraint Formulations in Safe Reinforcement Learning
Authors: Akifumi Wachi, Xun Shen, Yanan Sui
Abstract summary: Safety is critical when applying reinforcement learning to real-world problems. A prevalent safe RL approach is based on a constrained criterion, which seeks to maximize the expected cumulative reward subject to specific safety constraints. Despite recent effort to enhance safety in RL, a systematic understanding of the field remains difficult.
Score: 15.593999581562203
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Safety is critical when applying reinforcement learning (RL) to real-world problems. As a result, safe RL has emerged as a fundamental and powerful paradigm for optimizing an agent's policy while incorporating notions of safety. A prevalent safe RL approach is based on a constrained criterion, which seeks to maximize the expected cumulative reward subject to specific safety constraints. Despite recent effort to enhance safety in RL, a systematic understanding of the field remains difficult. This challenge stems from the diversity of constraint representations and little exploration of their interrelations. To bridge this knowledge gap, we present a comprehensive review of representative constraint formulations, along with a curated selection of algorithms designed specifically for each formulation. In addition, we elucidate the theoretical underpinnings that reveal the mathematical mutual relations among common problem formulations. We conclude with a discussion of the current state and future directions of safe reinforcement learning research.
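As a rough illustration of the constrained criterion mentioned in the abstract, one commonly used formulation (an expected cumulative cost constraint, as in a constrained Markov decision process) can be written as below. The symbols are assumptions introduced here, not notation taken from the paper: \pi is the policy, r the reward function, c a safety cost function, \gamma \in [0, 1) a discount factor, and d a cost threshold. This is only one of the constraint representations the survey reviews.

```latex
\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} \, r(s_t, a_t) \right]
\quad \text{subject to} \quad
\mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} \, c(s_t, a_t) \right] \le d
```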

Related papers

Viability of Future Actions: Robust Safety in Reinforcement Learning via Entropy Regularization [47.30677525394649]
We analyze the interplay between two well-established techniques in model-free reinforcement learning: entropy regularization and constraints penalization. We show that entropy regularization in constrained RL inherently biases learning toward maximizing the number of future viable actions, thereby promoting constraints satisfaction robust to action noise. We conclude that the connection between entropy regularization and robustness is a promising avenue for further empirical and theoretical investigation.
arXiv Detail & Related papers (2025-06-12T16:34:19Z)
Advancing Neural Network Verification through Hierarchical Safety Abstract Interpretation [52.626086874715284]
We introduce a novel problem formulation called Abstract DNN-Verification, which verifies a hierarchical structure of unsafe outputs. By leveraging abstract interpretation and reasoning about output reachable sets, our approach enables assessing multiple safety levels during the formal verification process. Our contributions include a theoretical exploration of the relationship between our novel abstract safety formulation and existing approaches.
arXiv Detail & Related papers (2025-05-08T13:29:46Z)
Latent Safety-Constrained Policy Approach for Safe Offline Reinforcement Learning [7.888219789657414]
In safe offline reinforcement learning (RL), the objective is to develop a policy that maximizes cumulative rewards while strictly adhering to safety constraints. We address these issues with a novel approach that begins by learning a conservatively safe policy through the use of Conditional Variational Autoencoders. We frame this as a Constrained Reward-Return Maximization problem, wherein the policy aims to optimize rewards while complying with the inferred latent safety constraints.
arXiv Detail & Related papers (2024-12-11T22:00:07Z)
Feasibility Consistent Representation Learning for Safe Reinforcement Learning [25.258227763316228]
We introduce a novel framework named Feasibility Consistent Safe Reinforcement Learning (FCSRL). This framework combines representation learning with feasibility-oriented objectives to identify and extract safety-related information from the raw state for safe RL. Our method learns a better safety-aware embedding and achieves performance superior to previous representation learning baselines.
arXiv Detail & Related papers (2024-05-20T01:37:21Z)
Concurrent Learning of Policy and Unknown Safety Constraints in Reinforcement Learning [4.14360329494344]
Reinforcement learning (RL) has revolutionized decision-making across a wide range of domains over the past few decades. Yet, deploying RL policies in real-world scenarios presents the crucial challenge of ensuring safety. Traditional safe RL approaches have predominantly focused on incorporating predefined safety constraints into the policy learning process. We propose a novel approach that concurrently learns a safe RL control policy and identifies the unknown safety constraint parameters of a given environment.
arXiv Detail & Related papers (2024-02-24T20:01:15Z)
Resilient Constrained Reinforcement Learning [87.4374430686956]
We study a class of constrained reinforcement learning (RL) problems in which multiple constraint specifications are not identified before study. It is challenging to identify appropriate constraint specifications due to the undefined trade-off between the reward training objective and the constraint satisfaction. We propose a new constrained RL approach that searches for policy and constraint specifications together.
arXiv Detail & Related papers (2023-12-28T18:28:23Z)
Gradient Shaping for Multi-Constraint Safe Reinforcement Learning [31.297400160104853]
Online safe reinforcement learning (RL) involves training a policy that maximizes task efficiency while satisfying constraints via interacting with the environments. We propose a unified framework designed for multi-constraint (MC) safe RL algorithms. We introduce the Gradient Shaping (GradS) method for general Lagrangian-based safe RL algorithms to improve the training efficiency in terms of both reward and constraint satisfaction.
arXiv Detail & Related papers (2023-12-23T00:55:09Z)
Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis [63.532413807686524]
This paper addresses the problem of maintaining safety during training in Reinforcement Learning (RL). We propose a new architecture that handles the trade-off between efficient progress and safety during exploration.
arXiv Detail & Related papers (2023-12-18T16:09:43Z)
Joint Learning of Policy with Unknown Temporal Constraints for Safe Reinforcement Learning [0.0]
We propose a framework that concurrently learns safety constraints and optimal RL policies. The framework is underpinned by theorems that establish the convergence of our joint learning process. We showcased our framework in grid-world environments, successfully identifying both acceptable safety constraints and RL policies.
arXiv Detail & Related papers (2023-04-30T21:15:07Z)
State-wise Safe Reinforcement Learning: A Survey [5.826308050755618]
State-wise constraints are one of the most common constraints in real-world applications. This paper provides a review of existing approaches that address state-wise constraints in RL.
arXiv Detail & Related papers (2023-02-06T21:11:29Z)
Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL. We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection. To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
Reinforcement Learning with Stepwise Fairness Constraints [50.538878453547966]
We introduce the study of reinforcement learning with stepwise fairness constraints. We provide learning algorithms with strong theoretical guarantees in regard to policy optimality and fairness violation.
arXiv Detail & Related papers (2022-11-08T04:06:23Z)
Safe Reinforcement Learning via Confidence-Based Filters [78.39359694273575]
We develop a control-theoretic approach for certifying state safety constraints for nominal policies learned via standard reinforcement learning techniques. We provide formal safety guarantees, and empirically demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2022-07-04T11:43:23Z)
Cautious Reinforcement Learning with Logical Constraints [78.96597639789279]
An adaptive safe padding forces Reinforcement Learning (RL) to synthesise optimal control policies while ensuring safety during the learning process. Theoretical guarantees are available on the optimality of the synthesised policies and on the convergence of the learning algorithm.
arXiv Detail & Related papers (2020-02-26T00:01:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.