Related papers: SafeDreamer: Safe Reinforcement Learning with World Models

SafeDreamer: Safe Reinforcement Learning with World Models

URL: http://arxiv.org/abs/2307.07176v3
Date: Wed, 7 Aug 2024 19:08:36 GMT
Title: SafeDreamer: Safe Reinforcement Learning with World Models
Authors: Weidong Huang, Jiaming Ji, Chunhe Xia, Borong Zhang, Yaodong Yang,
Abstract summary: We introduce SafeDreamer, a novel algorithm incorporating Lagrangian-based methods into world model planning processes. Our method achieves nearly zero-cost performance on various tasks, spanning low-dimensional and vision-only input.
Score: 7.773096110271637
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The deployment of Reinforcement Learning (RL) in real-world applications is constrained by its failure to satisfy safety criteria. Existing Safe Reinforcement Learning (SafeRL) methods, which rely on cost functions to enforce safety, often fail to achieve zero-cost performance in complex scenarios, especially vision-only tasks. These limitations are primarily due to model inaccuracies and inadequate sample efficiency. The integration of the world model has proven effective in mitigating these shortcomings. In this work, we introduce SafeDreamer, a novel algorithm incorporating Lagrangian-based methods into world model planning processes within the superior Dreamer framework. Our method achieves nearly zero-cost performance on various tasks, spanning low-dimensional and vision-only input, within the Safety-Gymnasium benchmark, showcasing its efficacy in balancing performance and safety in RL tasks. Further details can be found in the code repository: \url{https://github.com/PKU-Alignment/SafeDreamer}.

Related papers

Saffron-1: Safety Inference Scaling [69.61130284742353]
SAFFRON is a novel inference scaling paradigm tailored explicitly for safety assurance.<n>Central to our approach is the introduction of a multifurcation reward model (MRM) that significantly reduces the required number of reward model evaluations.<n>We publicly release our trained multifurcation reward model (Saffron-1) and the accompanying token-level safety reward dataset (Safety4M)
arXiv Detail & Related papers (2025-06-06T18:05:45Z)
ReGA: Representation-Guided Abstraction for Model-based Safeguarding of LLMs [0.9285458070502282]
Large Language Models (LLMs) have achieved significant success in various tasks, yet concerns about their safety and security have emerged.<n>To analyze and monitor machine learning models, model-based analysis has demonstrated notable potential in stateful deep neural networks.<n>We propose ReGA, a model-based analysis framework with representation-guided abstraction, to safeguard LLMs against harmful prompts and generations.
arXiv Detail & Related papers (2025-06-02T15:17:38Z)
Progressive Safeguards for Safe and Model-Agnostic Reinforcement Learning [5.593642806259113]
We model a meta-learning process where each task is synchronized with a safeguard that monitors safety and provides a reward signal to the agent. The design of the safeguard is manual but it is high-level and model-agnostic, which gives rise to an end-to-end safe learning approach. We evaluate our framework in a Minecraft-inspired Gridworld, a VizDoom game environment, and an LLM fine-tuning application.
arXiv Detail & Related papers (2024-10-31T16:28:33Z)
ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning [48.536695794883826]
We present ActSafe, a novel model-based RL algorithm for safe and efficient exploration. We show that ActSafe guarantees safety during learning while also obtaining a near-optimal policy in finite time. In addition, we propose a practical variant of ActSafe that builds on latest model-based RL advancements.
arXiv Detail & Related papers (2024-10-12T10:46:02Z)
FOSP: Fine-tuning Offline Safe Policy through World Models [3.7971075341023526]
Model-based Reinforcement Learning (RL) has shown its high training efficiency and capability of handling high-dimensional tasks. However, prior works still pose safety challenges due to the online exploration in real-world deployment. In this paper, we aim to further enhance safety during the deployment stage for vision-based robotic tasks by fine-tuning an offline-trained policy.
arXiv Detail & Related papers (2024-07-06T03:22:57Z)
Safety through Permissibility: Shield Construction for Fast and Safe Reinforcement Learning [57.84059344739159]
"Shielding" is a popular technique to enforce safety inReinforcement Learning (RL) We propose a new permissibility-based framework to deal with safety and shield construction.
arXiv Detail & Related papers (2024-05-29T18:00:21Z)
Reinforcement Learning in a Safety-Embedded MDP with Trajectory Optimization [42.258173057389]
This work introduces a novel approach that combines RL with trajectory optimization to manage this trade-off effectively. Our approach embeds safety constraints within the action space of a modified Markov Decision Process (MDP) The novel approach excels in its performance on challenging Safety Gym tasks, achieving significantly higher rewards and near-zero safety violations during inference.
arXiv Detail & Related papers (2023-10-10T18:01:16Z)
Approximate Model-Based Shielding for Safe Reinforcement Learning [83.55437924143615]
We propose a principled look-ahead shielding algorithm for verifying the performance of learned RL policies. Our algorithm differs from other shielding approaches in that it does not require prior knowledge of the safety-relevant dynamics of the system. We demonstrate superior performance to other safety-aware approaches on a set of Atari games with state-dependent safety-labels.
arXiv Detail & Related papers (2023-07-27T15:19:45Z)
Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL. We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection. To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning [72.97229770329214]
We introduce a general approach for seeking high dimensional non-linear optimization problems in which maintaining safety during learning is crucial. Our approach called LBSGD is based on applying a logarithmic barrier approximation with a carefully chosen step size. We demonstrate the effectiveness of our approach on minimizing violation in policy tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z)
Constrained Policy Optimization via Bayesian World Models [79.0077602277004]
LAMBDA is a model-based approach for policy optimization in safety critical tasks modeled via constrained Markov decision processes. We demonstrate LAMBDA's state of the art performance on the Safety-Gym benchmark suite in terms of sample efficiency and constraint violation.
arXiv Detail & Related papers (2022-01-24T17:02:22Z)
Safe Model-Based Reinforcement Learning Using Robust Control Barrier Functions [43.713259595810854]
An increasingly common approach to address safety involves the addition of a safety layer that projects the RL actions onto a safe set of actions. In this paper, we frame safety as a differentiable robust-control-barrier-function layer in a model-based RL framework.
arXiv Detail & Related papers (2021-10-11T17:00:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.