Reinforcement Learning with Ensemble Model Predictive Safety Certification
- URL: http://arxiv.org/abs/2402.04182v1
- Date: Tue, 6 Feb 2024 17:42:39 GMT
- Title: Reinforcement Learning with Ensemble Model Predictive Safety Certification
- Authors: Sven Gronauer, Tom Haider, Felippe Schmoeller da Roza, Klaus Diepold
- Abstract summary: Unsupervised exploration prevents the deployment of reinforcement learning algorithms on safety-critical tasks.
We propose a new algorithm that combines model-based deep reinforcement learning with tube-based model predictive control to correct the actions taken by a learning agent.
Our results show that we can achieve significantly fewer constraint violations than comparable reinforcement learning methods.
- Score: 2.658598582858331
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning algorithms need exploration to learn. However,
unsupervised exploration prevents the deployment of such algorithms on
safety-critical tasks and thus limits their real-world use. In this paper, we
propose a new algorithm called Ensemble Model Predictive Safety Certification
that combines model-based deep reinforcement learning with tube-based model
predictive control to correct the actions taken by a learning agent, keeping
safety constraint violations at a minimum through planning. Our approach aims
to reduce the amount of prior knowledge about the actual system by requiring
only offline data generated by a safe controller. Our results show that we can
achieve significantly fewer constraint violations than comparable reinforcement
learning methods.
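To make the approach concrete, below is a minimal, hypothetical sketch of a predictive safety filter in the spirit of the abstract: an ensemble of learned dynamics models rolls the agent's proposed action forward, and any action that some member predicts to violate the constraints is replaced by a sampled backup action, standing in here for the tube-based MPC correction. All names (`EnsemblePredictiveSafetyFilter`, `is_safe`) are illustrative and not taken from the paper.

```python
import numpy as np

class EnsemblePredictiveSafetyFilter:
    """Hypothetical predictive safety filter: an ensemble of learned
    dynamics models certifies actions over a short horizon; uncertified
    actions are replaced by a sampled backup action (a stand-in for the
    paper's tube-based MPC correction)."""

    def __init__(self, models, is_safe, horizon=5, n_backup=64, seed=0):
        self.models = models      # callables: (state, action) -> next state
        self.is_safe = is_safe    # callable: state -> bool constraint check
        self.horizon = horizon
        self.n_backup = n_backup
        self.rng = np.random.default_rng(seed)

    def certified(self, state, action):
        # Safe only if *every* ensemble member predicts a violation-free
        # rollout when `action` is applied and then held for the horizon.
        for model in self.models:
            s = np.asarray(state, dtype=float)
            for _ in range(self.horizon):
                s = model(s, action)
                if not self.is_safe(s):
                    return False
        return True

    def filter(self, state, action, low, high):
        # Pass the agent's action through when certified; otherwise fall
        # back to the first certified random candidate, else a null action.
        if self.certified(state, action):
            return action, False
        candidates = self.rng.uniform(low, high,
                                      size=(self.n_backup, np.size(action)))
        for cand in candidates:
            if self.certified(state, cand):
                return cand, True
        return np.zeros_like(action), True

# Toy usage: 1D system x' = x + 0.1 * a with slightly disagreeing models,
# constraint |x| <= 1, and an aggressive proposed action near the boundary.
models = [lambda s, a, k=k: s + 0.1 * a + 0.001 * k for k in range(5)]
filt = EnsemblePredictiveSafetyFilter(models, lambda s: abs(s[0]) <= 1.0)
action, corrected = filt.filter(np.array([0.9]), np.array([1.0]), -1.0, 1.0)
```

A learning agent would route every action through `filter(state, policy(state), low, high)`; the returned flag marks corrected actions, which can be penalized during training to wean the policy off the filter.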
Related papers
- Approximate Model-Based Shielding for Safe Reinforcement Learning [83.55437924143615]
We propose a principled look-ahead shielding algorithm for verifying the performance of learned RL policies.
Our algorithm differs from other shielding approaches in that it does not require prior knowledge of the safety-relevant dynamics of the system.
We demonstrate superior performance to other safety-aware approaches on a set of Atari games with state-dependent safety labels.
arXiv Detail & Related papers (2023-07-27T15:19:45Z)
- Approximate Shielding of Atari Agents for Safe Exploration [83.55437924143615]
We propose a principled algorithm for safe exploration based on the concept of shielding.
We present preliminary results that show our approximate shielding algorithm effectively reduces the rate of safety violations.
arXiv Detail & Related papers (2023-04-21T16:19:54Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy; a generic CBF-QP safety filter in this spirit is sketched after this list.
arXiv Detail & Related papers (2022-08-23T05:02:09Z)
- Guided Safe Shooting: model based reinforcement learning with safety constraints [3.8490154494129327]
We introduce Guided Safe Shooting (GuSS), a model-based RL approach that can learn to control systems with minimal violations of the safety constraints.
We propose three different safe planners, one based on a simple random shooting strategy and two based on MAP-Elites, a more advanced divergent-search algorithm.
arXiv Detail & Related papers (2022-06-20T12:46:35Z)
- Barrier Certified Safety Learning Control: When Sum-of-Square Programming Meets Reinforcement Learning [0.0]
This work combines control barrier functions with reinforcement learning and proposes a compensating algorithm to fully maintain safety.
Compared to quadratic-programming-based reinforcement learning methods, the proposed sum-of-squares-programming-based approach shows superior performance.
arXiv Detail & Related papers (2022-06-16T04:38:50Z)
- Improving Safety in Deep Reinforcement Learning using Unsupervised Action Planning [4.2955354157580325]
One of the key challenges to deep reinforcement learning (deep RL) is to ensure safety at both training and testing phases.
We propose a novel technique of unsupervised action planning to improve the safety of on-policy reinforcement learning algorithms.
Our results show that the proposed safe RL algorithm achieves higher rewards than multiple baselines in both discrete and continuous control problems.
arXiv Detail & Related papers (2021-09-29T10:26:29Z)
- Learning Barrier Certificates: Towards Safe Reinforcement Learning with Zero Training-time Violations [64.39401322671803]
This paper explores the possibility of safe RL algorithms with zero training-time safety violations.
We propose an algorithm, Co-trained Barrier Certificate for Safe RL (CRABS), which iteratively learns barrier certificates, dynamics models, and policies.
arXiv Detail & Related papers (2021-08-04T04:59:05Z)
- Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems [81.7983463275447]
Learning-based control algorithms require data collection with abundant supervision for training.
We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained optimal control with dynamics learning and feedback control; a generic chance-constraint tightening in this spirit is sketched after this list.
arXiv Detail & Related papers (2020-05-09T05:57:43Z)
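For the two control-barrier-function entries above (as forward-referenced there), the following is a generic CBF-QP safety filter for control-affine dynamics x' = f(x) + g(x)u with barrier h(x) >= 0. It is the textbook construction, not the reformulation proposed in either paper; with a single affine constraint the QP reduces to a closed-form projection, so no solver is required.

```python
import numpy as np

# Generic CBF-QP safety filter (illustrative, not from the cited papers):
#   minimize    ||u - u_rl||^2
#   subject to  dh/dx . (f(x) + g(x) u) >= -alpha * h(x)
# With a single affine constraint a^T u >= b, the minimizer is the
# Euclidean projection of u_rl onto the half-space.

def cbf_qp_filter(u_rl, x, f, g, h, grad_h, alpha=1.0):
    a = grad_h(x) @ g(x)                  # constraint normal: dh/dx . g(x)
    b = -alpha * h(x) - grad_h(x) @ f(x)  # constraint offset
    if a @ u_rl >= b:                     # RL action already satisfies CBF
        return u_rl
    # Closed-form projection onto {u : a^T u >= b}
    return u_rl + (b - a @ u_rl) / (a @ a) * a

# Toy example: 1D integrator x' = u, keep x <= 1 via h(x) = 1 - x.
f = lambda x: np.zeros(1)
g = lambda x: np.eye(1)
h = lambda x: 1.0 - x[0]
grad_h = lambda x: np.array([-1.0])

u_safe = cbf_qp_filter(np.array([2.0]), np.array([0.95]), f, g, h, grad_h)
# u_safe is clipped to 0.05, so that x' <= alpha * h(x) near the boundary.
```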
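The chance-constrained entry replaces a hard constraint g(x) <= 0 by Pr(g(x) <= 0) >= 1 - eps. Under a Gaussian approximation of g(x) this is commonly tightened to a deterministic check, sketched below; this is the standard tightening, not necessarily the cited paper's exact formulation.

```python
from statistics import NormalDist

# Standard Gaussian chance-constraint tightening (illustrative):
#   Pr(g(x) <= 0) >= 1 - eps  with  g(x) ~ N(mu, sigma^2)
# holds exactly when  mu + z * sigma <= 0,  where z = Phi^{-1}(1 - eps).

def chance_constraint_satisfied(mu, sigma, eps=0.05):
    z = NormalDist().inv_cdf(1.0 - eps)  # ~1.645 for eps = 0.05
    return mu + z * sigma <= 0.0

# A planner applies this per time step: propagate the state mean and
# variance through the learned dynamics, then check every constraint.
print(chance_constraint_satisfied(mu=-0.5, sigma=0.2))  # True (ample margin)
print(chance_constraint_satisfied(mu=-0.1, sigma=0.2))  # False (-0.1+0.33 > 0)
```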