Reinforcement Learning with Ensemble Model Predictive Safety Certification
- URL: http://arxiv.org/abs/2402.04182v1
- Date: Tue, 6 Feb 2024 17:42:39 GMT
- Title: Reinforcement Learning with Ensemble Model Predictive Safety Certification
- Authors: Sven Gronauer, Tom Haider, Felippe Schmoeller da Roza, Klaus Diepold
- Abstract summary: Unsupervised exploration prevents the deployment of reinforcement learning algorithms on safety-critical tasks.
We propose a new algorithm that combines model-based deep reinforcement learning with tube-based model predictive control to correct the actions taken by a learning agent.
Our results show that we can achieve significantly fewer constraint violations than comparable reinforcement learning methods.
- Score: 2.658598582858331
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning algorithms need exploration to learn. However,
unsupervised exploration prevents the deployment of such algorithms on
safety-critical tasks and thus limits their real-world use. In this paper, we
propose a new algorithm called Ensemble Model Predictive Safety Certification
that combines model-based deep reinforcement learning with tube-based model
predictive control to correct the actions taken by a learning agent, keeping
safety constraint violations at a minimum through planning. Our approach aims
to reduce the amount of prior knowledge about the actual system by requiring
only offline data generated by a safe controller. Our results show that we can
achieve significantly fewer constraint violations than comparable reinforcement
learning methods.
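To make the approach concrete, below is a minimal, hypothetical sketch of a predictive safety filter in the spirit of the abstract: an ensemble of learned dynamics models rolls the agent's proposed action forward, and any action that some member predicts to violate the constraints is replaced by a sampled backup action, standing in here for the tube-based MPC correction. All names (`EnsemblePredictiveSafetyFilter`, `is_safe`) are illustrative and not taken from the paper.

```python
import numpy as np

class EnsemblePredictiveSafetyFilter:
    """Hypothetical predictive safety filter: an ensemble of learned
    dynamics models certifies actions over a short horizon; uncertified
    actions are replaced by a sampled backup action (a stand-in for the
    paper's tube-based MPC correction)."""

    def __init__(self, models, is_safe, horizon=5, n_backup=64, seed=0):
        self.models = models      # callables: (state, action) -> next state
        self.is_safe = is_safe    # callable: state -> bool constraint check
        self.horizon = horizon
        self.n_backup = n_backup
        self.rng = np.random.default_rng(seed)

    def certified(self, state, action):
        # Safe only if *every* ensemble member predicts a violation-free
        # rollout when `action` is applied and then held for the horizon.
        for model in self.models:
            s = np.asarray(state, dtype=float)
            for _ in range(self.horizon):
                s = model(s, action)
                if not self.is_safe(s):
                    return False
        return True

    def filter(self, state, action, low, high):
        # Pass the agent's action through when certified; otherwise fall
        # back to the first certified random candidate, else a null action.
        if self.certified(state, action):
            return action, False
        candidates = self.rng.uniform(low, high,
                                      size=(self.n_backup, np.size(action)))
        for cand in candidates:
            if self.certified(state, cand):
                return cand, True
        return np.zeros_like(action), True

# Toy usage: 1D system x' = x + 0.1 * a with slightly disagreeing models,
# constraint |x| <= 1, and an aggressive proposed action near the boundary.
models = [lambda s, a, k=k: s + 0.1 * a + 0.001 * k for k in range(5)]
filt = EnsemblePredictiveSafetyFilter(models, lambda s: abs(s[0]) <= 1.0)
action, corrected = filt.filter(np.array([0.9]), np.array([1.0]), -1.0, 1.0)
```

A learning agent would route every action through `filter(state, policy(state), low, high)`; the returned flag marks corrected actions, which can be penalized during training to wean the policy off the filter.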
Related papers
- Approximate Model-Based Shielding for Safe Reinforcement Learning [83.55437924143615]
We propose a principled look-ahead shielding algorithm for verifying the performance of learned RL policies.
Our algorithm differs from other shielding approaches in that it does not require prior knowledge of the safety-relevant dynamics of the system.
We demonstrate superior performance to other safety-aware approaches on a set of Atari games with state-dependent safety labels.
arXiv Detail & Related papers (2023-07-27T15:19:45Z)
- Approximate Shielding of Atari Agents for Safe Exploration [83.55437924143615]
We propose a principled algorithm for safe exploration based on the concept of shielding.
We present preliminary results that show our approximate shielding algorithm effectively reduces the rate of safety violations.
arXiv Detail & Related papers (2023-04-21T16:19:54Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Recursively Feasible Probabilistic Safe Online Learning with Control Barrier Functions [60.26921219698514]
We introduce a model-uncertainty-aware reformulation of CBF-based safety-critical controllers.
We then present the pointwise feasibility conditions of the resulting safety controller.
We use these conditions to devise an event-triggered online data collection strategy; a generic CBF-QP safety filter in this spirit is sketched after this list.
arXiv Detail & Related papers (2022-08-23T05:02:09Z)
- Guided Safe Shooting: model based reinforcement learning with safety constraints [3.8490154494129327]
We introduce Guided Safe Shooting (GuSS), a model-based RL approach that can learn to control systems with minimal violations of the safety constraints.
We propose three different safe planners, one based on a simple random shooting strategy and two based on MAP-Elites, a more advanced divergent-search algorithm.
arXiv Detail & Related papers (2022-06-20T12:46:35Z)
- Barrier Certified Safety Learning Control: When Sum-of-Square Programming Meets Reinforcement Learning [0.0]
This work combines control barrier functions with reinforcement learning and proposes a compensating algorithm to fully maintain safety.
Compared to quadratic-programming-based reinforcement learning methods, the proposed sum-of-squares-programming-based approach shows superior performance.
arXiv Detail & Related papers (2022-06-16T04:38:50Z)
- Improving Safety in Deep Reinforcement Learning using Unsupervised Action Planning [4.2955354157580325]
One of the key challenges to deep reinforcement learning (deep RL) is to ensure safety at both training and testing phases.
We propose a novel technique of unsupervised action planning to improve the safety of on-policy reinforcement learning algorithms.
Our results show that the proposed safe RL algorithm achieves higher rewards than multiple baselines in both discrete and continuous control problems.
arXiv Detail & Related papers (2021-09-29T10:26:29Z)
- Learning Barrier Certificates: Towards Safe Reinforcement Learning with Zero Training-time Violations [64.39401322671803]
This paper explores the possibility of safe RL algorithms with zero training-time safety violations.
We propose an algorithm, Co-trained Barrier Certificate for Safe RL (CRABS), which iteratively learns barrier certificates, dynamics models, and policies.
arXiv Detail & Related papers (2021-08-04T04:59:05Z)
- Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems [81.7983463275447]
Learning-based control algorithms require data collection with abundant supervision for training.
We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained optimal control with dynamics learning and feedback control; a generic chance-constraint tightening in this spirit is sketched after this list.
arXiv Detail & Related papers (2020-05-09T05:57:43Z)
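For the two control-barrier-function entries above (as forward-referenced there), the following is a generic CBF-QP safety filter for control-affine dynamics x' = f(x) + g(x)u with barrier h(x) >= 0. It is the textbook construction, not the reformulation proposed in either paper; with a single affine constraint the QP reduces to a closed-form projection, so no solver is required.

```python
import numpy as np

# Generic CBF-QP safety filter (illustrative, not from the cited papers):
#   minimize    ||u - u_rl||^2
#   subject to  dh/dx . (f(x) + g(x) u) >= -alpha * h(x)
# With a single affine constraint a^T u >= b, the minimizer is the
# Euclidean projection of u_rl onto the half-space.

def cbf_qp_filter(u_rl, x, f, g, h, grad_h, alpha=1.0):
    a = grad_h(x) @ g(x)                  # constraint normal: dh/dx . g(x)
    b = -alpha * h(x) - grad_h(x) @ f(x)  # constraint offset
    if a @ u_rl >= b:                     # RL action already satisfies CBF
        return u_rl
    # Closed-form projection onto {u : a^T u >= b}
    return u_rl + (b - a @ u_rl) / (a @ a) * a

# Toy example: 1D integrator x' = u, keep x <= 1 via h(x) = 1 - x.
f = lambda x: np.zeros(1)
g = lambda x: np.eye(1)
h = lambda x: 1.0 - x[0]
grad_h = lambda x: np.array([-1.0])

u_safe = cbf_qp_filter(np.array([2.0]), np.array([0.95]), f, g, h, grad_h)
# u_safe is clipped to 0.05, so that x' <= alpha * h(x) near the boundary.
```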
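The chance-constrained entry replaces a hard constraint g(x) <= 0 by Pr(g(x) <= 0) >= 1 - eps. Under a Gaussian approximation of g(x) this is commonly tightened to a deterministic check, sketched below; this is the standard tightening, not necessarily the cited paper's exact formulation.

```python
from statistics import NormalDist

# Standard Gaussian chance-constraint tightening (illustrative):
#   Pr(g(x) <= 0) >= 1 - eps  with  g(x) ~ N(mu, sigma^2)
# holds exactly when  mu + z * sigma <= 0,  where z = Phi^{-1}(1 - eps).

def chance_constraint_satisfied(mu, sigma, eps=0.05):
    z = NormalDist().inv_cdf(1.0 - eps)  # ~1.645 for eps = 0.05
    return mu + z * sigma <= 0.0

# A planner applies this per time step: propagate the state mean and
# variance through the learned dynamics, then check every constraint.
print(chance_constraint_satisfied(mu=-0.5, sigma=0.2))  # True (ample margin)
print(chance_constraint_satisfied(mu=-0.1, sigma=0.2))  # False (-0.1+0.33 > 0)
```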