Hierarchical Framework for Interpretable and Probabilistic Model-Based
Safe Reinforcement Learning
- URL: http://arxiv.org/abs/2310.18811v1
- Date: Sat, 28 Oct 2023 20:30:57 GMT
- Title: Hierarchical Framework for Interpretable and Probabilistic Model-Based
Safe Reinforcement Learning
- Authors: Ammar N. Abbas, Georgios C. Chasparis, and John D. Kelleher
- Abstract summary: This paper proposes a novel approach for the use of deep reinforcement learning in safety-critical systems.
It combines the advantages of probabilistic modeling and reinforcement learning with the added benefits of interpretability.
- Score: 1.3678669691302048
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The difficulty of identifying the physical model of complex systems
has motivated methods that do not rely on such modeling. Deep reinforcement
learning has pioneered solving this problem by learning through direct
interaction with the system, with no need for a physical model. However, its
black-box learning approach makes it difficult to apply within real-world,
safety-critical systems without explanations of the actions the model derives.
Furthermore, an open research question in deep reinforcement learning is how
to focus policy learning on critical decisions within a sparse domain. This
paper proposes a novel approach for the use of deep reinforcement learning in
safety-critical systems. It combines the advantages of probabilistic modeling
and reinforcement learning with the added benefits of interpretability and
works in collaboration and synchronization with conventional decision-making
strategies. The resulting agent, BC-SRLA, is activated in specific situations
that are identified autonomously through the fused information of the
probabilistic model and reinforcement learning, such as abnormal conditions or
when the system is near failure. Further, it is initialized with a baseline
policy using policy cloning, which minimizes interaction with the environment
and addresses the challenges of applying RL in safety-critical industries. The
effectiveness of the BC-SRLA is demonstrated through a case study in
maintenance applied to turbofan engines, where it shows superior performance to
the prior art and other baselines.
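To make the activation logic concrete, here is a minimal sketch, assuming
hypothetical stand-ins for the probabilistic model, the baseline strategy, and
the (policy-cloned) RL agent; it illustrates the decision rule described
above, not the paper's implementation.

```python
import numpy as np

def select_action(state, failure_prob, rl_policy, baseline_policy,
                  threshold=0.8):
    """Fused decision rule: defer to the conventional strategy in normal
    operation; hand over to the (policy-cloned) safe RL agent only when
    the probabilistic model flags abnormal or near-failure conditions."""
    if failure_prob(state) >= threshold:
        return rl_policy(state)       # critical region: safe RL acts
    return baseline_policy(state)     # normal operation: baseline acts

# Toy stand-ins (illustrative only): a scalar "degradation" state.
failure_prob = lambda s: 1.0 / (1.0 + np.exp(-(s - 5.0)))  # rises near failure
baseline_policy = lambda s: 0.0   # e.g., "continue nominal operation"
rl_policy = lambda s: 1.0         # e.g., "schedule maintenance now"

for s in [0.0, 4.0, 8.0]:
    print(s, select_action(s, failure_prob, rl_policy, baseline_policy))
```

The threshold and the sigmoid failure model are assumptions; in the paper the
switching condition comes from the fused model/agent information rather than a
hand-set rule.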
Related papers
- Safe Deep Model-Based Reinforcement Learning with Lyapunov Functions [2.50194939587674]
We propose a new Model-based RL framework to enable efficient policy learning with unknown dynamics.
We introduce and explore a novel method for adding safety constraints for model-based RL during training and policy learning.
arXiv Detail & Related papers (2024-05-25T11:21:12Z)
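As a rough illustration of the Lyapunov idea (a generic decrease condition,
not the authors' algorithm), a learned function V can gate candidate actions
during model-based policy learning; every name below is an assumed stand-in.

```python
import numpy as np

def lyapunov_safe_action(state, candidate_actions, dynamics_model, V,
                         margin=1e-3):
    """Gate candidate actions with a Lyapunov decrease condition: an
    action is admissible only if the learned model predicts V shrinks;
    among admissible actions, pick the steepest predicted decrease."""
    scored = []
    for a in candidate_actions:
        next_state = dynamics_model(state, a)  # one-step model rollout
        scored.append((V(state) - V(next_state), a))
    safe = [sa for sa in scored if sa[0] > margin]
    pool = safe if safe else scored            # fallback: least-bad action
    return max(pool, key=lambda sa: sa[0])[1]

# Toy example: V = squared norm, mildly contracting linear dynamics.
V = lambda s: float(np.dot(s, s))
dynamics_model = lambda s, a: 0.9 * s + a
actions = [np.array([0.0, 0.0]), np.array([-0.1, 0.3])]
print(lyapunov_safe_action(np.array([1.0, -2.0]), actions, dynamics_model, V))
```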
- Imitation Game: A Model-based and Imitation Learning Deep Reinforcement Learning Hybrid [39.58317527488534]
We present work in progress towards a hybrid agent architecture that combines model-based Deep Reinforcement Learning with imitation learning to overcome both problems.
arXiv Detail & Related papers (2024-04-02T09:55:30Z)
- Safe Multi-agent Learning via Trapping Regions [89.24858306636816]
We apply the concept of trapping regions, known from the qualitative theory of dynamical systems, to create safety sets in the joint strategy space for decentralized learning.
We propose a binary partitioning algorithm for verification that candidate sets form trapping regions in systems with known learning dynamics, and a sampling algorithm for scenarios where learning dynamics are not known.
arXiv Detail & Related papers (2023-02-27T14:47:52Z)
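A hedged sketch of a sampling-based membership check (assuming a box-shaped
candidate set and black-box access to the joint learning dynamics; the paper's
binary partitioning algorithm for known dynamics is not reproduced here):

```python
import numpy as np

def looks_like_trapping_region(box_lo, box_hi, dynamics, n_samples=1000,
                               rng=None):
    """Sampling test that a candidate axis-aligned box traps the learning
    dynamics: at sampled boundary points, the vector field must not point
    out of the box. Failure refutes the candidate; passing only suggests
    it (full verification needs an exhaustive method)."""
    if rng is None:
        rng = np.random.default_rng(0)
    d = len(box_lo)
    for _ in range(n_samples):
        x = rng.uniform(box_lo, box_hi)
        face = rng.integers(d)                 # clamp to a random face
        x[face] = box_lo[face] if rng.random() < 0.5 else box_hi[face]
        v = dynamics(x)
        if x[face] == box_lo[face] and v[face] < 0:
            return False                       # flows out the lower face
        if x[face] == box_hi[face] and v[face] > 0:
            return False                       # flows out the upper face
    return True

# Toy joint dynamics that contract toward the origin.
print(looks_like_trapping_region(np.array([-1.0, -1.0]),
                                 np.array([1.0, 1.0]), lambda x: -x))
```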
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
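A minimal sketch of the safety-projection stage of such a method, under
assumed interfaces: the safety-cost critic `q_cost(s, a)` and its gradient are
toy stand-ins, not the authors' networks.

```python
import numpy as np

def project_action(state, action, q_cost, grad_q_cost,
                   budget=0.0, lr=0.1, iters=50):
    """Safety projection: nudge the task policy's proposed action down
    the gradient of the predicted safety cost until it meets the
    state-wise budget (sketch of the projection stage only)."""
    a = np.array(action, dtype=float)
    for _ in range(iters):
        if q_cost(state, a) <= budget:
            break                        # already predicted safe
        a -= lr * grad_q_cost(state, a)  # reduce predicted safety cost
    return a

# Toy cost: actions with norm > 1 are "unsafe"; gradient is analytic here.
q_cost = lambda s, a: float(np.dot(a, a) - 1.0)
grad_q = lambda s, a: 2.0 * a
print(project_action(None, np.array([2.0, 0.0]), q_cost, grad_q))
```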
- Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of Foundation Models [103.71308117592963]
We present an algorithm for training self-destructing models leveraging techniques from meta-learning and adversarial learning.
In a small-scale experiment, we show MLAC can largely prevent a BERT-style model from being re-purposed to perform gender identification.
arXiv Detail & Related papers (2022-11-27T21:43:45Z)
- Adaptive Decision Making at the Intersection for Autonomous Vehicles Based on Skill Discovery [13.134487965031667]
In urban environments, the complex and uncertain intersection scenarios are challenging for autonomous driving.
To ensure safety, it is crucial to develop an adaptive decision making system that can handle the interaction with other vehicles.
We propose a hierarchical framework that can autonomously accumulate and reuse knowledge.
arXiv Detail & Related papers (2022-07-24T11:56:45Z)
- Verified Probabilistic Policies for Deep Reinforcement Learning [6.85316573653194]
We tackle the problem of verifying probabilistic policies for deep reinforcement learning.
We propose an abstraction approach, based on interval Markov decision processes, that yields guarantees on a policy's execution.
We present techniques to build and solve these models using abstract interpretation, mixed-integer linear programming, entropy-based refinement and probabilistic model checking.
arXiv Detail & Related papers (2022-01-10T23:55:04Z)
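To make the interval-MDP abstraction concrete, here is a simplified robust
backup for a fixed policy, assuming interval bounds on its transition matrix;
the paper's abstract-interpretation, MILP, and refinement machinery is not
reproduced.

```python
import numpy as np

def imdp_worst_case_reach(p_lo, p_hi, bad, iters=100):
    """Upper bound on the probability of reaching a 'bad' state under an
    interval MDP abstraction: each backup lets an adversary pick
    transition probabilities within [p_lo, p_hi] (summing to 1) that
    maximize reachability, yielding a guaranteed over-approximation."""
    v = bad.astype(float)                  # value 1 on bad states
    n = p_lo.shape[0]
    for _ in range(iters):
        new_v = v.copy()
        for s in range(n):
            if bad[s]:
                continue
            p = p_lo[s].copy()             # start from the lower bounds
            slack = 1.0 - p.sum()
            for t in np.argsort(-v):       # give slack to worst successors
                bump = min(p_hi[s, t] - p[t], slack)
                p[t] += bump
                slack -= bump
            new_v[s] = float(p @ v)
        v = new_v
    return v

# Toy 3-state chain with uncertain transitions; state 2 is "bad".
p_lo = np.array([[0.6, 0.1, 0.0], [0.0, 0.5, 0.2], [0.0, 0.0, 1.0]])
p_hi = np.array([[0.8, 0.3, 0.2], [0.2, 0.7, 0.5], [0.0, 0.0, 1.0]])
print(imdp_worst_case_reach(p_lo, p_hi, np.array([False, False, True])))
```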
- Safe-Critical Modular Deep Reinforcement Learning with Temporal Logic through Gaussian Processes and Control Barrier Functions [3.5897534810405403]
Reinforcement learning (RL) is a promising approach, but it has had limited success in real-world applications.
In this paper, we propose a learning-based control framework consisting of several aspects.
We show that such an ECBF-based modular deep RL algorithm achieves near-perfect success rates and guards safety with high probability.
arXiv Detail & Related papers (2021-09-07T00:51:12Z)
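A compact sketch of the barrier-function filtering idea, simplified to a
known 1D control-affine system where the correction has a closed form (the
paper instead uses exponential CBFs with GP-learned dynamics):

```python
def cbf_filter(x, u_rl, f, g, h, dh_dx, alpha=1.0):
    """Safety filter: keep the RL action if it satisfies the CBF
    condition dh/dx * (f(x) + g(x) * u) + alpha * h(x) >= 0; otherwise
    return the closest action on the constraint boundary (assumes the
    control term dh/dx * g(x) is nonzero)."""
    lfh = dh_dx(x) * f(x)   # Lie derivative along the drift
    lgh = dh_dx(x) * g(x)   # Lie derivative along the control
    if lfh + lgh * u_rl + alpha * h(x) >= 0:
        return u_rl                          # RL action already safe
    return (-alpha * h(x) - lfh) / lgh       # minimal correction

# Toy: keep x <= 1 for dynamics x' = u, i.e. h(x) = 1 - x.
print(cbf_filter(0.9, u_rl=2.0, f=lambda x: 0.0, g=lambda x: 1.0,
                 h=lambda x: 1.0 - x, dh_dx=lambda x: -1.0))
```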
- Closing the Closed-Loop Distribution Shift in Safe Imitation Learning [80.05727171757454]
We treat safe optimization-based control strategies as experts in an imitation learning problem.
We train a learned policy that can be cheaply evaluated at run-time and that provably satisfies the same safety guarantees as the expert.
arXiv Detail & Related papers (2021-02-18T05:11:41Z)
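A DAgger-style sketch of distilling a safe optimization-based expert into a
cheap run-time policy; relabeling the learner's own rollouts is what closes
the distribution shift named in the title. All components below are toy
stand-ins, not the paper's provably safe construction.

```python
import numpy as np

def dagger_distill(expert, policy_fit, dynamics, x0, rounds=3, horizon=20):
    """Roll out the current learned policy, have the expert relabel every
    visited state, and refit -- so training covers the learner's own
    closed-loop state distribution rather than only the expert's."""
    data_x, data_u = [], []
    policy = expert                        # first rollout follows the expert
    for _ in range(rounds):
        x = float(x0)
        for _ in range(horizon):
            data_x.append(x)
            data_u.append(expert(x))       # expert relabels the state
            x = dynamics(x, policy(x))     # learner drives the rollout
        policy = policy_fit(np.array(data_x), np.array(data_u))
    return policy

# Toy: a stabilizing linear expert, distilled into a least-squares gain.
def policy_fit(xs, us):
    k = float(xs @ us / (xs @ xs + 1e-8))
    return lambda x: k * x
learned = dagger_distill(lambda x: -0.5 * x, policy_fit,
                         lambda x, u: x + u, x0=1.0)
print(learned(2.0))
```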
- Guided Uncertainty-Aware Policy Optimization: Combining Learning and Model-Based Strategies for Sample-Efficient Policy Learning [75.56839075060819]
Traditional robotic approaches rely on an accurate model of the environment, a detailed description of how to perform the task, and a robust perception system to keep track of the current state.
Reinforcement learning approaches can operate directly from raw sensory inputs with only a reward signal to describe the task, but are extremely sample-inefficient and brittle.
In this work, we combine the strengths of model-based methods with the flexibility of learning-based methods to obtain a general method that is able to overcome inaccuracies in the robotics perception/actuation pipeline.
arXiv Detail & Related papers (2020-05-21T19:47:05Z)
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
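A bare-bones MPPI-style controller from the information-theoretic MPC family
the paper connects to entropy-regularized RL; the learned Q-function term of
their algorithm is omitted, and the toy problem is an assumption.

```python
import numpy as np

def mppi_action(x0, dynamics, cost, horizon=15, samples=200, lam=1.0,
                sigma=0.5, rng=None):
    """Information-theoretic MPC (MPPI-style): sample noisy action
    sequences through the (possibly biased) model, weight them by
    exp(-cost / lambda) -- the entropy-regularized aggregation -- and
    return the weighted first action."""
    if rng is None:
        rng = np.random.default_rng(0)
    noise = rng.normal(0.0, sigma, size=(samples, horizon))
    costs = np.zeros(samples)
    for k in range(samples):
        x = x0
        for t in range(horizon):
            x = dynamics(x, noise[k, t])
            costs[k] += cost(x)
    w = np.exp(-(costs - costs.min()) / lam)   # softmin weights
    return float(w @ noise[:, 0] / w.sum())

# Toy: drive x toward 0 under x' = x + u with quadratic state cost.
print(mppi_action(2.0, lambda x, u: x + u, lambda x: x * x))
```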
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.