Hierarchical Framework for Interpretable and Probabilistic Model-Based
Safe Reinforcement Learning
- URL: http://arxiv.org/abs/2310.18811v1
- Date: Sat, 28 Oct 2023 20:30:57 GMT
- Title: Hierarchical Framework for Interpretable and Probabilistic Model-Based
Safe Reinforcement Learning
- Authors: Ammar N. Abbas, Georgios C. Chasparis, and John D. Kelleher
- Abstract summary: This paper proposes a novel approach for the use of deep reinforcement learning in safety-critical systems.
It combines the advantages of probabilistic modeling and reinforcement learning with the added benefits of interpretability.
- Score: 1.3678669691302048
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The difficulty of identifying the physical model of complex systems
has motivated methods that do not rely on such modeling. Deep reinforcement
learning has pioneered solving this problem by learning through direct
interaction with the system, with no need for a physical model. However, its
black-box learning approach makes it difficult to apply within real-world,
safety-critical systems without explanations of the actions the model derives.
Furthermore, an open research question in deep reinforcement learning is how
to focus policy learning on critical decisions within a sparse domain. This
paper proposes a novel approach for the use of deep reinforcement learning in
safety-critical systems. It combines the advantages of probabilistic modeling
and reinforcement learning with the added benefits of interpretability and
works in collaboration and synchronization with conventional decision-making
strategies. The resulting agent, BC-SRLA, is activated in specific situations
that are identified autonomously through the fused information of the
probabilistic model and reinforcement learning, such as abnormal conditions or
when the system is near failure. Further, it is initialized with a baseline
policy using policy cloning, which minimizes interaction with the environment
and addresses the challenges of applying RL in safety-critical industries. The
effectiveness of the BC-SRLA is demonstrated through a case study in
maintenance applied to turbofan engines, where it shows superior performance to
the prior art and other baselines.
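To make the activation logic concrete, here is a minimal sketch, assuming
hypothetical stand-ins for the probabilistic model, the baseline strategy, and
the (policy-cloned) RL agent; it illustrates the decision rule described
above, not the paper's implementation.

```python
import numpy as np

def select_action(state, failure_prob, rl_policy, baseline_policy,
                  threshold=0.8):
    """Fused decision rule: defer to the conventional strategy in normal
    operation; hand over to the (policy-cloned) safe RL agent only when
    the probabilistic model flags abnormal or near-failure conditions."""
    if failure_prob(state) >= threshold:
        return rl_policy(state)       # critical region: safe RL acts
    return baseline_policy(state)     # normal operation: baseline acts

# Toy stand-ins (illustrative only): a scalar "degradation" state.
failure_prob = lambda s: 1.0 / (1.0 + np.exp(-(s - 5.0)))  # rises near failure
baseline_policy = lambda s: 0.0   # e.g., "continue nominal operation"
rl_policy = lambda s: 1.0         # e.g., "schedule maintenance now"

for s in [0.0, 4.0, 8.0]:
    print(s, select_action(s, failure_prob, rl_policy, baseline_policy))
```

The threshold and the sigmoid failure model are assumptions; in the paper the
switching condition comes from the fused model/agent information rather than a
hand-set rule.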
Related papers
- Safe Deep Model-Based Reinforcement Learning with Lyapunov Functions [2.50194939587674]
We propose a new Model-based RL framework to enable efficient policy learning with unknown dynamics.
We introduce and explore a novel method for adding safety constraints for model-based RL during training and policy learning.
arXiv Detail & Related papers (2024-05-25T11:21:12Z)
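As a rough illustration of the Lyapunov idea (a generic decrease condition,
not the authors' algorithm), a learned function V can gate candidate actions
during model-based policy learning; every name below is an assumed stand-in.

```python
import numpy as np

def lyapunov_safe_action(state, candidate_actions, dynamics_model, V,
                         margin=1e-3):
    """Gate candidate actions with a Lyapunov decrease condition: an
    action is admissible only if the learned model predicts V shrinks;
    among admissible actions, pick the steepest predicted decrease."""
    scored = []
    for a in candidate_actions:
        next_state = dynamics_model(state, a)  # one-step model rollout
        scored.append((V(state) - V(next_state), a))
    safe = [sa for sa in scored if sa[0] > margin]
    pool = safe if safe else scored            # fallback: least-bad action
    return max(pool, key=lambda sa: sa[0])[1]

# Toy example: V = squared norm, mildly contracting linear dynamics.
V = lambda s: float(np.dot(s, s))
dynamics_model = lambda s, a: 0.9 * s + a
actions = [np.array([0.0, 0.0]), np.array([-0.1, 0.3])]
print(lyapunov_safe_action(np.array([1.0, -2.0]), actions, dynamics_model, V))
```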
- Imitation Game: A Model-based and Imitation Learning Deep Reinforcement Learning Hybrid [39.58317527488534]
We present work in progress towards a hybrid agent architecture that combines model-based Deep Reinforcement Learning with imitation learning to overcome both problems.
arXiv Detail & Related papers (2024-04-02T09:55:30Z)
- Safe Multi-agent Learning via Trapping Regions [89.24858306636816]
We apply the concept of trapping regions, known from the qualitative theory of dynamical systems, to create safety sets in the joint strategy space for decentralized learning.
We propose a binary partitioning algorithm for verification that candidate sets form trapping regions in systems with known learning dynamics, and a sampling algorithm for scenarios where learning dynamics are not known.
arXiv Detail & Related papers (2023-02-27T14:47:52Z)
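A hedged sketch of a sampling-based membership check (assuming a box-shaped
candidate set and black-box access to the joint learning dynamics; the paper's
binary partitioning algorithm for known dynamics is not reproduced here):

```python
import numpy as np

def looks_like_trapping_region(box_lo, box_hi, dynamics, n_samples=1000,
                               rng=None):
    """Sampling test that a candidate axis-aligned box traps the learning
    dynamics: at sampled boundary points, the vector field must not point
    out of the box. Failure refutes the candidate; passing only suggests
    it (full verification needs an exhaustive method)."""
    if rng is None:
        rng = np.random.default_rng(0)
    d = len(box_lo)
    for _ in range(n_samples):
        x = rng.uniform(box_lo, box_hi)
        face = rng.integers(d)                 # clamp to a random face
        x[face] = box_lo[face] if rng.random() < 0.5 else box_hi[face]
        v = dynamics(x)
        if x[face] == box_lo[face] and v[face] < 0:
            return False                       # flows out the lower face
        if x[face] == box_hi[face] and v[face] > 0:
            return False                       # flows out the upper face
    return True

# Toy joint dynamics that contract toward the origin.
print(looks_like_trapping_region(np.array([-1.0, -1.0]),
                                 np.array([1.0, 1.0]), lambda x: -x))
```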
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
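A minimal sketch of the safety-projection stage of such a method, under
assumed interfaces: the safety-cost critic `q_cost(s, a)` and its gradient are
toy stand-ins, not the authors' networks.

```python
import numpy as np

def project_action(state, action, q_cost, grad_q_cost,
                   budget=0.0, lr=0.1, iters=50):
    """Safety projection: nudge the task policy's proposed action down
    the gradient of the predicted safety cost until it meets the
    state-wise budget (sketch of the projection stage only)."""
    a = np.array(action, dtype=float)
    for _ in range(iters):
        if q_cost(state, a) <= budget:
            break                        # already predicted safe
        a -= lr * grad_q_cost(state, a)  # reduce predicted safety cost
    return a

# Toy cost: actions with norm > 1 are "unsafe"; gradient is analytic here.
q_cost = lambda s, a: float(np.dot(a, a) - 1.0)
grad_q = lambda s, a: 2.0 * a
print(project_action(None, np.array([2.0, 0.0]), q_cost, grad_q))
```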
- Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of Foundation Models [103.71308117592963]
We present an algorithm for training self-destructing models leveraging techniques from meta-learning and adversarial learning.
In a small-scale experiment, we show MLAC can largely prevent a BERT-style model from being re-purposed to perform gender identification.
arXiv Detail & Related papers (2022-11-27T21:43:45Z)
- Adaptive Decision Making at the Intersection for Autonomous Vehicles Based on Skill Discovery [13.134487965031667]
In urban environments, the complex and uncertain intersection scenarios are challenging for autonomous driving.
To ensure safety, it is crucial to develop an adaptive decision making system that can handle the interaction with other vehicles.
We propose a hierarchical framework that can autonomously accumulate and reuse knowledge.
arXiv Detail & Related papers (2022-07-24T11:56:45Z)
- Verified Probabilistic Policies for Deep Reinforcement Learning [6.85316573653194]
We tackle the problem of verifying probabilistic policies for deep reinforcement learning.
We propose an abstraction approach, based on interval Markov decision processes, that yields guarantees on a policy's execution.
We present techniques to build and solve these models using abstract interpretation, mixed-integer linear programming, entropy-based refinement and probabilistic model checking.
arXiv Detail & Related papers (2022-01-10T23:55:04Z)
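To make the interval-MDP abstraction concrete, here is a simplified robust
backup for a fixed policy, assuming interval bounds on its transition matrix;
the paper's abstract-interpretation, MILP, and refinement machinery is not
reproduced.

```python
import numpy as np

def imdp_worst_case_reach(p_lo, p_hi, bad, iters=100):
    """Upper bound on the probability of reaching a 'bad' state under an
    interval MDP abstraction: each backup lets an adversary pick
    transition probabilities within [p_lo, p_hi] (summing to 1) that
    maximize reachability, yielding a guaranteed over-approximation."""
    v = bad.astype(float)                  # value 1 on bad states
    n = p_lo.shape[0]
    for _ in range(iters):
        new_v = v.copy()
        for s in range(n):
            if bad[s]:
                continue
            p = p_lo[s].copy()             # start from the lower bounds
            slack = 1.0 - p.sum()
            for t in np.argsort(-v):       # give slack to worst successors
                bump = min(p_hi[s, t] - p[t], slack)
                p[t] += bump
                slack -= bump
            new_v[s] = float(p @ v)
        v = new_v
    return v

# Toy 3-state chain with uncertain transitions; state 2 is "bad".
p_lo = np.array([[0.6, 0.1, 0.0], [0.0, 0.5, 0.2], [0.0, 0.0, 1.0]])
p_hi = np.array([[0.8, 0.3, 0.2], [0.2, 0.7, 0.5], [0.0, 0.0, 1.0]])
print(imdp_worst_case_reach(p_lo, p_hi, np.array([False, False, True])))
```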
- Safe-Critical Modular Deep Reinforcement Learning with Temporal Logic through Gaussian Processes and Control Barrier Functions [3.5897534810405403]
Reinforcement learning (RL) is a promising approach, but it has had limited success in real-world applications.
In this paper, we propose a learning-based control framework consisting of several aspects.
We show that such an ECBF-based modular deep RL algorithm achieves near-perfect success rates and guards safety with high probability.
arXiv Detail & Related papers (2021-09-07T00:51:12Z)
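A compact sketch of the barrier-function filtering idea, simplified to a
known 1D control-affine system where the correction has a closed form (the
paper instead uses exponential CBFs with GP-learned dynamics):

```python
def cbf_filter(x, u_rl, f, g, h, dh_dx, alpha=1.0):
    """Safety filter: keep the RL action if it satisfies the CBF
    condition dh/dx * (f(x) + g(x) * u) + alpha * h(x) >= 0; otherwise
    return the closest action on the constraint boundary (assumes the
    control term dh/dx * g(x) is nonzero)."""
    lfh = dh_dx(x) * f(x)   # Lie derivative along the drift
    lgh = dh_dx(x) * g(x)   # Lie derivative along the control
    if lfh + lgh * u_rl + alpha * h(x) >= 0:
        return u_rl                          # RL action already safe
    return (-alpha * h(x) - lfh) / lgh       # minimal correction

# Toy: keep x <= 1 for dynamics x' = u, i.e. h(x) = 1 - x.
print(cbf_filter(0.9, u_rl=2.0, f=lambda x: 0.0, g=lambda x: 1.0,
                 h=lambda x: 1.0 - x, dh_dx=lambda x: -1.0))
```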
- Closing the Closed-Loop Distribution Shift in Safe Imitation Learning [80.05727171757454]
We treat safe optimization-based control strategies as experts in an imitation learning problem.
We train a learned policy that can be cheaply evaluated at run-time and that provably satisfies the same safety guarantees as the expert.
arXiv Detail & Related papers (2021-02-18T05:11:41Z)
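A DAgger-style sketch of distilling a safe optimization-based expert into a
cheap run-time policy; relabeling the learner's own rollouts is what closes
the distribution shift named in the title. All components below are toy
stand-ins, not the paper's provably safe construction.

```python
import numpy as np

def dagger_distill(expert, policy_fit, dynamics, x0, rounds=3, horizon=20):
    """Roll out the current learned policy, have the expert relabel every
    visited state, and refit -- so training covers the learner's own
    closed-loop state distribution rather than only the expert's."""
    data_x, data_u = [], []
    policy = expert                        # first rollout follows the expert
    for _ in range(rounds):
        x = float(x0)
        for _ in range(horizon):
            data_x.append(x)
            data_u.append(expert(x))       # expert relabels the state
            x = dynamics(x, policy(x))     # learner drives the rollout
        policy = policy_fit(np.array(data_x), np.array(data_u))
    return policy

# Toy: a stabilizing linear expert, distilled into a least-squares gain.
def policy_fit(xs, us):
    k = float(xs @ us / (xs @ xs + 1e-8))
    return lambda x: k * x
learned = dagger_distill(lambda x: -0.5 * x, policy_fit,
                         lambda x, u: x + u, x0=1.0)
print(learned(2.0))
```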
- Guided Uncertainty-Aware Policy Optimization: Combining Learning and Model-Based Strategies for Sample-Efficient Policy Learning [75.56839075060819]
Traditional robotic approaches rely on an accurate model of the environment, a detailed description of how to perform the task, and a robust perception system to keep track of the current state.
Reinforcement learning approaches can operate directly from raw sensory inputs with only a reward signal to describe the task, but are extremely sample-inefficient and brittle.
In this work, we combine the strengths of model-based methods with the flexibility of learning-based methods to obtain a general method that is able to overcome inaccuracies in the robotics perception/actuation pipeline.
arXiv Detail & Related papers (2020-05-21T19:47:05Z)
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
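A bare-bones MPPI-style controller from the information-theoretic MPC family
the paper connects to entropy-regularized RL; the learned Q-function term of
their algorithm is omitted, and the toy problem is an assumption.

```python
import numpy as np

def mppi_action(x0, dynamics, cost, horizon=15, samples=200, lam=1.0,
                sigma=0.5, rng=None):
    """Information-theoretic MPC (MPPI-style): sample noisy action
    sequences through the (possibly biased) model, weight them by
    exp(-cost / lambda) -- the entropy-regularized aggregation -- and
    return the weighted first action."""
    if rng is None:
        rng = np.random.default_rng(0)
    noise = rng.normal(0.0, sigma, size=(samples, horizon))
    costs = np.zeros(samples)
    for k in range(samples):
        x = x0
        for t in range(horizon):
            x = dynamics(x, noise[k, t])
            costs[k] += cost(x)
    w = np.exp(-(costs - costs.min()) / lam)   # softmin weights
    return float(w @ noise[:, 0] / w.sum())

# Toy: drive x toward 0 under x' = x + u with quadratic state cost.
print(mppi_action(2.0, lambda x, u: x + u, lambda x: x * x))
```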
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.