Specialized Deep Residual Policy Safe Reinforcement Learning-Based
Controller for Complex and Continuous State-Action Spaces
- URL: http://arxiv.org/abs/2310.14788v1
- Date: Sun, 15 Oct 2023 21:53:23 GMT
- Title: Specialized Deep Residual Policy Safe Reinforcement Learning-Based
Controller for Complex and Continuous State-Action Spaces
- Authors: Ammar N. Abbas, Georgios C. Chasparis, and John D. Kelleher
- Abstract summary: It is impractical to explore randomly, and replacing conventional controllers with black-box models is also undesirable.
We propose a specialized deep residual policy safe reinforcement learning with a cycle of learning approach adapted for complex and continuous state-action spaces.
- Score: 1.3678669691302048
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional controllers have limitations as they rely on prior knowledge
about the physics of the problem, require modeling of dynamics, and struggle to
adapt to abnormal situations. Deep reinforcement learning has the potential to
address these problems by learning optimal control policies through exploration
in an environment. For safety-critical environments, it is impractical to
explore randomly, and replacing conventional controllers with black-box models
is also undesirable. Also, it is expensive in continuous state and action
spaces, unless the search space is constrained. To address these challenges we
propose a specialized deep residual policy safe reinforcement learning with a
cycle of learning approach adapted for complex and continuous state-action
spaces. Residual policy learning allows learning a hybrid control architecture
where the reinforcement learning agent acts in synchronous collaboration with
the conventional controller. The cycle of learning initiates the policy through
the expert trajectory and guides the exploration around it. Further, the
specialization through the input-output hidden Markov model helps to optimize
policy that lies within the region of interest (such as abnormality), where the
reinforcement learning agent is required and is activated. The proposed
solution is validated on the Tennessee Eastman process control.
Related papers
- RACER: Epistemic Risk-Sensitive RL Enables Fast Driving with Fewer Crashes [57.319845580050924]
We propose a reinforcement learning framework that combines risk-sensitive control with an adaptive action space curriculum.
We show that our algorithm is capable of learning high-speed policies for a real-world off-road driving task.
arXiv Detail & Related papers (2024-05-07T23:32:36Z) - Integrating DeepRL with Robust Low-Level Control in Robotic Manipulators for Non-Repetitive Reaching Tasks [0.24578723416255746]
In robotics, contemporary strategies are learning-based, characterized by a complex black-box nature and a lack of interpretability.
We propose integrating a collision-free trajectory planner based on deep reinforcement learning (DRL) with a novel auto-tuning low-level control strategy.
arXiv Detail & Related papers (2024-02-04T15:54:03Z) - Hierarchical Framework for Interpretable and Probabilistic Model-Based
Safe Reinforcement Learning [1.3678669691302048]
This paper proposes a novel approach for the use of deep reinforcement learning in safety-critical systems.
It combines the advantages of probabilistic modeling and reinforcement learning with the added benefits of interpretability.
arXiv Detail & Related papers (2023-10-28T20:30:57Z) - Safe Multi-agent Learning via Trapping Regions [89.24858306636816]
We apply the concept of trapping regions, known from qualitative theory of dynamical systems, to create safety sets in the joint strategy space for decentralized learning.
We propose a binary partitioning algorithm for verification that candidate sets form trapping regions in systems with known learning dynamics, and a sampling algorithm for scenarios where learning dynamics are not known.
arXiv Detail & Related papers (2023-02-27T14:47:52Z) - Safety Correction from Baseline: Towards the Risk-aware Policy in
Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z) - Learning Variable Impedance Control for Aerial Sliding on Uneven
Heterogeneous Surfaces by Proprioceptive and Tactile Sensing [42.27572349747162]
We present a learning-based adaptive control strategy for aerial sliding tasks.
The proposed controller structure combines data-driven and model-based control methods.
Compared to fine-tuned state of the art interaction control methods we achieve reduced tracking error and improved disturbance rejection.
arXiv Detail & Related papers (2022-06-28T16:28:59Z) - Learning Robust Policy against Disturbance in Transition Dynamics via
State-Conservative Policy Optimization [63.75188254377202]
Deep reinforcement learning algorithms can perform poorly in real-world tasks due to discrepancy between source and target environments.
We propose a novel model-free actor-critic algorithm to learn robust policies without modeling the disturbance in advance.
Experiments in several robot control tasks demonstrate that SCPO learns robust policies against the disturbance in transition dynamics.
arXiv Detail & Related papers (2021-12-20T13:13:05Z) - Learning Barrier Certificates: Towards Safe Reinforcement Learning with
Zero Training-time Violations [64.39401322671803]
This paper explores the possibility of safe RL algorithms with zero training-time safety violations.
We propose an algorithm, Co-trained Barrier Certificate for Safe RL (CRABS), which iteratively learns barrier certificates, dynamics models, and policies.
arXiv Detail & Related papers (2021-08-04T04:59:05Z) - Residual Feedback Learning for Contact-Rich Manipulation Tasks with
Uncertainty [22.276925045008788]
emphglsrpl offers a formulation to improve existing controllers with reinforcement learning (RL)
We show superior performance of our approach on a contact-rich peg-insertion task under position and orientation uncertainty.
arXiv Detail & Related papers (2021-06-08T13:06:35Z) - Closing the Closed-Loop Distribution Shift in Safe Imitation Learning [80.05727171757454]
We treat safe optimization-based control strategies as experts in an imitation learning problem.
We train a learned policy that can be cheaply evaluated at run-time and that provably satisfies the same safety guarantees as the expert.
arXiv Detail & Related papers (2021-02-18T05:11:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.