Control invariant set enhanced safe reinforcement learning: improved
sampling efficiency, guaranteed stability and robustness
- URL: http://arxiv.org/abs/2305.15602v1
- Date: Wed, 24 May 2023 22:22:19 GMT
- Title: Control invariant set enhanced safe reinforcement learning: improved
sampling efficiency, guaranteed stability and robustness
- Authors: Song Bo, Bernard T. Agyeman, Xunyuan Yin, Jinfeng Liu (University of
Alberta)
- Abstract summary: This work proposes a novel approach to RL training, called control invariant set (CIS) enhanced RL.
The robustness of the proposed approach is investigated in the presence of uncertainty.
Results show a significant improvement in sampling efficiency during offline training and a closed-loop stability guarantee in the online implementation.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning (RL) is an area of significant research interest, and
safe RL in particular is attracting attention due to its ability to handle
safety-driven constraints that are crucial for real-world applications. This
work proposes a novel approach to RL training, called control invariant set
(CIS) enhanced RL, which leverages the advantages of utilizing the explicit
form of CIS to improve stability guarantees and sampling efficiency.
Furthermore, the robustness of the proposed approach is investigated in the
presence of uncertainty. The approach consists of two learning stages: offline
and online. In the offline stage, CIS is incorporated into the reward design,
initial state sampling, and state reset procedures. This incorporation of CIS
facilitates improved sampling efficiency during the offline training process.
In the online stage, a Safety Supervisor is introduced to examine the safety of each
action and make necessary corrections; the RL agent is retrained whenever the
predicted next-step state falls outside the CIS, which serves as the stability
criterion. The stability analysis is conducted for both cases, with and
without uncertainty. To evaluate the proposed approach, we apply it to a
simulated chemical reactor. The results show a significant improvement in
sampling efficiency during offline training and a closed-loop stability guarantee
in the online implementation, with and without uncertainty.
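As a rough illustration of the offline stage only, the sketch below (in Python; the box-shaped cis_contains membership test, the sample_initial_state sampler, the penalty weight, and the env.step/policy interfaces are all assumptions, not the authors' implementation) shows one way an explicit CIS could enter the reward design, initial state sampling, and state reset procedures described in the abstract.

    import numpy as np

    def cis_contains(state):
        """Hypothetical membership test for the explicit control invariant set (CIS).
        A simple box is used as a stand-in for the paper's explicit CIS."""
        lower, upper = np.array([0.0, 300.0]), np.array([1.0, 400.0])
        return bool(np.all(state >= lower) and np.all(state <= upper))

    def sample_initial_state(rng):
        """Offline stage: sample initial states from inside the CIS (box stand-in)."""
        lower, upper = np.array([0.0, 300.0]), np.array([1.0, 400.0])
        return rng.uniform(lower, upper)

    def shaped_reward(next_state, setpoint):
        """Offline stage: CIS-aware reward design.
        Tracking reward plus a penalty whenever the successor state leaves the CIS."""
        tracking = -float(np.linalg.norm(next_state - setpoint))
        penalty = 0.0 if cis_contains(next_state) else -100.0  # assumed penalty weight
        return tracking + penalty

    def offline_episode(env, policy, setpoint, rng, max_steps=200):
        """One offline training episode: reset back into the CIS whenever a
        transition leaves it, instead of terminating the episode."""
        state = sample_initial_state(rng)
        transitions = []
        for _ in range(max_steps):
            action = policy(state)
            next_state = env.step(state, action)  # assumed one-step simulator interface
            reward = shaped_reward(next_state, setpoint)
            transitions.append((state, action, reward, next_state))
            # State reset procedure: if the CIS is left, restart from a safe state.
            state = next_state if cis_contains(next_state) else sample_initial_state(rng)
        return transitions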
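The online stage can likewise be read, again only as a sketch under assumed interfaces, as a Safety Supervisor wrapped around the trained policy: the predicted next-step state is checked against the CIS, an unsafe action is corrected (here by a hypothetical backup controller), and the event is flagged so the agent can be retrained.

    def safety_supervisor(state, action, predict_next, backup_controller):
        """Online stage sketch: examine the proposed action and correct it if the
        predicted next-step state falls outside the CIS (the stability criterion).
        predict_next and backup_controller are assumed, not taken from the paper;
        cis_contains is reused from the offline sketch above."""
        if cis_contains(predict_next(state, action)):
            return action, False                 # action is safe; no retraining needed
        safe_action = backup_controller(state)   # necessary correction
        return safe_action, True                 # flag that retraining should be triggered

    def online_control_loop(initial_state, env, policy, predict_next,
                            backup_controller, retrain, steps=100):
        state = initial_state
        for _ in range(steps):
            proposed = policy(state)
            action, unsafe = safety_supervisor(state, proposed,
                                               predict_next, backup_controller)
            if unsafe:
                retrain(policy, state, proposed)  # retrain the RL agent on the flagged event
            state = env.step(state, action)       # assumed one-step plant/simulator interface
        return state

In this reading, the single CIS check does double duty as both the correction trigger and the retraining trigger, which matches the abstract's description of the CIS as the stability criterion.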
Related papers
- SALSA-RL: Stability Analysis in the Latent Space of Actions for Reinforcement Learning [2.7075926292355286]
We propose SALSA-RL (Stability Analysis in the Latent Space of Actions), a novel RL framework that models control actions as dynamic, time-dependent variables evolving within a latent space.
By employing a pre-trained encoder-decoder and a state-dependent linear system, our approach enables both stability analysis and interpretability.
arXiv Detail & Related papers (2025-02-21T15:09:39Z)
- Control invariant set enhanced reinforcement learning for process control: improved sampling efficiency and guaranteed stability [0.0]
This work proposes a novel approach to RL training, called control invariant set (CIS) enhanced RL.
The approach consists of two learning stages: offline and online.
The results show a significant improvement in sampling efficiency during offline training and closed-loop stability in the online implementation.
arXiv Detail & Related papers (2023-04-11T21:27:36Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Safe Model-Based Reinforcement Learning with an Uncertainty-Aware Reachability Certificate [6.581362609037603]
We build a safe reinforcement learning framework to resolve constraints required by the DRC and its corresponding shield policy.
We also devise a line search method to maintain safety and reach higher returns simultaneously while leveraging the shield policy.
arXiv Detail & Related papers (2022-10-14T06:16:53Z)
- Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning [72.97229770329214]
We introduce a general approach for solving high-dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach called LBSGD is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach on minimizing constraint violations in policy optimization tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z)
- Safe Reinforcement Learning via Confidence-Based Filters [78.39359694273575]
We develop a control-theoretic approach for certifying state safety constraints for nominal policies learned via standard reinforcement learning techniques.
We provide formal safety guarantees, and empirically demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2022-07-04T11:43:23Z)
- KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed Stability in Nonlinear Dynamical Systems [66.9461097311667]
We propose a model-based reinforcement learning framework with formal stability guarantees.
The proposed method learns the system dynamics up to a confidence interval using feature representation.
We show that KCRL is guaranteed to learn a stabilizing policy in a finite number of interactions with the underlying unknown system.
arXiv Detail & Related papers (2022-06-03T17:27:04Z)
- Lyapunov-based uncertainty-aware safe reinforcement learning [0.0]
Reinforcement learning (RL) has shown promising performance in learning optimal policies for a variety of sequential decision-making tasks.
In many real-world RL problems, besides optimizing the main objectives, the agent is expected to satisfy a certain level of safety.
We propose a Lyapunov-based uncertainty-aware safe RL model to address these limitations.
arXiv Detail & Related papers (2021-07-29T13:08:15Z)
- Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning [63.53407136812255]
Offline Reinforcement Learning promises to learn effective policies from previously-collected, static datasets without the need for exploration.
Existing Q-learning and actor-critic based off-policy RL algorithms fail when bootstrapping from out-of-distribution (OOD) actions or states.
We propose Uncertainty Weighted Actor-Critic (UWAC), an algorithm that detects OOD state-action pairs and down-weights their contribution in the training objectives accordingly.
arXiv Detail & Related papers (2021-05-17T20:16:46Z)
- Conservative Safety Critics for Exploration [120.73241848565449]
We study the problem of safe exploration in reinforcement learning (RL).
We learn a conservative safety estimate of environment states through a critic.
We show that the proposed approach can achieve competitive task performance while incurring significantly lower catastrophic failure rates.
arXiv Detail & Related papers (2020-10-27T17:54:25Z)