Control invariant set enhanced reinforcement learning for process control: improved sampling efficiency and guaranteed stability
- URL: http://arxiv.org/abs/2304.05509v1
- Date: Tue, 11 Apr 2023 21:27:36 GMT
- Title: Control invariant set enhanced reinforcement learning for process control: improved sampling efficiency and guaranteed stability
- Authors: Song Bo, Xunyuan Yin, Jinfeng Liu (University of Alberta)
- Abstract summary: This work proposes a novel approach to RL training, called control invariant set (CIS) enhanced RL.
The approach consists of two learning stages: offline and online.
The results show a significant improvement in sampling efficiency during offline training and closed-loop stability in the online implementation.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement learning (RL) is an area of significant research interest, and
safe RL in particular is attracting attention due to its ability to handle
safety-driven constraints that are crucial for real-world applications of RL
algorithms. This work proposes a novel approach to RL training, called control
invariant set (CIS) enhanced RL, which leverages the benefits of CIS to improve
stability guarantees and sampling efficiency. The approach consists of two
learning stages: offline and online. In the offline stage, CIS is incorporated
into the reward design, initial state sampling, and state reset procedures. In
the online stage, the RL agent is retrained whenever the state is outside of the CIS, which
serves as a stability criterion. A backup table that utilizes the explicit form
of the CIS is constructed to ensure online stability. To evaluate the proposed
approach, we apply it to a simulated chemical reactor. The results show a
significant improvement in sampling efficiency during offline training and
closed-loop stability in the online implementation.
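As a concrete reading of the two-stage procedure, the following Python sketch shows how a CIS could enter the reward design, initial-state sampling, and state-reset steps of the offline stage, and how a backup table could take over online when the state leaves the CIS. The box-shaped CIS approximation, the penalty value, and the `env`, `agent`, and `backup_table` interfaces are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Minimal sketch of CIS-enhanced RL training. All interfaces below are
# illustrative placeholders, not those defined in the paper.

def in_cis(state, cis_lower, cis_upper):
    # Membership test for a box-shaped CIS approximation (assumption).
    return bool(np.all(state >= cis_lower) and np.all(state <= cis_upper))

def sample_state_in_cis(cis_lower, cis_upper, rng):
    # Initial state sampling restricted to the CIS.
    return rng.uniform(cis_lower, cis_upper)

def cis_shaped_reward(next_state, base_reward, cis_lower, cis_upper, penalty=-100.0):
    # Reward design: keep the nominal tracking reward inside the CIS and
    # apply a large penalty once the state leaves it.
    return base_reward if in_cis(next_state, cis_lower, cis_upper) else penalty

def offline_episode(env, agent, cis_lower, cis_upper, rng, max_steps=200):
    # Offline stage: CIS used for initial sampling, reward, and state resets.
    state = env.reset(initial_state=sample_state_in_cis(cis_lower, cis_upper, rng))
    for _ in range(max_steps):
        action = agent.act(state)
        next_state, base_reward, done = env.step(action)
        reward = cis_shaped_reward(next_state, base_reward, cis_lower, cis_upper)
        agent.update(state, action, reward, next_state, done)
        if done:
            break
        if in_cis(next_state, cis_lower, cis_upper):
            state = next_state
        else:
            # State reset procedure: restart from inside the CIS instead of
            # spending samples on states the controller cannot recover from.
            state = env.reset(initial_state=sample_state_in_cis(cis_lower, cis_upper, rng))

def online_action(state, agent, backup_table, cis_lower, cis_upper):
    # Online stage: the CIS acts as a stability criterion. If the state leaves
    # the CIS, fall back to a precomputed backup-table action and flag the
    # agent for retraining.
    if in_cis(state, cis_lower, cis_upper):
        return agent.act(state), False
    return backup_table.lookup(state), True  # True -> trigger retraining
```

The reset-into-CIS step is the part most directly tied to the reported sampling-efficiency gain: offline trajectories do not linger in regions from which no admissible controller can recover.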
Related papers
- SALSA-RL: Stability Analysis in the Latent Space of Actions for Reinforcement Learning [2.7075926292355286]
We propose SALSA-RL (Stability Analysis in the Latent Space of Actions), a novel RL framework that models control actions as dynamic, time-dependent variables evolving within a latent space.
By employing a pre-trained encoder-decoder and a state-dependent linear system, our approach enables both stability analysis and interpretability.
arXiv Detail & Related papers (2025-02-21T15:09:39Z)
- Reward-Safety Balance in Offline Safe RL via Diffusion Regularization [16.5825143820431]
Constrained reinforcement learning (RL) seeks high-performance policies under safety constraints.
We propose Diffusion-Regularized Constrained Offline Reinforcement Learning (DRCORL).
DRCORL first uses a diffusion model to capture the behavioral policy from offline data and then extracts a simplified policy to enable efficient inference.
arXiv Detail & Related papers (2025-02-18T00:00:03Z)
- SPEQ: Offline Stabilization Phases for Efficient Q-Learning in High Update-To-Data Ratio Reinforcement Learning [51.10866035483686]
High update-to-data (UTD) ratio algorithms in reinforcement learning (RL) improve sample efficiency but incur high computational costs, limiting real-world scalability.
We propose Offline Stabilization Phases for Efficient Q-Learning (SPEQ), an RL algorithm that combines low-UTD online training with periodic offline stabilization phases.
During these phases, Q-functions are fine-tuned with high UTD ratios on a fixed replay buffer, reducing redundant updates on suboptimal data.
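Taken at face value, the SPEQ summary above describes alternating cheap online updates with dense offline Q-function fine-tuning on a frozen buffer. A rough Python sketch of that pattern, with a hypothetical `env`/`agent`/`replay_buffer` interface and placeholder hyperparameters, might look like:

```python
# Rough sketch of low-UTD online training interleaved with periodic high-UTD
# offline stabilization phases, in the spirit of the SPEQ summary above.
# `env`, `agent`, and `replay_buffer` are hypothetical interfaces and the
# hyperparameters are placeholders, not values from the paper.

def train_with_stabilization(env, agent, replay_buffer,
                             total_steps=100_000, stabilize_every=5_000,
                             online_utd=1, offline_updates=25_000,
                             batch_size=256):
    state = env.reset()
    for step in range(total_steps):
        action = agent.act(state)
        next_state, reward, done = env.step(action)
        replay_buffer.add(state, action, reward, next_state, done)
        state = env.reset() if done else next_state

        # Low update-to-data ratio while interacting with the environment.
        for _ in range(online_utd):
            agent.update(replay_buffer.sample(batch_size))

        # Periodic offline stabilization phase: many Q-function updates on
        # the fixed replay buffer, without collecting new data.
        if (step + 1) % stabilize_every == 0:
            for _ in range(offline_updates):
                agent.update_q_only(replay_buffer.sample(batch_size))
```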
arXiv Detail & Related papers (2025-01-15T09:04:19Z)
- Hybrid Reinforcement Learning for Optimizing Pump Sustainability in Real-World Water Distribution Networks [55.591662978280894]
This article addresses the pump-scheduling optimization problem to enhance real-time control of real-world water distribution networks (WDNs).
Our primary objectives are to adhere to physical operational constraints while reducing energy consumption and operational costs.
Traditional optimization techniques, such as evolution-based and genetic algorithms, often fall short due to their lack of convergence guarantees.
arXiv Detail & Related papers (2023-10-13T21:26:16Z)
- Control invariant set enhanced safe reinforcement learning: improved sampling efficiency, guaranteed stability and robustness [0.0]
This work proposes a novel approach to RL training, called control invariant set (CIS) enhanced RL.
The robustness of the proposed approach is investigated in the presence of uncertainty.
Results show a significant improvement in sampling efficiency during offline training and closed-loop stability guarantee in the online implementation.
arXiv Detail & Related papers (2023-05-24T22:22:19Z)
- Robust Reinforcement Learning in Continuous Control Tasks with Uncertainty Set Regularization [17.322284328945194]
Reinforcement learning (RL) is recognized as lacking generalization and robustness under environmental perturbations.
We propose a new regularizer named the Uncertainty Set Regularizer (USR).
arXiv Detail & Related papers (2022-07-05T12:56:08Z)
- Safe Reinforcement Learning via Confidence-Based Filters [78.39359694273575]
We develop a control-theoretic approach for certifying state safety constraints for nominal policies learned via standard reinforcement learning techniques.
We provide formal safety guarantees, and empirically demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2022-07-04T11:43:23Z)
- RORL: Robust Offline Reinforcement Learning via Conservative Smoothing [72.8062448549897]
Offline reinforcement learning can exploit the massive amount of offline data for complex decision-making tasks.
Current offline RL algorithms are generally designed to be conservative for value estimation and action selection.
We propose Robust Offline Reinforcement Learning (RORL) with a novel conservative smoothing technique.
arXiv Detail & Related papers (2022-06-06T18:07:41Z)
- KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed Stability in Nonlinear Dynamical Systems [66.9461097311667]
We propose a model-based reinforcement learning framework with formal stability guarantees.
The proposed method learns the system dynamics up to a confidence interval using a feature representation.
We show that KCRL is guaranteed to learn a stabilizing policy in a finite number of interactions with the underlying unknown system.
arXiv Detail & Related papers (2022-06-03T17:27:04Z)
- Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning [63.53407136812255]
Offline Reinforcement Learning promises to learn effective policies from previously-collected, static datasets without the need for exploration.
Existing Q-learning and actor-critic based off-policy RL algorithms fail when bootstrapping from out-of-distribution (OOD) actions or states.
We propose Uncertainty Weighted Actor-Critic (UWAC), an algorithm that detects OOD state-action pairs and down-weights their contribution in the training objectives accordingly.
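The down-weighting described in the UWAC summary above can be sketched as a modified critic loss in which the Bellman error of each state-action pair is scaled by the inverse of its estimated target-Q uncertainty. The dropout-based uncertainty estimate, the `beta` scale, and the clipping below are illustrative assumptions rather than the paper's exact formulation.

```python
import torch

# Sketch of an uncertainty-weighted critic update in the spirit of UWAC:
# Bellman errors on state-action pairs whose target-Q estimate is uncertain
# are down-weighted. `q_net` and `target_q_net` are hypothetical critics
# taking (state, action) batches; hyperparameters are placeholders.

def target_q_uncertainty(target_q_net, next_state, next_action, n_samples=20):
    # Epistemic uncertainty from the variance of stochastic (dropout-enabled)
    # forward passes through the target critic.
    target_q_net.train()  # keep dropout active for the sampling passes
    qs = torch.stack([target_q_net(next_state, next_action)
                      for _ in range(n_samples)])
    return qs.var(dim=0)

def uncertainty_weighted_critic_loss(q_net, target_q_net, batch,
                                     gamma=0.99, beta=1.0, max_weight=1.5):
    state, action, reward, next_state, next_action, done = batch
    with torch.no_grad():
        weight = beta / (target_q_uncertainty(target_q_net,
                                              next_state, next_action) + 1e-6)
        weight = torch.clamp(weight, max=max_weight)  # keep loss scale bounded
        target_q_net.eval()  # deterministic pass for the Bellman target itself
        target = reward + gamma * (1.0 - done) * target_q_net(next_state, next_action)
    td_error = q_net(state, action) - target
    return (weight * td_error.pow(2)).mean()
```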
arXiv Detail & Related papers (2021-05-17T20:16:46Z)
- Reinforcement Learning Control of Constrained Dynamic Systems with Uniformly Ultimate Boundedness Stability Guarantee [12.368097742148128]
Reinforcement learning (RL) is promising for complicated nonlinear control problems.
The data-based learning approach is notorious for not guaranteeing stability, which is the most fundamental property for any control system.
In this paper, the classic Lyapunov method is explored to analyze uniformly ultimate boundedness (UUB) stability solely based on data.
arXiv Detail & Related papers (2020-11-13T12:41:56Z)
- Remote Electrical Tilt Optimization via Safe Reinforcement Learning [1.2599533416395765]
Remote Electrical Tilt (RET) optimization is an efficient method for adjusting the vertical tilt angle of Base Stations (BSs) antennas in order to optimize Key Performance Indicators (KPIs) of the network.
In this work, we model the RET optimization problem in the Safe Reinforcement Learning (SRL) framework with the goal of learning a tilt control strategy.
Our experiments show that the proposed approach is able to learn a safe and improved tilt update policy, providing a higher degree of reliability and potential for real-world network deployment.
arXiv Detail & Related papers (2020-10-12T16:46:40Z)
- Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained proximal policy optimization (CPPO).
We show that guided constrained RL offers faster convergence close to the desired optimum resulting in an optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.