Two-step reinforcement learning for model-free redesign of nonlinear
optimal regulator
- URL: http://arxiv.org/abs/2103.03808v4
- Date: Thu, 30 Nov 2023 18:07:34 GMT
- Title: Two-step reinforcement learning for model-free redesign of nonlinear
optimal regulator
- Authors: Mei Minami, Yuka Masumoto, Yoshihiro Okawa, Tomotake Sasaki, Yutaka
Hori
- Abstract summary: Reinforcement learning (RL) is one of the promising approaches that enable model-free redesign of optimal controllers for nonlinear dynamical systems.
We propose a model-free two-step design approach that improves the transient learning performance of RL in an optimal regulator redesign problem for unknown nonlinear systems.
- Score: 1.5624421399300306
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In many practical control applications, the performance of a
closed-loop system degrades over time due to changes in plant
characteristics. There is therefore a strong need to redesign the controller
without going through a system modeling process, which is often difficult for
closed-loop systems. Reinforcement learning (RL) is one of the promising
approaches that enable model-free redesign of optimal controllers for nonlinear
dynamical systems based only on measurements of the closed-loop system.
However, the learning process of RL usually requires a considerable number of
trial-and-error experiments with the poorly controlled system, which may
accumulate wear on the plant. To overcome this limitation, we propose a
model-free two-step design approach that improves the transient learning
performance of RL in an optimal regulator redesign problem for unknown
nonlinear systems. Specifically, we first design a linear control law that
attains some degree of control performance in a model-free manner, and then
train the nonlinear optimal control law with online RL while the designed
linear control law runs in parallel. We introduce an offline RL algorithm for
the design of the linear control law and theoretically guarantee its
convergence to the LQR controller under mild assumptions. Numerical simulations
show that the proposed approach improves both the transient learning
performance and the efficiency of hyperparameter tuning in RL.
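The abstract does not spell out the algorithmic details, but the two-step structure can be illustrated with a minimal numpy sketch. Below, a least-squares policy iteration on logged transitions stands in for the paper's offline RL design of the linear gain, and a simple perturbation-based (SPSA-style) episode-cost update stands in for the online RL of the nonlinear term; the pendulum plant, feature choices, and hyperparameters are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical plant (unknown to the designer; used here only to generate
# measurements): a damped pendulum under forward-Euler discretization.
def plant(x, u, dt=0.05):
    return np.array([x[0] + dt * x[1],
                     x[1] + dt * (-np.sin(x[0]) - 0.1 * x[1] + u[0])])

def stage_cost(x, u, Q=np.eye(2), R=np.eye(1)):
    return x @ Q @ x + u @ R @ u

# ---- Step 1: model-free design of the linear control law. ----
# Least-squares policy iteration on a quadratic Q-function, fitted from
# logged (x, u, x') transitions; schemes of this type converge toward the
# LQR gain under sufficient excitation.
def offline_linear_gain(data, gamma=0.95, n_iters=15, n=2, m=1):
    K = np.zeros((m, n))
    for _ in range(n_iters):
        Phi, y = [], []
        for x, u, xn in data:
            z = np.concatenate([x, u])
            zn = np.concatenate([xn, -K @ xn])   # next action of current policy
            Phi.append(np.outer(z, z).ravel() - gamma * np.outer(zn, zn).ravel())
            y.append(stage_cost(x, u))
        h = np.linalg.lstsq(np.asarray(Phi), np.asarray(y), rcond=None)[0]
        H = h.reshape(n + m, n + m)
        H = 0.5 * (H + H.T)                        # symmetrize the Q-matrix
        K = np.linalg.solve(H[n:, n:], H[n:, :n])  # greedy policy improvement
    return K

# Logged exploratory data from the poorly controlled closed loop.
data, x = [], rng.normal(size=2)
for _ in range(2000):
    u = rng.normal(size=1)
    xn = plant(x, u)
    data.append((x, u, xn))
    x = xn if np.all(np.abs(xn) < 5.0) else rng.normal(size=2)

K = offline_linear_gain(data)

# ---- Step 2: online RL with the linear law running in parallel. ----
# The applied input is u = -K x + pi_theta(x); an SPSA-style episode-cost
# update stands in for the paper's online RL algorithm.
def pi(theta, x):
    return np.array([theta @ np.array([x[0], x[1], np.sin(x[0]), x[0] * x[1]])])

def episode_cost(theta, T=100):
    x, J = np.array([1.0, 0.0]), 0.0
    for _ in range(T):
        u = -K @ x + pi(theta, x)
        J += stage_cost(x, u)
        x = plant(x, u)
    return J

theta, eps, lr = np.zeros(4), 0.05, 1e-4
for _ in range(200):
    d = rng.choice([-1.0, 1.0], size=4)
    grad = (episode_cost(theta + eps * d) - episode_cost(theta - eps * d)) / (2 * eps) * d
    theta -= lr * grad
```

The point of the structure is that the plant is never driven by the untrained nonlinear policy alone: the fixed gain K carries the transient while theta is tuned online.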
Related papers
- Model-Free Load Frequency Control of Nonlinear Power Systems Based on
Deep Reinforcement Learning [29.643278858113266]
This paper proposes a model-free LFC method for nonlinear power systems based on the deep deterministic policy gradient (DDPG) framework.
The learned controller generates appropriate control actions and adapts well to nonlinear power systems.
arXiv Detail & Related papers (2024-03-07T10:06:46Z)
- Optimal Exploration for Model-Based RL in Nonlinear Systems [14.540210895533937]
Learning to control unknown nonlinear dynamical systems is a fundamental problem in reinforcement learning and control theory.
We develop an algorithm able to efficiently explore the system to reduce uncertainty in a task-dependent metric.
Our algorithm relies on a general reduction from policy optimization to optimal experiment design in arbitrary systems, and may be of independent interest.
arXiv Detail & Related papers (2023-06-15T15:47:50Z)
- Bridging Model-based Safety and Model-free Reinforcement Learning
through System Identification of Low Dimensional Linear Models [16.511440197186918]
We propose a new method to combine model-based safety with model-free reinforcement learning.
We show that a low-dimensional dynamical model is sufficient to capture the dynamics of the closed-loop system.
We illustrate that the identified linear model can provide guarantees through a safety-critical optimal control framework. A sketch of the identification step follows below.
arXiv Detail & Related papers (2022-05-11T22:03:18Z)
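As a rough illustration of the system-identification step named in the title, a least-squares fit of a low-dimensional linear model to closed-loop logs could look as follows; the dimensions, names, and synthetic data are assumptions for the sketch, not the paper's setup.

```python
import numpy as np

def fit_linear_model(X, U, Xn):
    """Least-squares fit of x_{k+1} ~ A x_k + B u_k from closed-loop logs.

    X, Xn: (N, n) arrays of current / next states; U: (N, m) inputs.
    """
    Z = np.hstack([X, U])                      # regressors [x_k, u_k]
    Theta, *_ = np.linalg.lstsq(Z, Xn, rcond=None)
    n = X.shape[1]
    return Theta[:n].T, Theta[n:].T            # A (n, n), B (n, m)

# Usage on synthetic logs (stand-in for real closed-loop measurements):
rng = np.random.default_rng(1)
A_true = np.array([[0.9, 0.1], [0.0, 0.8]])
B_true = np.array([[0.0], [0.5]])
X = rng.normal(size=(500, 2))
U = rng.normal(size=(500, 1))
Xn = X @ A_true.T + U @ B_true.T
A_hat, B_hat = fit_linear_model(X, U, Xn)      # recovers A_true, B_true
```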
- Comparative analysis of machine learning methods for active flow control [60.53767050487434]
Genetic Programming (GP) and Reinforcement Learning (RL) are gaining popularity in flow control.
This work presents a comparative analysis of the two, benchmarking some of their most representative algorithms against global optimization techniques.
arXiv Detail & Related papers (2022-02-23T18:11:19Z)
- Deep Learning Explicit Differentiable Predictive Control Laws for
Buildings [1.4121977037543585]
We present a differentiable predictive control (DPC) methodology for learning constrained control laws for unknown nonlinear systems.
DPC poses an approximate solution to multiparametric programming problems emerging from explicit nonlinear model predictive control (MPC); see the sketch after this entry.
arXiv Detail & Related papers (2021-07-25T16:47:57Z)
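The DPC idea, offline gradient descent on the rollout cost of a parametric explicit control law through a differentiable system model, can be sketched as below. Actual DPC implementations rely on automatic differentiation; finite differences stand in for it here, and the toy surrogate model, features, and input saturation are illustrative assumptions.

```python
import numpy as np

# Toy surrogate model of the (unknown) plant, e.g. obtained by system
# identification; DPC differentiates the rollout cost through such a model.
def model(x, u):
    return np.array([x[0] + 0.1 * x[1],
                     x[1] + 0.1 * (u[0] - np.sin(x[0]))])

def rollout_cost(theta, T=50):
    # Parametric explicit law u = theta @ features(x), with a simple input
    # constraint handled by saturation.
    J, x = 0.0, np.array([1.0, 0.0])
    for _ in range(T):
        u = np.clip(theta @ np.array([x[0], x[1], np.sin(x[0])]), -2.0, 2.0)
        J += x @ x + 0.1 * u ** 2
        x = model(x, np.array([u]))
    return J

# Gradient descent on the rollout cost; finite differences stand in for the
# automatic differentiation used in actual DPC implementations.
theta = np.zeros(3)
for _ in range(300):
    g = np.array([(rollout_cost(theta + 1e-4 * e) - rollout_cost(theta - 1e-4 * e)) / 2e-4
                  for e in np.eye(3)])
    theta -= 1e-3 * g
```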
- Stable Online Control of Linear Time-Varying Systems [49.41696101740271]
COCO-LQ is an efficient online control algorithm that guarantees input-to-state stability for a large class of LTV systems.
We empirically demonstrate the performance of COCO-LQ in both synthetic experiments and a power system frequency control example.
arXiv Detail & Related papers (2021-04-29T06:18:49Z)
- Anticipating the Long-Term Effect of Online Learning in Control [75.6527644813815]
AntLer is a design algorithm for learning-based control laws that anticipates learning.
We show that AntLer approximates an optimal solution arbitrarily accurately with probability one.
arXiv Detail & Related papers (2020-07-24T07:00:14Z)
- Reduced-Dimensional Reinforcement Learning Control using Singular
Perturbation Approximations [9.136645265350284]
We present a set of model-free, reduced-dimensional reinforcement learning based optimal control designs for linear time-invariant singularly perturbed (SP) systems.
We first present a state-feedback and output-feedback based RL control design for a generic SP system with unknown state and input matrices.
We extend both designs to clustered multi-agent consensus networks, where the SP property is reflected through the clustering.
arXiv Detail & Related papers (2020-04-29T22:15:54Z)
- Logarithmic Regret Bound in Partially Observable Linear Dynamical
Systems [91.43582419264763]
We study the problem of system identification and adaptive control in partially observable linear dynamical systems.
We present the first model estimation method with finite-time guarantees in both open and closed-loop system identification.
We show that AdaptOn is the first algorithm that achieves $\mathrm{polylog}(T)$ regret in adaptive control of unknown partially observable linear dynamical systems.
arXiv Detail & Related papers (2020-03-25T06:00:33Z)
- Adaptive Control and Regret Minimization in Linear Quadratic Gaussian
(LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
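The connection named in this entry, the exponentiated-cost (softmin) re-weighting shared by information-theoretic MPC and entropy-regularized RL, can be sketched with a minimal MPPI-style update (MPPI being a canonical information-theoretic MPC). The dynamics model, cost, and parameters below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

def model(x, u):                        # surrogate dynamics (possibly biased)
    return np.array([x[0] + 0.1 * x[1], x[1] + 0.1 * u])

def cost(x, u):
    return x @ x + 0.01 * u ** 2

# One information-theoretic MPC (MPPI-style) step: sample perturbed input
# sequences, score rollouts, and re-weight the plan with a softmin -- the
# same exponentiated-cost weighting that appears in entropy-regularized RL.
def mppi_step(x0, u_nom, n_samples=256, sigma=0.5, lam=1.0):
    T = len(u_nom)
    eps = sigma * rng.normal(size=(n_samples, T))
    costs = np.zeros(n_samples)
    for i in range(n_samples):
        x = x0
        for t in range(T):
            u = u_nom[t] + eps[i, t]
            costs[i] += cost(x, u)
            x = model(x, u)
    w = np.exp(-(costs - costs.min()) / lam)
    w /= w.sum()
    return u_nom + w @ eps              # softmin-weighted update of the plan

# Receding-horizon usage:
x, u_nom = np.array([1.0, 0.0]), np.zeros(20)
for _ in range(50):
    u_nom = mppi_step(x, u_nom)
    x = model(x, u_nom[0])              # apply the first input, shift the plan
    u_nom = np.append(u_nom[1:], 0.0)
```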