On the stability of Lipschitz continuous control problems and its application to reinforcement learning
- URL: http://arxiv.org/abs/2404.13316v1
- Date: Sat, 20 Apr 2024 08:21:25 GMT
- Title: On the stability of Lipschitz continuous control problems and its application to reinforcement learning
- Authors: Namkyeong Cho, Yeoneung Kim
- Abstract summary: We address the crucial yet underexplored stability properties of the Hamilton--Jacobi--Bellman (HJB) equation in model-free reinforcement learning contexts.
We bridge the gap between Lipschitz continuous optimal control problems and classical optimal control problems in the viscosity solutions framework.
- Score: 1.534667887016089
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We address the crucial yet underexplored stability properties of the Hamilton--Jacobi--Bellman (HJB) equation in model-free reinforcement learning contexts, specifically for Lipschitz continuous optimal control problems. We bridge the gap between Lipschitz continuous optimal control problems and classical optimal control problems in the viscosity solutions framework, offering new insights into the stability of the value function of Lipschitz continuous optimal control problems. By introducing structural assumptions on the dynamics and reward functions, we further study the rate of convergence of value functions. Moreover, we introduce a generalized framework for Lipschitz continuous control problems that incorporates the original problem and leverage it to propose a new HJB-based reinforcement learning algorithm. The stability properties and performance of the proposed method are tested with well-known benchmark examples in comparison with existing approaches.
Related papers
- Learning to Boost the Performance of Stable Nonlinear Systems [0.0]
We tackle the performance-boosting problem with closed-loop stability guarantees.
Our methods enable learning over arbitrarily deep neural network classes of performance-boosting controllers for stable nonlinear systems.
arXiv Detail & Related papers (2024-05-01T21:11:29Z) - On the Sample Complexity of Imitation Learning for Smoothed Model Predictive Control [27.609098229134]
We show how a smoothed expert can be designed for a general class of systems.
We prove a bound on the optimality gap of the analytic center associated with a convex Lipschitz function.
arXiv Detail & Related papers (2023-06-02T20:43:38Z) - KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed
Stability in Nonlinear Dynamical Systems [66.9461097311667]
We propose a model-based reinforcement learning framework with formal stability guarantees.
The proposed method learns the system dynamics up to a confidence interval using a feature representation.
We show that KCRL is guaranteed to learn a stabilizing policy in a finite number of interactions with the underlying unknown system.
arXiv Detail & Related papers (2022-06-03T17:27:04Z) - Stability Verification in Stochastic Control Systems via Neural Network
Supermartingales [17.558766911646263]
We present an approach for general nonlinear control problems with two novel aspects.
We use ranking supermartingales (RSMs) to certify a.s. asymptotic stability, and we present a method for learning neural network RSMs.
arXiv Detail & Related papers (2021-12-17T13:05:14Z) - Pointwise Feasibility of Gaussian Process-based Safety-Critical Control
under Model Uncertainty [77.18483084440182]
Control Barrier Functions (CBFs) and Control Lyapunov Functions (CLFs) are popular tools for enforcing safety and stability of a controlled system, respectively.
We present a Gaussian Process (GP)-based approach to tackle the problem of model uncertainty in safety-critical controllers that use CBFs and CLFs.
arXiv Detail & Related papers (2021-06-13T23:08:49Z) - Reinforcement learning for linear-convex models with jumps via stability
analysis of feedback controls [7.969435896173812]
We study finite-time horizon, continuous-time linear-convex reinforcement learning problems in an episodic setting.
In this problem, the unknown linear jump-diffusion process is controlled subject to nonsmooth convex costs.
arXiv Detail & Related papers (2021-04-19T13:50:52Z) - Closing the Closed-Loop Distribution Shift in Safe Imitation Learning [80.05727171757454]
We treat safe optimization-based control strategies as experts in an imitation learning problem.
We train a learned policy that can be cheaply evaluated at run-time and that provably satisfies the same safety guarantees as the expert.
arXiv Detail & Related papers (2021-02-18T05:11:41Z) - Gaussian Process-based Min-norm Stabilizing Controller for
Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that the resulting optimization problem is convex, and we call it the Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP).
arXiv Detail & Related papers (2020-11-14T01:27:32Z) - Fine-Grained Analysis of Stability and Generalization for Stochastic
Gradient Descent [55.85456985750134]
We introduce a new stability measure called on-average model stability, for which we develop novel bounds controlled by the risks of SGD iterates.
This yields generalization bounds depending on the behavior of the best model, and leads to the first-ever-known fast bounds in the low-noise setting.
To the best of our knowledge, this gives the first-ever-known stability and generalization bounds for SGD with even non-differentiable loss functions.
arXiv Detail & Related papers (2020-06-15T06:30:19Z) - Learning Control Barrier Functions from Expert Demonstrations [69.23675822701357]
We propose a learning-based approach to safe controller synthesis based on control barrier functions (CBFs).
We analyze an optimization-based approach to learning a CBF that enjoys provable safety guarantees under suitable Lipschitz assumptions on the underlying dynamical system.
To the best of our knowledge, these are the first results that learn provably safe control barrier functions from data.
arXiv Detail & Related papers (2020-04-07T12:29:06Z) - Regularity and stability of feedback relaxed controls [4.48579723067867]
This paper proposes a relaxed control regularization with general exploration rewards to design robust feedback controls.
We show that both the value function and the feedback control of the regularized control problem are Lipschitz stable with respect to parameter perturbations.
arXiv Detail & Related papers (2020-01-09T18:24:18Z)