Regularity and stability of feedback relaxed controls
- URL: http://arxiv.org/abs/2001.03148v2
- Date: Fri, 23 Jul 2021 14:33:58 GMT
- Title: Regularity and stability of feedback relaxed controls
- Authors: Christoph Reisinger, Yufei Zhang
- Abstract summary: This paper proposes a relaxed control regularization with general exploration rewards to design robust feedback controls.
We show that both the value function and the feedback control of the regularized control problem are Lipschitz stable with respect to parameter perturbations.
- Score: 4.48579723067867
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a relaxed control regularization with general exploration
rewards to design robust feedback controls for multi-dimensional
continuous-time stochastic exit time problems. We establish that the
regularized control problem admits a Hölder continuous feedback control,
and demonstrate that both the value function and the feedback control of the
regularized control problem are Lipschitz stable with respect to parameter
perturbations. Moreover, we show that a pre-computed feedback relaxed control
has a robust performance in a perturbed system, and derive a first-order
sensitivity equation for both the value function and optimal feedback relaxed
control. These stability results provide a theoretical justification for recent
reinforcement learning heuristics that including an exploration reward in the
optimization objective leads to more robust decision making. We finally prove
first-order monotone convergence of the value functions for relaxed control
problems with vanishing exploration parameters, which subsequently enables us
to construct the pure exploitation strategy of the original control problem
based on the feedback relaxed controls.
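For orientation, a schematic form of such an entropy-regularized relaxed exit time objective is sketched below; the notation is illustrative and not taken from the paper itself:
\[
  V^{\lambda}(x) \;=\; \sup_{\nu}\; \mathbb{E}\Big[ \int_{0}^{\tau} \Big( \int_{A} f(X_t, a)\, \nu_t(\mathrm{d}a) \;+\; \lambda\, \mathcal{E}(\nu_t) \Big)\, \mathrm{d}t \;+\; g(X_{\tau}) \Big],
\]
where \(\nu\) is a relaxed (measure-valued) feedback control over the action set \(A\), \(\tau\) is the exit time of the state process \(X\) from the domain, \(f\) and \(g\) are running and terminal rewards, \(\mathcal{E}\) is a general exploration reward (for instance the differential entropy of \(\nu_t\)), and \(\lambda > 0\) is the exploration parameter that is sent to zero to recover the pure exploitation strategy.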
Related papers
- On the stability of Lipschitz continuous control problems and its application to reinforcement learning [1.534667887016089]
We address the crucial yet underexplored stability properties of the Hamilton--Jacobi--Bellman (HJB) equation in model-free reinforcement learning contexts.
We bridge the gap between Lipschitz continuous optimal control problems and classical optimal control problems in the viscosity solutions framework.
arXiv Detail & Related papers (2024-04-20T08:21:25Z) - Growing Q-Networks: Solving Continuous Control Tasks with Adaptive Control Resolution [51.83951489847344]
In robotics applications, smooth control signals are commonly preferred to reduce system wear and energy consumption.
In this work, we aim to bridge this performance gap by growing discrete action spaces from coarse to fine control resolution.
Our work indicates that an adaptive control resolution in combination with value decomposition yields simple critic-only algorithms that achieve surprisingly strong performance on continuous control tasks.
arXiv Detail & Related papers (2024-04-05T17:58:37Z) - Robustness of Energy Landscape Controllers for Spin Rings under Coherent
Excitation Transport [0.0]
We examine the robustness, to uncertainty in system and control parameters, of controllers designed to optimize the fidelity of an excitation transfer.
We show that quantum systems optimized for coherent transport exhibit significantly different correlations between the error and the log-sensitivity, depending on whether the controller is optimized for readout at an exact time T or over a time-window about T.
arXiv Detail & Related papers (2023-03-01T00:16:00Z) - Improving the Performance of Robust Control through Event-Triggered
Learning [74.57758188038375]
We propose an event-triggered learning algorithm that decides when to learn in the face of uncertainty in the LQR problem.
We demonstrate improved performance over a robust controller baseline in a numerical example.
arXiv Detail & Related papers (2022-07-28T17:36:37Z) - Recurrent Neural Network Controllers Synthesis with Stability Guarantees
for Partially Observed Systems [6.234005265019845]
We consider the important class of recurrent neural networks (RNN) as dynamic controllers for nonlinear uncertain partially-observed systems.
We propose a projected policy gradient method that iteratively enforces the stability conditions in the reparametrized space.
Numerical experiments show that our method learns stabilizing controllers while using fewer samples and achieving higher final performance compared with policy gradient.
arXiv Detail & Related papers (2021-09-08T18:21:56Z) - Regret-optimal Estimation and Control [52.28457815067461]
We show that the regret-optimal estimator and regret-optimal controller can be derived in state-space form.
We propose regret-optimal analogs of Model-Predictive Control (MPC) and the Extended Kalman Filter (EKF) for systems with nonlinear dynamics.
arXiv Detail & Related papers (2021-06-22T23:14:21Z) - Closing the Closed-Loop Distribution Shift in Safe Imitation Learning [80.05727171757454]
We treat safe optimization-based control strategies as experts in an imitation learning problem.
We train a learned policy that can be cheaply evaluated at run-time and that provably satisfies the same safety guarantees as the expert.
arXiv Detail & Related papers (2021-02-18T05:11:41Z) - Gaussian Process-based Min-norm Stabilizing Controller for
Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that the resulting optimization problem is convex, and we call it the Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP).
arXiv Detail & Related papers (2020-11-14T01:27:32Z) - Certainty Equivalent Perception-Based Control [29.216967322052785]
We show a uniform error bound on nonparametric kernel regression under a dynamically-achievable dense sampling scheme.
This allows for a finite-time convergence rate on the sub-optimality of using the regressor in closed-loop for waypoint tracking.
arXiv Detail & Related papers (2020-08-27T18:45:40Z) - Adaptive Control and Regret Minimization in Linear Quadratic Gaussian
(LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)