Kullback-Leibler control for discrete-time nonlinear systems on
continuous spaces
- URL: http://arxiv.org/abs/2203.12864v1
- Date: Thu, 24 Mar 2022 06:03:42 GMT
- Title: Kullback-Leibler control for discrete-time nonlinear systems on
continuous spaces
- Authors: Kaito Ito, Kenji Kashima
- Abstract summary: Kullback-Leibler (KL) control enables efficient numerical methods for nonlinear optimal control problems.
We show that the reformulated KL control admits efficient numerical algorithms like the original one without unreasonable assumptions.
- Score: 0.24366811507669117
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Kullback-Leibler (KL) control enables efficient numerical methods for
nonlinear optimal control problems. The crucial assumption of KL control is the
full controllability of the transition distribution. However, this assumption
is often violated when the dynamics evolve on a continuous space.
Consequently, applying KL control to problems on continuous spaces requires
some approximation, which leads to a loss of optimality. To avoid such
approximation, in this paper we reformulate the KL control problem for
continuous spaces so that it does not require unrealistic assumptions. The key
difference between the original and reformulated KL control is that the former
measures the control effort by KL divergence between controlled and
uncontrolled transition distributions while the latter replaces the
uncontrolled transition by a noise-driven transition. We show that the
reformulated KL control admits efficient numerical algorithms like the original
one without unreasonable assumptions. Specifically, the associated value
function can be computed by using a Monte Carlo method based on its path
integral representation.
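The path integral representation mentioned in the abstract lends itself to a simple Monte Carlo estimator: the value function can be written as $V(x_0) = -\log \mathbb{E}[\exp(-\sum_t q(x_t))]$, where the expectation is over noise-driven rollouts. The sketch below is illustrative only (not the authors' code); the one-dimensional dynamics `f`, noise scale `sigma`, and stage cost `state_cost` are hypothetical placeholders.

```python
import numpy as np

def mc_value_estimate(x0, f, sigma, state_cost, horizon,
                      n_samples=10000, rng=None):
    """Monte Carlo estimate of V(x0) via its path integral representation:
    V(x0) = -log E[exp(-sum_t q(x_t))], with the expectation taken over
    noise-driven rollouts x_{t+1} = f(x_t) + sigma * w_t, w_t ~ N(0, 1).
    All of f, sigma, and state_cost are illustrative assumptions.
    """
    rng = np.random.default_rng(rng)
    total_cost = np.zeros(n_samples)
    x = np.full(n_samples, x0, dtype=float)
    for _ in range(horizon):
        # Propagate all samples through the noise-driven (uncontrolled) dynamics.
        x = f(x) + sigma * rng.standard_normal(n_samples)
        total_cost += state_cost(x)
    # Desirability z(x0) = E[exp(-accumulated cost)]; the value is V = -log z.
    z = np.mean(np.exp(-total_cost))
    return -np.log(z)

# Example usage with toy stable dynamics and a quadratic state cost:
v = mc_value_estimate(1.0, lambda x: 0.9 * x, 0.1, lambda x: x**2, horizon=10)
```

Because the estimator only requires sampling the noise-driven transitions, it avoids the full-controllability assumption on the transition distribution that the original KL formulation needs.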
Related papers
- Well-Posed KL-Regularized Control via Wasserstein and Kalman-Wasserstein KL Divergences [0.0]
Kullback-Leibler divergence (KL) regularization is widely used in reinforcement learning, but it becomes infinite under support mismatch and can degenerate in low-noise limits. We introduce (Kalman-)Wasserstein-based KL analogues by replacing the Fisher-Rao geometry in the dynamical formulation of the KL with transport-based geometry. We demonstrate the utility of these divergences in KL-regularized optimal control.
arXiv Detail & Related papers (2026-02-02T15:57:32Z) - Unifying Entropy Regularization in Optimal Control: From and Back to Classical Objectives via Iterated Soft Policies and Path Integral Solutions [4.934817254755008]
This paper develops a unified perspective on several optimal control formulations through the lens of Kullback-Leibler regularization. We propose a central problem that separates the KL penalties on policies and transitions, assigning them independent weights. We show that these soft-policy formulations majorize the original SOC and RSOC problems, which means the regularized solution can be iterated to retrieve the original solution.
arXiv Detail & Related papers (2025-12-05T19:31:39Z) - Verifying Closed-Loop Contractivity of Learning-Based Controllers via Partitioning [52.23804865017831]
We address the problem of verifying closed-loop contraction in nonlinear control systems whose controller and contraction metric are both parameterized by neural networks. We derive a tractable and scalable sufficient condition for closed-loop contractivity that reduces to checking that the dominant eigenvalue of a symmetric Metzler matrix is nonpositive.
arXiv Detail & Related papers (2025-12-01T23:06:56Z) - Neural Port-Hamiltonian Models for Nonlinear Distributed Control: An Unconstrained Parametrization Approach [0.0]
Neural Networks (NNs) can be leveraged to parametrize control policies that yield good performance.
NNs' sensitivity to small input changes poses a risk of destabilizing the closed-loop system.
To address these problems, we leverage the framework of port-Hamiltonian systems to design continuous-time distributed control policies.
The effectiveness of the proposed distributed controllers is demonstrated through consensus control of non-holonomic mobile robots.
arXiv Detail & Related papers (2024-11-15T10:44:29Z) - Thompson Sampling Achieves $\tilde O(\sqrt{T})$ Regret in Linear
Quadratic Control [85.22735611954694]
We study the problem of adaptive control of stabilizable linear-quadratic regulators (LQRs) using Thompson Sampling (TS).
We propose an efficient TS algorithm for the adaptive control of LQRs, TSAC, that attains $\tilde{O}(\sqrt{T})$ regret, even for multidimensional systems.
arXiv Detail & Related papers (2022-06-17T02:47:53Z) - On optimization of coherent and incoherent controls for two-level
quantum systems [77.34726150561087]
This article considers some control problems for closed and open two-level quantum systems.
The closed system's dynamics is governed by the Schrödinger equation with coherent control.
The open system's dynamics is governed by the Gorini-Kossakowski-Sudarshan-Lindblad master equation.
arXiv Detail & Related papers (2022-05-05T09:08:03Z) - Correct-by-construction reach-avoid control of partially observable
linear stochastic systems [7.912008109232803]
We formalize a robust feedback controller for reach-avoid control of discrete-time, linear time-invariant systems.
The problem is to compute a controller that satisfies the given reach-avoid specification.
arXiv Detail & Related papers (2021-03-03T13:46:52Z) - Improper Learning with Gradient-based Policy Optimization [62.50997487685586]
We consider an improper reinforcement learning setting where the learner is given M base controllers for an unknown Markov Decision Process.
We propose a gradient-based approach that operates over a class of improper mixtures of the controllers.
arXiv Detail & Related papers (2021-02-16T14:53:55Z) - Policy Analysis using Synthetic Controls in Continuous-Time [101.35070661471124]
Counterfactual estimation using synthetic controls is one of the most successful recent methodological developments in causal inference.
We propose a continuous-time alternative that models the latent counterfactual path explicitly using the formalism of controlled differential equations.
arXiv Detail & Related papers (2021-02-02T16:07:39Z) - Gaussian Process-based Min-norm Stabilizing Controller for
Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that this resulting optimization problem is convex, and we call it the Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP).
arXiv Detail & Related papers (2020-11-14T01:27:32Z) - Gradient Flows for Regularized Stochastic Control Problems [7.801972633035922]
We study control problems with the action space taken to be probability measures with the objective penalised by the relative entropy.
We identify suitable metric space on which we construct a gradient flow for the measure-valued control process.
arXiv Detail & Related papers (2020-06-10T17:07:36Z) - Adaptive Control and Regret Minimization in Linear Quadratic Gaussian
(LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z) - A homotopy approach to coherent quantum LQG control synthesis using
discounted performance criteria [2.0508733018954843]
This paper is concerned with linear-quadratic-Gaussian (LQG) control for a field-mediated feedback connection of a plant and a coherent (measurement-free) controller.
The control objective is to make the closed-loop system internally stable and to minimize the infinite-horizon cost involving the plant variables.
arXiv Detail & Related papers (2020-02-06T18:52:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.