On the Optimization Landscape of Dynamic Output Feedback: A Case Study
for Linear Quadratic Regulator
- URL: http://arxiv.org/abs/2209.05042v3
- Date: Sun, 29 Oct 2023 14:09:22 GMT
- Title: On the Optimization Landscape of Dynamic Output Feedback: A Case Study
for Linear Quadratic Regulator
- Authors: Jingliang Duan, Wenhan Cao, Yang Zheng, Lin Zhao
- Abstract summary: We show how the dLQR cost varies with the coordinate transformation of the dynamic controller and then derive the optimal transformation for a given observable stabilizing controller.
These results shed light on designing efficient algorithms for general decision-making problems with partially observed information.
- Score: 12.255864026960403
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The convergence of policy gradient algorithms in reinforcement learning
hinges on the optimization landscape of the underlying optimal control problem.
Theoretical insights into these algorithms can often be acquired from analyzing
those of linear quadratic control. However, most of the existing literature
only considers the optimization landscape for static full-state or output
feedback policies (controllers). We investigate the more challenging case of
dynamic output-feedback policies for linear quadratic regulation (abbreviated
as dLQR), which is prevalent in practice but has a rather complicated
optimization landscape. We first show how the dLQR cost varies with the
coordinate transformation of the dynamic controller and then derive the optimal
transformation for a given observable stabilizing controller. At the core of
our results is the uniqueness of the stationary point of dLQR when it is
observable, which takes the concise form of an observer-based controller with the
optimal similarity transformation. These results shed light on designing
efficient algorithms for general decision-making problems with partially
observed information.
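For orientation, here is a generic sketch of the dynamic output-feedback parameterization the abstract refers to (our notation and assumptions, not necessarily the paper's exact formulation):

```latex
% Plant with partial observations and a dynamic output-feedback controller
% with internal state \xi_t (generic notation):
\[
  x_{t+1} = A x_t + B u_t, \qquad y_t = C x_t,
\]
\[
  \xi_{t+1} = A_K \xi_t + B_K y_t, \qquad u_t = C_K \xi_t.
\]
% dLQR cost over a random initial condition:
\[
  J(A_K, B_K, C_K) = \mathbb{E}\Big[\sum_{t=0}^{\infty}
      \big( x_t^\top Q\, x_t + u_t^\top R\, u_t \big)\Big].
\]
% Coordinate (similarity) transformation of the controller state:
\[
  (A_K, B_K, C_K) \longmapsto \big(T A_K T^{-1},\; T B_K,\; C_K T^{-1}\big),
  \qquad T \ \text{invertible}.
\]
% The transformation preserves the controller's input-output map, but with a
% fixed distribution of the initial controller state the cost J generally
% depends on T; the paper analyzes this dependence and derives the optimal T.
```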
Related papers
- A novel algorithm for optimizing bundle adjustment in image sequence alignment [6.322876598831792]
This paper introduces a novel algorithm for optimizing the Bundle Adjustment (BA) model in the context of image sequence alignment for cryo-electron tomography.
Extensive experiments on both synthetic and real-world datasets were conducted to evaluate the algorithm's performance.
arXiv Detail & Related papers (2024-11-10T03:19:33Z) - Optimal DLT-based Solutions for the Perspective-n-Point [0.0]
We propose a modified direct linear transform (DLT) algorithm for solving the perspective-n-point (PnP) problem.
The modification consists of analytically weighting the different measurements in the linear system, with a negligible increase in computational load.
Our approach yields clear improvements in both performance and runtime.
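As a rough illustration of what a (weighted) DLT solution looks like, here is a minimal numpy sketch of our own simplified construction, not the paper's analytically derived weighting: each 2D-3D correspondence contributes two rows to a homogeneous linear system, optional per-measurement weights scale those rows, and the projection matrix is the null-space direction of the stacked system.

```python
import numpy as np

def dlt_pnp(X, uv, w=None):
    """Estimate a 3x4 projection matrix from n 2D-3D correspondences via DLT.

    X: (n, 3) 3D points; uv: (n, 2) image points; w: optional (n,) weights
    (here user-supplied scalars; the paper derives its weights analytically).
    """
    n = X.shape[0]
    Xh = np.hstack([X, np.ones((n, 1))])       # homogeneous 3D points
    A = np.zeros((2 * n, 12))
    for i in range(n):
        u, v = uv[i]
        A[2 * i, 0:4] = Xh[i]
        A[2 * i, 8:12] = -u * Xh[i]
        A[2 * i + 1, 4:8] = Xh[i]
        A[2 * i + 1, 8:12] = -v * Xh[i]
        if w is not None:                      # weight both rows of this measurement
            A[2 * i:2 * i + 2] *= w[i]
    _, _, Vt = np.linalg.svd(A)                # null-space direction = smallest singular vector
    return Vt[-1].reshape(3, 4)

# Quick synthetic check: recover a known camera up to scale from exact projections.
rng = np.random.default_rng(0)
P_true = np.hstack([np.eye(3), np.array([[0.1], [-0.2], [5.0]])])   # toy camera
X = rng.uniform(-1.0, 1.0, size=(20, 3))
xh = np.hstack([X, np.ones((20, 1))]) @ P_true.T
uv = xh[:, :2] / xh[:, 2:3]
P_est = dlt_pnp(X, uv)
P_est *= np.sign(P_est.ravel() @ P_true.ravel())                    # fix the sign of the scale
print(np.allclose(P_est / np.linalg.norm(P_est), P_true / np.linalg.norm(P_true)))
```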
arXiv Detail & Related papers (2024-10-18T04:04:58Z) - Stable Inverse Reinforcement Learning: Policies from Control Lyapunov Landscapes [4.229902091180109]
We propose a novel, stability-certified IRL approach to learning control Lyapunov functions from demonstration data.
By exploiting closed-form expressions for associated control policies, we are able to efficiently search the space of CLFs.
We present a theoretical analysis of the optimality properties provided by the CLF and evaluate our approach using both simulated and real-world data.
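This is not the paper's IRL procedure, but as a reminder of what a quadratic control Lyapunov function certifies, here is a tiny numpy/scipy sketch with hypothetical system matrices: V(x) = x'Px obtained from a discrete Lyapunov equation decreases along closed-loop trajectories.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Toy closed loop x_{t+1} = (A - B K) x_t with an assumed stabilizing gain K.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
K = np.array([[5.0, 8.0]])
Acl = A - B @ K

# Quadratic Lyapunov candidate V(x) = x' P x solving Acl' P Acl - P = -I.
P = solve_discrete_lyapunov(Acl.T, np.eye(2))

# Verify the decrease condition V(x_{t+1}) < V(x_t) along a rollout.
x = np.array([1.0, -1.0])
for _ in range(20):
    x_next = Acl @ x
    assert x_next @ P @ x_next < x @ P @ x
    x = x_next
print("V decreases along the closed-loop trajectory")
```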
arXiv Detail & Related papers (2024-05-14T16:40:45Z) - Analyzing and Enhancing the Backward-Pass Convergence of Unrolled
Optimization [50.38518771642365]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
A central challenge in this setting is backpropagation through the solution of an optimization problem, which often lacks a closed form.
This paper provides theoretical insights into the backward pass of unrolled optimization, showing that it is equivalent to the solution of a linear system by a particular iterative method.
A system called Folded Optimization is proposed to construct more efficient backpropagation rules from unrolled solver implementations.
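A toy numpy illustration of this linear-system view of the backward pass (our own quadratic example, not the proposed Folded Optimization system): unrolling T gradient steps and accumulating the Jacobian of the iterate with respect to the problem parameter reproduces a truncated Neumann series, i.e. an iterative solution of a linear system.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3))
Q = M @ M.T + 3.0 * np.eye(3)            # symmetric positive definite
b = rng.standard_normal(3)
alpha = 1.0 / np.linalg.norm(Q, 2)       # step size 1/L

# Forward pass: unroll T gradient steps on f(x; b) = 0.5 x'Qx - b'x.
# Backward pass: propagate J = dx_k/db through the same unrolled iterations.
T = 200
x = np.zeros(3)
J = np.zeros((3, 3))
for _ in range(T):
    x = x - alpha * (Q @ x - b)                      # x_{k+1} = (I - aQ) x_k + a b
    J = (np.eye(3) - alpha * Q) @ J + alpha * np.eye(3)

# The accumulated Jacobian is a truncated Neumann series for Q^{-1}: the
# backward pass amounts to solving the linear system Q J = I by fixed-point iteration.
print(np.allclose(J, np.linalg.inv(Q)))
print(np.allclose(x, np.linalg.solve(Q, b)))
```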
arXiv Detail & Related papers (2023-12-28T23:15:18Z) - Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver.
This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
arXiv Detail & Related papers (2023-01-28T01:50:42Z) - Towards a Theoretical Foundation of Policy Optimization for Learning
Control Policies [26.04704565406123]
Gradient-based methods have been widely used for system design and optimization in diverse application domains.
Recently, there has been a renewed interest in studying theoretical properties of these methods in the context of control and reinforcement learning.
This article surveys some of the recent developments on policy optimization, a gradient-based iterative approach for feedback control synthesis.
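Since both this survey and the paper above concern policy gradient on LQR-type landscapes, here is a minimal numpy/scipy sketch of exact policy gradient for the simpler static state-feedback LQR case (the standard discrete-time model-based gradient with hypothetical system matrices, not the dLQR output-feedback setting):

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Toy LQR instance.
A = np.array([[1.0, 0.2], [0.0, 1.0]])
B = np.array([[0.0], [0.2]])
Q, R = np.eye(2), np.eye(1)
Sigma0 = np.eye(2)                       # covariance of the random initial state

def lqr_cost_and_grad(K):
    """Cost J(K) = tr(P_K Sigma0) and its exact gradient for u_t = -K x_t."""
    Acl = A - B @ K
    P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)   # closed-loop value matrix
    Sigma = solve_discrete_lyapunov(Acl, Sigma0)          # state correlation matrix
    grad = 2.0 * ((R + B.T @ P @ B) @ K - B.T @ P @ A) @ Sigma
    return np.trace(P @ Sigma0), grad

# Plain gradient descent on the feedback gain, starting from a stabilizing K.
K = np.array([[1.0, 1.0]])
J0, _ = lqr_cost_and_grad(K)
for _ in range(500):
    J, g = lqr_cost_and_grad(K)
    K = K - 1e-3 * g
print("cost: %.3f -> %.3f" % (J0, J))
```

With a sufficiently small step the iterates remain stabilizing and the cost decreases; the gradient-dominance properties that justify this are among the landscape results the survey covers.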
arXiv Detail & Related papers (2022-10-10T16:13:34Z) - Exploring the Algorithm-Dependent Generalization of AUPRC Optimization
with List Stability [107.65337427333064]
Optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning.
In this work, we present the first trial in the algorithm-dependent generalization of AUPRC optimization.
Experiments on three image retrieval datasets speak to the effectiveness and soundness of our framework.
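For reference, AUPRC is commonly estimated by average precision over a ranked list; a minimal generic numpy version (unrelated to the paper's list-stability analysis):

```python
import numpy as np

def average_precision(scores, labels):
    """AP = mean of precision@k taken at the ranks k of the positive items."""
    order = np.argsort(-np.asarray(scores))        # rank by descending score
    labels = np.asarray(labels)[order]
    precision_at_k = np.cumsum(labels) / np.arange(1, len(labels) + 1)
    return precision_at_k[labels == 1].mean()

print(average_precision([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))   # (1 + 2/3) / 2 = 0.8333...
```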
arXiv Detail & Related papers (2022-09-27T09:06:37Z) - Learning Stochastic Optimal Policies via Gradient Descent [17.9807134122734]
We systematically develop a learning-based treatment of stochastic optimal control (SOC).
We propose a derivation of adjoint sensitivity results for stochastic differential equations through direct application of variational calculus.
We verify the performance of the proposed approach on a continuous-time, finite horizon portfolio optimization with proportional transaction costs.
arXiv Detail & Related papers (2021-06-07T16:43:07Z) - Pushing the Envelope of Rotation Averaging for Visual SLAM [69.7375052440794]
We propose a novel optimization backbone for visual SLAM systems.
We leverage rotation averaging to improve the accuracy, efficiency and robustness of conventional monocular SLAM systems.
Our approach can be up to 10x faster with comparable accuracy against the state of the art on public benchmarks.
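As a self-contained illustration of the rotation-averaging primitive (a single chordal L2 mean, not the paper's SLAM optimization backbone):

```python
import numpy as np

def chordal_mean(rotations):
    """Chordal L2 mean: average the matrices, then project back onto SO(3)."""
    U, _, Vt = np.linalg.svd(np.mean(rotations, axis=0))
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])   # enforce det = +1
    return U @ D @ Vt

def rot_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Averaging rotations spread symmetrically around 0.1 rad recovers rot_z(0.1).
Rs = np.stack([rot_z(0.1 + d) for d in (-0.05, 0.0, 0.05)])
print(np.allclose(chordal_mean(Rs), rot_z(0.1)))
```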
arXiv Detail & Related papers (2020-11-02T18:02:26Z) - Direct Optimal Control Approach to Laser-Driven Quantum Particle
Dynamics [77.34726150561087]
We propose direct optimal control as a robust and flexible alternative to indirect control theory.
The method is illustrated for the case of laser-driven wavepacket dynamics in a bistable potential.
arXiv Detail & Related papers (2020-10-08T07:59:29Z) - A Primer on Zeroth-Order Optimization in Signal Processing and Machine
Learning [95.85269649177336]
ZO optimization iteratively performs three major steps: gradient estimation, descent direction computation, and solution update.
We demonstrate promising applications of ZO optimization, such as evaluating and generating explanations from black-box deep learning models, and efficient online sensor management.
arXiv Detail & Related papers (2020-06-11T06:50:35Z)
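The three steps above in a minimal two-point zeroth-order gradient-descent sketch (a generic textbook estimator on a toy smooth objective, not tied to the applications listed in the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def zo_gradient(f, x, mu=1e-4, n_samples=20):
    """Two-point zeroth-order gradient estimate of f at x with Gaussian directions."""
    g = np.zeros_like(x)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
    return g / n_samples

# Black-box objective: only function values are available, no gradients.
f = lambda x: np.sum((x - 3.0) ** 2)

x = np.zeros(5)
for _ in range(300):
    g = zo_gradient(f, x)      # 1) gradient estimation
    d = -g                     # 2) descent direction
    x = x + 0.05 * d           # 3) solution update
print(np.round(x, 2))          # each coordinate approaches 3
```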