End-to-End Learning Framework for Solving Non-Markovian Optimal Control
- URL: http://arxiv.org/abs/2502.04649v3
- Date: Fri, 14 Feb 2025 22:18:11 GMT
- Title: End-to-End Learning Framework for Solving Non-Markovian Optimal Control
- Authors: Xiaole Zhang, Peiyu Zhang, Xiongye Xiao, Shixuan Li, Vasileios Tzoumas, Vijay Gupta, Paul Bogdan,
- Abstract summary: We propose an innovative system identification method control strategy for FOLTI systems.
We also develop the first end-to-end data-driven learning framework, Fractional-Order Learning for Optimal Control (FOLOC)
- Score: 9.156265463755807
- License:
- Abstract: Integer-order calculus often falls short in capturing the long-range dependencies and memory effects found in many real-world processes. Fractional calculus addresses these gaps via fractional-order integrals and derivatives, but fractional-order dynamical systems pose substantial challenges in system identification and optimal control due to the lack of standard control methodologies. In this paper, we theoretically derive the optimal control via linear quadratic regulator (LQR) for fractional-order linear time-invariant (FOLTI) systems and develop an end-to-end deep learning framework based on this theoretical foundation. Our approach establishes a rigorous mathematical model, derives analytical solutions, and incorporates deep learning to achieve data-driven optimal control of FOLTI systems. Our key contributions include: (i) proposing an innovative system identification method control strategy for FOLTI systems, (ii) developing the first end-to-end data-driven learning framework, Fractional-Order Learning for Optimal Control (FOLOC), that learns control policies from observed trajectories, and (iii) deriving a theoretical analysis of sample complexity to quantify the number of samples required for accurate optimal control in complex real-world problems. Experimental results indicate that our method accurately approximates fractional-order system behaviors without relying on Gaussian noise assumptions, pointing to promising avenues for advanced optimal control.
Related papers
- Learning Controlled Stochastic Differential Equations [61.82896036131116]
This work proposes a novel method for estimating both drift and diffusion coefficients of continuous, multidimensional, nonlinear controlled differential equations with non-uniform diffusion.
We provide strong theoretical guarantees, including finite-sample bounds for (L2), (Linfty), and risk metrics, with learning rates adaptive to coefficients' regularity.
Our method is available as an open-source Python library.
arXiv Detail & Related papers (2024-11-04T11:09:58Z) - Active Learning for Control-Oriented Identification of Nonlinear Systems [26.231260751633307]
We present the first finite sample analysis of an active learning algorithm suitable for a general class of nonlinear dynamics.
In certain settings, the excess control cost of our algorithm achieves the optimal rate, up to logarithmic factors.
We validate our approach in simulation, showcasing the advantage of active, control-oriented exploration for controlling nonlinear systems.
arXiv Detail & Related papers (2024-04-13T15:40:39Z) - Optimal Exploration for Model-Based RL in Nonlinear Systems [14.540210895533937]
Learning to control unknown nonlinear dynamical systems is a fundamental problem in reinforcement learning and control theory.
We develop an algorithm able to efficiently explore the system to reduce uncertainty in a task-dependent metric.
Our algorithm relies on a general reduction from policy optimization to optimal experiment design in arbitrary systems, and may be of independent interest.
arXiv Detail & Related papers (2023-06-15T15:47:50Z) - Towards a Theoretical Foundation of Policy Optimization for Learning
Control Policies [26.04704565406123]
Gradient-based methods have been widely used for system design and optimization in diverse application domains.
Recently, there has been a renewed interest in studying theoretical properties of these methods in the context of control and reinforcement learning.
This article surveys some of the recent developments on policy optimization, a gradient-based iterative approach for feedback control synthesis.
arXiv Detail & Related papers (2022-10-10T16:13:34Z) - Sparsity in Partially Controllable Linear Systems [56.142264865866636]
We study partially controllable linear dynamical systems specified by an underlying sparsity pattern.
Our results characterize those state variables which are irrelevant for optimal control.
arXiv Detail & Related papers (2021-10-12T16:41:47Z) - Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z) - Gaussian Process-based Min-norm Stabilizing Controller for
Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that this resulting optimization problem is convex, and we call it Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP)
arXiv Detail & Related papers (2020-11-14T01:27:32Z) - Combining Deep Learning and Optimization for Security-Constrained
Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z) - Single-step deep reinforcement learning for open-loop control of laminar
and turbulent flows [0.0]
This research gauges the ability of deep reinforcement learning (DRL) techniques to assist the optimization and control of fluid mechanical systems.
It combines a novel, "degenerate" version of the prototypical policy optimization (PPO) algorithm, that trains a neural network in optimizing the system only once per learning episode.
arXiv Detail & Related papers (2020-06-04T16:11:26Z) - Adaptive Control and Regret Minimization in Linear Quadratic Gaussian
(LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.