Data-driven learning of feedback maps for explicit robust predictive control: an approximation theoretic view
- URL: http://arxiv.org/abs/2510.13522v1
- Date: Wed, 15 Oct 2025 13:14:14 GMT
- Title: Data-driven learning of feedback maps for explicit robust predictive control: an approximation theoretic view
- Authors: Siddhartha Ganguly, Shubham Gupta, Debasish Chatterjee,
- Abstract summary: We establish an algorithm to learn feedback maps from data for a class of robust model predictive control (MPC) problems.<n>We employ a couple of approximation schemes that furnish tight approximations within preassigned uniform error bounds on the admissible state space to learn the unknown feedback policy.
- Score: 15.111522780173777
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We establish an algorithm to learn feedback maps from data for a class of robust model predictive control (MPC) problems. The algorithm accounts for the approximation errors due to the learning directly at the synthesis stage, ensuring recursive feasibility by construction. The optimal control problem consists of a linear noisy dynamical system, a quadratic stage and quadratic terminal costs as the objective, and convex constraints on the state, control, and disturbance sequences; the control minimizes and the disturbance maximizes the objective. We proceed via two steps -- (a) Data generation: First, we reformulate the given minmax problem into a convex semi-infinite program and employ recently developed tools to solve it in an exact fashion on grid points of the state space to generate (state, action) data. (b) Learning approximate feedback maps: We employ a couple of approximation schemes that furnish tight approximations within preassigned uniform error bounds on the admissible state space to learn the unknown feedback policy. The stability of the closed-loop system under the approximate feedback policies is also guaranteed under a standard set of hypotheses. Two benchmark numerical examples are provided to illustrate the results.
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z) - Trust-Region Sequential Quadratic Programming for Stochastic Optimization with Random Models [57.52124921268249]
We propose a Trust Sequential Quadratic Programming method to find both first and second-order stationary points.
To converge to first-order stationary points, our method computes a gradient step in each iteration defined by minimizing a approximation of the objective subject.
To converge to second-order stationary points, our method additionally computes an eigen step to explore the negative curvature the reduced Hessian matrix.
arXiv Detail & Related papers (2024-09-24T04:39:47Z) - Full error analysis of policy gradient learning algorithms for exploratory linear quadratic mean-field control problem in continuous time with common noise [0.0]
We study policy gradient (PG) learning and first demonstrate convergence in a model-based setting.
We prove the global linear convergence and sample complexity of the PG algorithm with two-point gradient estimates in a model-free setting.
In this setting, the parameterized optimal policies are learned from samples of the states and population distribution.
arXiv Detail & Related papers (2024-08-05T14:11:51Z) - Double Duality: Variational Primal-Dual Policy Optimization for
Constrained Reinforcement Learning [132.7040981721302]
We study the Constrained Convex Decision Process (MDP), where the goal is to minimize a convex functional of the visitation measure.
Design algorithms for a constrained convex MDP faces several challenges, including handling the large state space.
arXiv Detail & Related papers (2024-02-16T16:35:18Z) - Deep Subspace Encoders for Nonlinear System Identification [0.0]
We propose a method which uses a truncated prediction loss and a subspace encoder for state estimation.
We show that, under mild conditions, the proposed method is locally consistent, increases optimization stability, and achieves increased data efficiency.
arXiv Detail & Related papers (2022-10-26T16:04:38Z) - FORESEE: Prediction with Expansion-Compression Unscented Transform for
Online Policy Optimization [8.97438370260135]
We introduce a method for state prediction, called the Expansion-Compression Unscented Transform, to solve a class of online policy optimization problems.
Our proposed algorithm propagates a finite number of sigma points through a state-dependent distribution, which dictates an increase in the number of sigma points at each time step.
Its performance is empirically shown to be comparable to Monte Carlo but at a much lower computational cost.
arXiv Detail & Related papers (2022-09-26T12:47:08Z) - Log Barriers for Safe Black-box Optimization with Application to Safe
Reinforcement Learning [72.97229770329214]
We introduce a general approach for seeking high dimensional non-linear optimization problems in which maintaining safety during learning is crucial.
Our approach called LBSGD is based on applying a logarithmic barrier approximation with a carefully chosen step size.
We demonstrate the effectiveness of our approach on minimizing violation in policy tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z) - Informative Planning for Worst-Case Error Minimisation in Sparse
Gaussian Process Regression [28.50144453974764]
We derive a universal worst-case error bound for sparse GP regression with bounded noise.
We show that the worst-case error minimisation can be achieved by solving a posterior entropy minimisation problem.
arXiv Detail & Related papers (2022-03-08T03:26:08Z) - Robust and Adaptive Temporal-Difference Learning Using An Ensemble of
Gaussian Processes [70.80716221080118]
The paper takes a generative perspective on policy evaluation via temporal-difference (TD) learning.
The OS-GPTD approach is developed to estimate the value function for a given policy by observing a sequence of state-reward pairs.
To alleviate the limited expressiveness associated with a single fixed kernel, a weighted ensemble (E) of GP priors is employed to yield an alternative scheme.
arXiv Detail & Related papers (2021-12-01T23:15:09Z) - Learning Robust Output Control Barrier Functions from Safe Expert Demonstrations [50.37808220291108]
This paper addresses learning safe output feedback control laws from partial observations of expert demonstrations.
We first propose robust output control barrier functions (ROCBFs) as a means to guarantee safety.
We then formulate an optimization problem to learn ROCBFs from expert demonstrations that exhibit safe system behavior.
arXiv Detail & Related papers (2021-11-18T23:21:00Z) - Primal-dual Learning for the Model-free Risk-constrained Linear
Quadratic Regulator [0.8629912408966145]
Risk-aware control, though with promise to tackle unexpected events, requires a known exact dynamical model.
We propose a model framework to learn a risk-aware controller with a focus on the linear system.
arXiv Detail & Related papers (2020-11-22T04:40:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.