A Distributionally Robust Approach to Regret Optimal Control using the
Wasserstein Distance
- URL: http://arxiv.org/abs/2304.06783v2
- Date: Wed, 16 Aug 2023 15:38:17 GMT
- Title: A Distributionally Robust Approach to Regret Optimal Control using the
Wasserstein Distance
- Authors: Feras Al Taha, Shuhao Yan, Eilyan Bitar
- Abstract summary: Strictly causal linear disturbance feedback controllers are designed to minimize the worst-case expected regret.
We derive a reformulation of the minimax regret optimal control problem as a tractable semidefinite program.
We compare the minimax regret optimal control design method with the distributionally robust optimal control approach.
- Score: 1.8876415010297893
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper proposes a distributionally robust approach to regret optimal
control of discrete-time linear dynamical systems with quadratic costs subject
to a stochastic additive disturbance on the state process. The underlying
probability distribution of the disturbance process is unknown, but assumed to
lie in a given ball of distributions defined in terms of the type-2 Wasserstein
distance. In this framework, strictly causal linear disturbance feedback
controllers are designed to minimize the worst-case expected regret. The regret
incurred by a controller is defined as the difference between the cost it
incurs in response to a realization of the disturbance process and the cost
incurred by the optimal noncausal controller which has perfect knowledge of the
disturbance process realization at the outset. Building on a well-established
duality theory for optimal transport problems, we derive a reformulation of the
minimax regret optimal control problem as a tractable semidefinite program.
Using the equivalent dual reformulation, we characterize a worst-case
distribution achieving the worst-case expected regret in relation to the
distribution at the center of the Wasserstein ball. We compare the minimax
regret optimal control design method with the distributionally robust optimal
control approach using an illustrative example and numerical experiments.
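To make the criterion concrete: under the abstract's setup, the minimax regret problem can be written as below. The notation is illustrative (the symbols for the controller class, cost, and ball radius are ours, not fixed by the abstract).

```latex
% Minimax expected regret over a type-2 Wasserstein ball (illustrative notation).
% \Pi: strictly causal linear disturbance feedback controllers,
% \hat{\mathbb{P}}: nominal distribution at the center of the ball,
% \varepsilon: ball radius, J(\pi, w): quadratic cost under controller \pi
% and disturbance realization w, \pi^\star(w): optimal noncausal controller
% with perfect knowledge of w at the outset.
\min_{\pi \in \Pi} \;
\sup_{\mathbb{P} \in \mathcal{B}^{W_2}_{\varepsilon}(\hat{\mathbb{P}})} \;
\mathbb{E}_{w \sim \mathbb{P}}\!\left[\, J(\pi, w) - J(\pi^\star(w), w) \,\right]
```

For intuition about the ambiguity set, the type-2 Wasserstein distance between two Gaussians has a closed form. The sketch below evaluates it with NumPy/SciPy; it is a generic illustration of the metric, not code from the paper.

```python
# Type-2 Wasserstein distance between N(m1, S1) and N(m2, S2), using
# W2^2 = ||m1 - m2||^2 + tr(S1 + S2 - 2 (S2^{1/2} S1 S2^{1/2})^{1/2}).
import numpy as np
from scipy.linalg import sqrtm

def wasserstein2_gaussian(m1, S1, m2, S2):
    root_S2 = sqrtm(S2)
    cross = np.real(sqrtm(root_S2 @ S1 @ root_S2))  # discard numerical imaginary dust
    w2_sq = np.sum((m1 - m2) ** 2) + np.trace(S1 + S2 - 2.0 * cross)
    return float(np.sqrt(max(w2_sq, 0.0)))

# Example: nominal (center) disturbance distribution vs. a shifted, inflated one.
m_hat, S_hat = np.zeros(2), np.eye(2)
m_alt, S_alt = np.array([0.5, 0.0]), 1.5 * np.eye(2)
print(wasserstein2_gaussian(m_hat, S_hat, m_alt, S_alt))
```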
Related papers
- Duality and Policy Evaluation in Distributionally Robust Bayesian Diffusion Control [8.863520091178335]
We consider a diffusion control problem of expected terminal utility.
The controller imposes a prior distribution on the unknown drift of an underlying diffusion.
In practice, the prior will generally be incorrectly specified, and the degree of model misspecification can have a significant impact on policy performance.
We introduce a distributionally robust Bayesian control (DRBC) formulation in which the controller plays a game against an adversary who selects a prior in a divergence neighborhood of a baseline prior; a sketch of this game is given below.
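A hedged sketch of the prior-ambiguity game just described, in illustrative notation (the divergence D and radius \delta are placeholders; the summary does not pin down a specific divergence):

```latex
% Controller \pi plays against an adversary choosing the prior \mu
% within a divergence ball around the baseline prior \mu_0;
% U(X_T) denotes the terminal utility.
\sup_{\pi} \; \inf_{\mu \,:\, D(\mu \| \mu_0) \le \delta} \;
\mathbb{E}^{\mu, \pi}\!\left[\, U(X_T) \,\right]
```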
arXiv Detail & Related papers (2025-06-24T03:58:49Z) - Constrained Reinforcement Learning using Distributional Representation for Trustworthy Quadrotor UAV Tracking Control [2.325021848829375]
We propose a novel trajectory tracker integrating a Distributional Reinforcement Learning disturbance estimator for unknown aerodynamic effects.
The proposed estimator, the 'Constrained Distributional Reinforced disturbance estimator' (ConsDRED), accurately identifies uncertainties between the true and estimated values of aerodynamic effects.
We demonstrate that our system reduces cumulative tracking errors by at least 70% compared with recent state-of-the-art methods.
arXiv Detail & Related papers (2023-02-22T23:15:56Z) - Stochastic optimal well control in subsurface reservoirs using
reinforcement learning [0.0]
We present a case study of a model-free reinforcement learning framework for solving stochastic optimal control under a predefined parameter uncertainty distribution.
In principle, RL algorithms are capable of learning optimal action policies to maximize a numerical reward signal.
We present numerical results using two state-of-the-art RL algorithms, proximal policy optimization (PPO) and advantage actor-critic (A2C), on two subsurface flow test cases; a minimal training sketch follows below.
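As a hedged illustration of such a training run (not the authors' code; the paper's subsurface flow environments are not specified in the summary, so a standard control task stands in for them), PPO and A2C can be driven through stable-baselines3:

```python
# Minimal PPO/A2C training sketch with stable-baselines3.
# "Pendulum-v1" is a placeholder environment, not the paper's reservoir model.
from stable_baselines3 import A2C, PPO

for algo in (PPO, A2C):
    model = algo("MlpPolicy", "Pendulum-v1", verbose=0)
    model.learn(total_timesteps=50_000)  # one short training run per algorithm
    model.save(f"well_control_{algo.__name__.lower()}")
```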
arXiv Detail & Related papers (2022-07-07T17:34:23Z) - Wasserstein Distributionally Robust Estimation in High Dimensions:
Performance Analysis and Optimal Hyperparameter Tuning [0.0]
We propose a Wasserstein distributionally robust estimation framework to estimate an unknown parameter from noisy linear measurements.
We focus on the task of analyzing the squared error performance of such estimators.
We show that the squared error can be recovered as the solution of a convex-concave optimization problem; the underlying robust estimation problem is sketched below.
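A hedged sketch of that robust estimation problem, in illustrative notation (the squared loss and ball radius \varepsilon are assumptions; the summary only specifies noisy linear measurements and a Wasserstein ambiguity set):

```latex
% DRO estimate of \theta from measurements y \approx x^\top \theta;
% \mathbb{P}_n is the empirical distribution of the samples (x, y).
\hat{\theta} \in \arg\min_{\theta} \;
\sup_{\mathbb{Q} \in \mathcal{B}_{\varepsilon}(\mathbb{P}_n)} \;
\mathbb{E}_{(x, y) \sim \mathbb{Q}}\!\left[\, (y - x^{\top}\theta)^{2} \,\right]
```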
arXiv Detail & Related papers (2022-06-27T13:02:59Z) - Regret-optimal Estimation and Control [52.28457815067461]
We show that the regret-optimal estimator and regret-optimal controller can be derived in state-space form.
We propose regret-optimal analogs of Model-Predictive Control (MPC) and the Extended Kalman Filter (EKF) for systems with nonlinear dynamics.
arXiv Detail & Related papers (2021-06-22T23:14:21Z) - Regret-Optimal Filtering [57.51328978669528]
We consider the problem of filtering in linear state-space models through the lens of regret optimization.
We formulate a novel criterion for filter design based on the concept of regret, defined as the difference between the estimation error energy of a causal filter and that of a clairvoyant estimator.
We show that the regret-optimal estimator can be easily implemented by solving three Riccati equations and a single Lyapunov equation; a minimal Riccati-solving sketch follows below.
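The Riccati computations such filters rest on are routine to carry out numerically. The sketch below solves a discrete algebraic Riccati equation with SciPy for an arbitrary toy system; it illustrates the machinery only, not the paper's specific three-Riccati construction:

```python
# Solve a discrete algebraic Riccati equation (DARE) for a toy system.
# A, B, Q, R are arbitrary illustrative values, not taken from the paper.
import numpy as np
from scipy.linalg import solve_discrete_are

A = np.array([[1.0, 0.1],
              [0.0, 1.0]])   # state transition
B = np.array([[0.0],
              [0.1]])        # input matrix
Q = np.eye(2)                # state weight
R = np.array([[1.0]])        # input weight

P = solve_discrete_are(A, B, Q, R)                 # stabilizing solution
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # associated feedback gain
print(P, K, sep="\n")
```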
arXiv Detail & Related papers (2021-01-25T19:06:52Z) - Stein Variational Model Predictive Control [130.60527864489168]
Decision making under uncertainty is critical to real-world, autonomous systems.
Model Predictive Control (MPC) methods have demonstrated favorable performance in practice, but remain limited when dealing with complex distributions.
We show that this framework leads to successful planning in challenging, non-convex optimal control problems.
arXiv Detail & Related papers (2020-11-15T22:36:59Z) - Reinforcement Learning for Low-Thrust Trajectory Design of
Interplanetary Missions [77.34726150561087]
This paper investigates the use of reinforcement learning for the robust design of interplanetary trajectories in the presence of severe disturbances.
An open-source implementation of the state-of-the-art algorithm Proximal Policy Optimization is adopted.
The resulting Guidance and Control Network provides both a robust nominal trajectory and the associated closed-loop guidance law.
arXiv Detail & Related papers (2020-08-19T15:22:15Z) - Online Stochastic Convex Optimization: Wasserstein Distance Variation [15.313864176694832]
We consider an online proximal-gradient method to track the minimizers of expectations of smooth convex functions.
We revisit the concepts of estimation and tracking error inspired by systems and control literature.
We provide bounds for both errors under strong convexity, Lipschitz continuity of the gradient, and bounds on the probability distribution drift; one proximal-gradient update is sketched below.
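A minimal sketch of one online proximal-gradient update, assuming an \ell_1 regularizer so that the proximal operator reduces to soft-thresholding (the summary does not fix a specific regularizer):

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def online_prox_grad_step(x, grad_t, step, lam):
    """Gradient step on the time-t smooth loss, then the prox of the regularizer."""
    return soft_threshold(x - step * grad_t(x), step * lam)

# Example: track the minimizer of a drifting quadratic f_t(x) = 0.5 * ||x - c_t||^2.
x = np.zeros(3)
for t in range(100):
    c_t = np.array([1.0, -1.0, 0.5]) + 0.01 * t       # slowly drifting target
    x = online_prox_grad_step(x, lambda z, c=c_t: z - c, step=0.5, lam=0.01)
print(x)
```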
arXiv Detail & Related papers (2020-06-02T05:23:22Z) - Adaptive Control and Regret Minimization in Linear Quadratic Gaussian
(LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z) - Improper Learning for Non-Stochastic Control [78.65807250350755]
We consider the problem of controlling a possibly unknown linear dynamical system with adversarial perturbations, adversarially chosen convex loss functions, and partially observed states.
Applying online descent to this parametrization yields a new controller which attains sublinear regret vs. a large class of closed-loop policies.
Our bounds are the first in the non-stochastic control setting that compete with all stabilizing linear dynamical controllers; the regret criterion at play is sketched below.
arXiv Detail & Related papers (2020-01-25T02:12:48Z)
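For reference, the regret criterion these online control results target can be sketched as follows (illustrative notation; \Pi is the comparator class of stabilizing linear dynamical controllers):

```latex
% Cumulative cost of the online controller minus that of the best
% fixed comparator policy in hindsight.
\mathrm{Regret}_T \;=\; \sum_{t=1}^{T} c_t(x_t, u_t)
\;-\; \min_{\pi \in \Pi} \sum_{t=1}^{T} c_t\!\left(x_t^{\pi}, u_t^{\pi}\right)
```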