Robust Control with Gradient Uncertainty
- URL: http://arxiv.org/abs/2507.15082v1
- Date: Sun, 20 Jul 2025 18:37:30 GMT
- Title: Robust Control with Gradient Uncertainty
- Authors: Qian Qi
- Abstract summary: We introduce a novel extension to robust control theory that explicitly addresses uncertainty in the value function's gradient. This work holds significant implications for fields where function approximation is common, including reinforcement learning and computational finance.
- Score: 2.1756081703276
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce a novel extension to robust control theory that explicitly addresses uncertainty in the value function's gradient, a form of uncertainty endemic to applications like reinforcement learning where value functions are approximated. We formulate a zero-sum dynamic game where an adversary perturbs both system dynamics and the value function gradient, leading to a new, highly nonlinear partial differential equation: the Hamilton-Jacobi-Bellman-Isaacs Equation with Gradient Uncertainty (GU-HJBI). We establish its well-posedness by proving a comparison principle for its viscosity solutions under a uniform ellipticity condition. Our analysis of the linear-quadratic (LQ) case yields a key insight: we prove that the classical quadratic value function assumption fails for any non-zero gradient uncertainty, fundamentally altering the problem structure. A formal perturbation analysis characterizes the non-polynomial correction to the value function and the resulting nonlinearity of the optimal control law, which we validate with numerical studies. Finally, we bridge theory to practice by proposing a novel Gradient-Uncertainty-Robust Actor-Critic (GURAC) algorithm, accompanied by an empirical study demonstrating its effectiveness in stabilizing training. This work provides a new direction for robust control, holding significant implications for fields where function approximation is common, including reinforcement learning and computational finance.
Related papers
- The Vanishing Gradient Problem for Stiff Neural Differential Equations [3.941173292703699]
In stiff systems, it has been observed that sensitivities to parameters controlling fast-decaying modes become vanishingly small during training. We show that this vanishing gradient phenomenon is not an artifact of any particular method, but a universal feature of all A-stable and L-stable stiff numerical integration schemes.
arXiv Detail & Related papers (2025-08-02T23:44:14Z)
- An optimization-based equilibrium measure describes non-equilibrium steady state dynamics: application to edge of chaos [2.5690340428649328]
Understanding neural dynamics is a central topic in machine learning, non-linear physics and neuroscience.
The dynamics is non-linear and, in particular, non-gradient, i.e., the driving force cannot be written as the gradient of a potential.
arXiv Detail & Related papers (2024-01-18T14:25:32Z)
- Stochastic Nonlinear Control via Finite-dimensional Spectral Dynamic Embedding [21.38845517949153]
This paper proposes an approach, Spectral Dynamics Embedding Control (SDEC), to optimal control for nonlinear systems. It reveals an infinite-dimensional feature representation induced by the system's nonlinear dynamics, enabling a linear representation of the state-action value function.
arXiv Detail & Related papers (2023-04-08T04:23:46Z)
- Model-Based Uncertainty in Value Functions [89.31922008981735]
We focus on characterizing the variance over values induced by a distribution over MDPs.
Previous work upper bounds the posterior variance over values by solving a so-called uncertainty Bellman equation.
We propose a new uncertainty Bellman equation whose solution converges to the true posterior variance over values.
arXiv Detail & Related papers (2023-02-24T09:18:27Z)
- Robust Fitted-Q-Evaluation and Iteration under Sequentially Exogenous Unobserved Confounders [16.193776814471768]
We study robust policy evaluation and policy optimization in the presence of sequentially-exogenous unobserved confounders.
We provide sample complexity bounds, insights, and show effectiveness both in simulations and on real-world longitudinal healthcare data of treating sepsis.
arXiv Detail & Related papers (2023-02-01T18:40:53Z)
- Asymptotic consistency of the WSINDy algorithm in the limit of continuum data [0.0]
We study the consistency of the weak-form sparse identification of nonlinear dynamics algorithm (WSINDy).
We provide a mathematically rigorous explanation for the observed robustness to noise of weak-form equation learning.
arXiv Detail & Related papers (2022-11-29T07:49:34Z)
- Learning to Optimize with Stochastic Dominance Constraints [103.26714928625582]
In this paper, we develop a simple yet efficient approach for the problem of comparing uncertain quantities.
We recast inner optimization in the Lagrangian as a learning problem for surrogate approximation, which bypasses apparent intractability.
The proposed light-SD demonstrates superior performance on several representative problems ranging from finance to supply chain management.
arXiv Detail & Related papers (2022-11-14T21:54:31Z)
- Data-Driven Influence Functions for Optimization-Based Causal Inference [105.5385525290466]
We study a constructive algorithm that approximates Gateaux derivatives for statistical functionals by finite differencing.
We study the case where probability distributions are not known a priori but need to be estimated from data.
arXiv Detail & Related papers (2022-08-29T16:16:22Z)
- Stochastic Langevin Differential Inclusions with Applications to Machine Learning [5.274477003588407]
We show some foundational results regarding the flow and properties of Langevin-type Differential Inclusions.
In particular, we show strong existence of the solution, as well as an asymptotic minimization of the canonical free-energy functional.
arXiv Detail & Related papers (2022-06-23T08:29:17Z)
- On Convergence of Training Loss Without Reaching Stationary Points [62.41370821014218]
We show that Neural Network weight variables do not converge to stationary points where the gradient of the loss function vanishes.
We propose a new perspective based on the ergodic theory of dynamical systems.
arXiv Detail & Related papers (2021-10-12T18:12:23Z)
- Contraction Theory for Nonlinear Stability Analysis and Learning-based Control: A Tutorial Overview [17.05002635077646]
Contraction theory is an analytical tool to study differential dynamics of a non-autonomous (i.e., time-varying) nonlinear system. It takes advantage of a superior property of exponential stability used in conjunction with the comparison lemma. This yields much-needed safety and stability guarantees for neural network-based control and estimation schemes.
arXiv Detail & Related papers (2021-10-01T23:03:21Z)
- Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent [55.85456985750134]
We introduce a new stability measure called on-average model stability, for which we develop novel bounds controlled by the risks of SGD iterates.
This yields generalization bounds depending on the behavior of the best model, and leads to the first-ever-known fast bounds in the low-noise setting.
To our best knowledge, this gives the first-ever-known stability and generalization bounds for SGD with even non-differentiable loss functions.
arXiv Detail & Related papers (2020-06-15T06:30:19Z)
- On dissipative symplectic integration with applications to gradient-based optimization [77.34726150561087]
We propose a geometric framework in which discretizations can be realized systematically.
We show that a generalization of symplectic integrators to nonconservative and in particular dissipative Hamiltonian systems is able to preserve rates of convergence up to a controlled error.
arXiv Detail & Related papers (2020-04-15T00:36:49Z)
- Convergence and sample complexity of gradient methods for the model-free linear quadratic regulator problem [27.09339991866556]
Model-free methods search for an optimal control for an unknown dynamical system by directly searching over the corresponding space of controllers.
We take a step towards demystifying the performance and efficiency of such methods by focusing on the gradient-flow dynamics over the set of stabilizing feedback gains; a similar result holds for the forward discretization of the ODE.
arXiv Detail & Related papers (2019-12-26T16:56:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.