A Test-Function Approach to Incremental Stability
- URL: http://arxiv.org/abs/2507.00695v1
- Date: Tue, 01 Jul 2025 11:46:52 GMT
- Title: A Test-Function Approach to Incremental Stability
- Authors: Daniel Pfrommer, Max Simchowitz, Ali Jadbabaie
- Abstract summary: The regularity of value functions, and their connection to incremental stability, can be understood in a way that is distinct from the traditional Lyapunov-based approach to certifying stability in control theory.
- Score: 33.44344966171865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a novel framework for analyzing Incremental-Input-to-State Stability ($\delta$ISS) based on the idea of using rewards as "test functions." Whereas control theory traditionally deals with Lyapunov functions that satisfy a time-decrease condition, reinforcement learning (RL) value functions are constructed by exponentially decaying a Lipschitz reward function that may be non-smooth and unbounded on both sides. Thus, these RL-style value functions cannot be directly understood as Lyapunov certificates. We develop a new equivalence between a variant of incremental input-to-state stability of a closed-loop system under a given policy, and the regularity of RL-style value functions under adversarial selection of a H\"older-continuous reward function. This result highlights that the regularity of value functions, and their connection to incremental stability, can be understood in a way that is distinct from the traditional Lyapunov-based approach to certifying stability in control theory.
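To make the abstract's objects concrete, here is a minimal numerical sketch (not taken from the paper): an RL-style value function formed by exponentially discounting a Lipschitz, sign-indefinite reward along closed-loop trajectories, together with a $\delta$ISS-flavored comparison of two trajectories started from nearby initial conditions under different input perturbations. The scalar dynamics, policy, reward, and the constants in the bound are illustrative assumptions.
```python
import numpy as np

# Toy closed-loop system x_{t+1} = f(x_t, u_t) with policy u = pi(x) + disturbance.
# System, policy, and reward are illustrative, not the paper's examples.
def f(x, u):
    return 0.9 * x + 0.5 * u          # stable linear dynamics

def pi(x):
    return -0.4 * x                   # simple stabilizing state feedback

def rollout(x0, disturbance, T):
    """Simulate the closed loop with an additive input disturbance sequence."""
    xs = [x0]
    for t in range(T):
        xs.append(f(xs[-1], pi(xs[-1]) + disturbance[t]))
    return np.array(xs)

def value(xs, reward, gamma=0.95):
    """RL-style value: exponentially discounted sum of a Lipschitz reward along a trajectory."""
    return sum(gamma ** t * reward(x) for t, x in enumerate(xs))

# A Lipschitz, non-smooth, sign-indefinite "test function" reward.
reward = lambda x: np.abs(x) - 1.0

T = 200
d1 = np.zeros(T)                                             # nominal input sequence
d2 = 0.05 * np.random.default_rng(0).standard_normal(T)      # perturbed input sequence

xs1 = rollout(1.0, d1, T)
xs2 = rollout(1.2, d2, T)

# delta-ISS-flavored check: the trajectory deviation is bounded by a geometrically
# decaying term in the initial-condition gap plus a gain on the input difference
# (constants 0.9 and 2.0 are chosen loosely for this particular closed loop).
dev = np.abs(xs1 - xs2)
bound = 0.9 ** np.arange(T + 1) * abs(xs1[0] - xs2[0]) + 2.0 * np.max(np.abs(d1 - d2))
print("values of the two rollouts:", value(xs1, reward), value(xs2, reward))
print("deviation within delta-ISS-style bound:", bool(np.all(dev <= bound + 1e-9)))
```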
Related papers
- Certifying Stability of Reinforcement Learning Policies using Generalized Lyapunov Functions [15.306107403623075]
We study the problem of certifying the stability of closed-loop systems under control policies derived from optimal control or reinforcement learning (RL).
Classical Lyapunov methods require a strict step-wise decrease in the Lyapunov function, but such a certificate is difficult to construct for a learned control policy.
We formulate an approach to learn generalized Lyapunov functions by augmenting RL value functions with neural network residual terms.
arXiv Detail & Related papers (2025-05-16T07:36:40Z)
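The entry above describes relaxing the strict step-wise Lyapunov decrease by augmenting an RL value function with a neural-network residual. Below is a minimal sketch of the kind of relaxed certificate check this enables: a candidate V built from a value-function proxy plus a small (untrained) residual network is required to decrease over a K-step window rather than at every step. The dynamics, the quadratic value proxy, the residual architecture, and the window K are illustrative assumptions, not the paper's construction.
```python
import numpy as np

rng = np.random.default_rng(1)

# Closed-loop dynamics under some fixed policy (illustrative, contractive).
def step(x):
    return 0.7 * x + 0.05 * np.tanh(x)

# Proxy for the magnitude of a learned RL value function (assumed quadratic here).
def v_value(x):
    return x ** 2

# Small neural-network residual term (random weights, untrained -- purely illustrative).
W1 = 0.1 * rng.standard_normal(8)
w2 = 0.01 * rng.standard_normal(8)

def residual(x):
    return float(w2 @ np.tanh(W1 * x))

def V(x):
    return v_value(x) + residual(x)

# Relaxed "generalized Lyapunov" check: instead of a strict one-step decrease,
# require V to decrease over a K-step window, on sampled states outside a small ball.
K, margin = 5, 1e-6
ok = True
for x0 in rng.uniform(-2.0, 2.0, size=200):
    if abs(x0) < 0.1:
        continue
    x = x0
    for _ in range(K):
        x = step(x)
    ok = ok and V(x) <= V(x0) - margin
print("K-step generalized decrease holds on all sampled states:", ok)
```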
- On the stability of Lipschitz continuous control problems and its application to reinforcement learning [1.534667887016089]
We address the crucial yet underexplored stability properties of the Hamilton--Jacobi--Bellman (HJB) equation in model-free reinforcement learning contexts.
We bridge the gap between Lipschitz continuous optimal control problems and classical optimal control problems in the viscosity solutions framework.
arXiv Detail & Related papers (2024-04-20T08:21:25Z)
- On the continuity and smoothness of the value function in reinforcement learning and optimal control [1.534667887016089]
We show that the value function is always H\"older continuous under relatively weak assumptions on the underlying system.
We also show that non-differentiable value functions can be made differentiable by slightly "disturbing" the system.
arXiv Detail & Related papers (2024-03-21T14:39:28Z)
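The entry above establishes H\"older continuity of value functions under weak assumptions. A small empirical sketch of what that regularity means is below: it compares |V(x) - V(y)| against C * |x - y|^alpha over sampled pairs for a truncated discounted-cost value function. The dynamics, cost, and constants (here alpha = 1, i.e. the Lipschitz special case of H\"older continuity) are assumptions for illustration.
```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative stable closed-loop dynamics and a Lipschitz running cost.
def step(x):
    return 0.8 * x

def cost(x):
    return abs(x)

def value(x0, gamma=0.95, T=300):
    """Discounted infinite-horizon cost, truncated at T steps."""
    v, x = 0.0, x0
    for t in range(T):
        v += gamma ** t * cost(x)
        x = step(x)
    return v

# Empirically probe a Holder-type bound |V(x) - V(y)| <= C * |x - y|**alpha.
# For this toy system V is in fact Lipschitz, i.e. Holder with alpha = 1.
alpha, C = 1.0, 25.0
pairs = rng.uniform(-3.0, 3.0, size=(500, 2))
ratios = [abs(value(x) - value(y)) / (abs(x - y) ** alpha + 1e-12) for x, y in pairs]
print("max empirical Holder ratio:", max(ratios), " holds with C = 25:", max(ratios) <= C)
```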
- Adaptive $Q$-Aid for Conditional Supervised Learning in Offline Reinforcement Learning [20.07425661382103]
$Q$-Aided Conditional Supervised Learning (QCS) combines the stability of RCSL with the stitching capability of $Q$-functions.
QCS adaptively integrates $Q$-aid into RCSL's loss function based on trajectory return.
arXiv Detail & Related papers (2024-02-03T04:17:09Z)
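The QCS entry above describes adaptively mixing a Q-function term into the return-conditioned supervised learning (RCSL) loss according to trajectory return. The sketch below shows one schematic way such an adaptive combination could look; the specific weighting rule, the toy data, and the function names are assumptions for illustration, not the paper's algorithm.
```python
import numpy as np

def rcsl_loss(pred_action, target_action):
    """Return-conditioned supervised learning term: imitate the dataset action."""
    return np.mean((pred_action - target_action) ** 2)

def q_aid_loss(q_of_pred):
    """Q-aid term: prefer predicted actions with higher estimated Q (negated Q as a loss)."""
    return -np.mean(q_of_pred)

def qcs_style_loss(pred_action, target_action, q_of_pred, traj_return, return_max):
    """Schematic adaptive combination: lean more on the Q-aid for low-return
    trajectories and on pure RCSL for near-optimal ones (assumed weighting rule)."""
    w = 1.0 - np.clip(traj_return / return_max, 0.0, 1.0)
    return rcsl_loss(pred_action, target_action) + w * q_aid_loss(q_of_pred)

# Toy usage with made-up batch data (no real networks are trained here).
pred = np.array([0.2, -0.1, 0.4])
target = np.array([0.25, 0.0, 0.3])
q_of_pred = np.array([1.0, 0.8, 1.2])
print("loss on a low-return trajectory: ", qcs_style_loss(pred, target, q_of_pred, 20.0, 100.0))
print("loss on a high-return trajectory:", qcs_style_loss(pred, target, q_of_pred, 95.0, 100.0))
```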
- Online non-parametric likelihood-ratio estimation by Pearson-divergence functional minimization [55.98760097296213]
We introduce a new framework for online non-parametric LRE (OLRE) for the setting where pairs of i.i.d. observations $(x_t \sim p, x'_t \sim q)$ are observed over time.
We provide theoretical guarantees for the performance of the OLRE method along with empirical validation in synthetic experiments.
arXiv Detail & Related papers (2023-11-03T13:20:11Z)
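The OLRE entry above concerns online likelihood-ratio estimation via Pearson-divergence functional minimization. Below is a minimal streaming sketch using the standard least-squares density-ratio criterion J(theta) = 0.5 * E_q[(theta . phi(x))^2] - E_p[theta . phi(x)] for a linear-in-features model updated by SGD on each incoming pair; the Gaussian features, step sizes, and toy distributions are assumptions, and the regularization and guarantees developed in the paper are omitted.
```python
import numpy as np

rng = np.random.default_rng(3)

# Streaming pairs x_t ~ p = N(0.5, 1) and x'_t ~ q = N(0, 1); the true ratio p/q is known.
def sample_pair():
    return rng.normal(0.5, 1.0), rng.normal(0.0, 1.0)

centers = np.linspace(-3.0, 3.0, 10)      # Gaussian RBF features (assumed model class)
def phi(x):
    return np.exp(-0.5 * (x - centers) ** 2)

theta = np.zeros_like(centers)

# Online SGD on the least-squares (Pearson-divergence) criterion
#   J(theta) = 0.5 * E_q[(theta . phi(x))^2] - E_p[theta . phi(x)]
for t in range(20000):
    xp, xq = sample_pair()
    grad = (theta @ phi(xq)) * phi(xq) - phi(xp)
    theta -= 0.05 / np.sqrt(t + 1) * grad

def r_hat(x):                              # estimated likelihood ratio p/q
    return float(theta @ phi(x))

def r_true(x):                             # exact ratio of the two Gaussians
    return float(np.exp(0.5 * x ** 2 - 0.5 * (x - 0.5) ** 2))

for x in (-1.0, 0.0, 1.0):
    print(f"x={x:+.1f}  r_hat={r_hat(x):.2f}  r_true={r_true(x):.2f}")
```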
- Confidence-Conditioned Value Functions for Offline Reinforcement Learning [86.59173545987984]
We propose a new form of Bellman backup that simultaneously learns Q-values for any degree of confidence with high probability.
We theoretically show that our learned value functions produce conservative estimates of the true value at any desired confidence.
arXiv Detail & Related papers (2022-12-08T23:56:47Z)
- Robust Reinforcement Learning in Continuous Control Tasks with Uncertainty Set Regularization [17.322284328945194]
Reinforcement learning (RL) is widely recognized to lack generalization and robustness under environmental perturbations.
We propose a new regularizer named the $\textbf{U}$ncertainty $\textbf{S}$et $\textbf{R}$egularizer (USR).
arXiv Detail & Related papers (2022-07-05T12:56:08Z)
- Robust and Adaptive Temporal-Difference Learning Using An Ensemble of Gaussian Processes [70.80716221080118]
The paper takes a generative perspective on policy evaluation via temporal-difference (TD) learning.
The OS-GPTD approach is developed to estimate the value function for a given policy by observing a sequence of state-reward pairs.
To alleviate the limited expressiveness associated with a single fixed kernel, a weighted ensemble (E) of GP priors is employed to yield an alternative scheme.
arXiv Detail & Related papers (2021-12-01T23:15:09Z)
- Gaussian Process-based Min-norm Stabilizing Controller for Control-Affine Systems with Uncertain Input Effects and Dynamics [90.81186513537777]
We propose a novel compound kernel that captures the control-affine nature of the problem.
We show that the resulting optimization problem is convex, and we call it the Gaussian Process-based Control Lyapunov Function Second-Order Cone Program (GP-CLF-SOCP).
arXiv Detail & Related papers (2020-11-14T01:27:32Z)
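The entry above casts a GP-based min-norm stabilizing controller as a second-order cone program (GP-CLF-SOCP). The sketch below is a heavily reduced deterministic special case: for a scalar control-affine system with a known control Lyapunov function and no model uncertainty, the pointwise min-norm CLF problem is a tiny QP with a closed-form solution. The dynamics, CLF, and decay rate are illustrative; the GP uncertainty terms that make the paper's problem an SOCP are omitted.
```python
import numpy as np

# Control-affine system xdot = f(x) + g(x) * u (scalar, illustrative).
def f(x):
    return x - x ** 3

def g(x):
    return 1.0

# Control Lyapunov function V(x) = x^2 / 2 and its Lie derivatives.
def V(x):
    return 0.5 * x ** 2

def LfV(x):
    return x * f(x)

def LgV(x):
    return x * g(x)

def min_norm_clf(x, alpha=1.0):
    """Pointwise min-norm control: minimize u^2 subject to LfV + LgV*u <= -alpha*V.
    For scalar u this QP has a closed-form solution (uncertainty terms omitted)."""
    a = LfV(x) + alpha * V(x)
    b = LgV(x)
    if a <= 0.0:          # CLF condition already satisfied by u = 0
        return 0.0
    return -a / b         # smallest-magnitude u that activates the constraint

# Simulate the closed loop with forward Euler integration.
x, dt = 1.5, 0.01
for _ in range(1000):
    u = min_norm_clf(x)
    x += dt * (f(x) + g(x) * u)
print("final state (should be near 0):", x)
```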
- Fine-Grained Analysis of Stability and Generalization for Stochastic Gradient Descent [55.85456985750134]
We introduce a new stability measure called on-average model stability, for which we develop novel bounds controlled by the risks of SGD iterates.
This yields generalization bounds depending on the behavior of the best model, and leads to the first-ever-known fast bounds in the low-noise setting.
To the best of our knowledge, this gives the first-ever-known stability and generalization bounds for SGD with even non-differentiable loss functions.
arXiv Detail & Related papers (2020-06-15T06:30:19Z)
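The entry above analyzes SGD through on-average model stability. A toy empirical sketch of that quantity is below: run SGD on a dataset and on copies that each replace a single example, then average the distance between the resulting parameter vectors. The least-squares objective, learning rate, and replacement scheme are illustrative assumptions.
```python
import numpy as np

rng = np.random.default_rng(4)
n, d, epochs, lr = 50, 5, 20, 0.01

w_true = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = X @ w_true + 0.1 * rng.standard_normal(n)

def run_sgd(X, y, seed=0):
    """Plain SGD on least squares; a fixed sampling seed couples the two runs being compared."""
    w = np.zeros(d)
    order_rng = np.random.default_rng(seed)
    for _ in range(epochs):
        for i in order_rng.permutation(len(y)):
            w -= lr * (X[i] @ w - y[i]) * X[i]
    return w

w_full = run_sgd(X, y)

# On-average model stability: replace one example at a time, rerun SGD, and
# average the distance between the perturbed and unperturbed final iterates.
gaps = []
for i in range(n):
    Xi, yi = X.copy(), y.copy()
    Xi[i] = rng.standard_normal(d)                        # fresh replacement example
    yi[i] = Xi[i] @ w_true + 0.1 * rng.standard_normal()
    gaps.append(np.linalg.norm(run_sgd(Xi, yi) - w_full))
print("on-average model stability (mean parameter gap):", np.mean(gaps))
```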
- Stable Reinforcement Learning with Unbounded State Space [27.053432445897016]
We consider the problem of reinforcement learning with an unbounded state space, motivated by the classical problem of scheduling in a queueing network.
Traditional policies, as well as error metrics designed for finite, bounded, or compact state spaces, require infinitely many samples to provide meaningful performance guarantees.
We propose stability as the notion of "goodness": the state dynamics under the policy should remain in a bounded region with high probability.
arXiv Detail & Related papers (2020-06-08T05:00:25Z)
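The entry above proposes "stability", meaning that the state remains in a bounded region with high probability, as the notion of goodness for unbounded state spaces. The toy single-queue simulation below estimates that probability under a serve-when-nonempty policy; the arrival and service probabilities and the candidate bound B are illustrative assumptions.
```python
import numpy as np

rng = np.random.default_rng(5)

def simulate_queue(T=100_000, arrival_p=0.4, service_p=0.6):
    """Discrete-time single queue: one arrival w.p. arrival_p per slot, one departure
    w.p. service_p when the policy serves (here: always serve if nonempty)."""
    q, lengths = 0, np.empty(T, dtype=int)
    for t in range(T):
        q += rng.random() < arrival_p
        if q > 0 and rng.random() < service_p:     # serve-when-nonempty policy
            q -= 1
        lengths[t] = q
    return lengths

lengths = simulate_queue()
B = 20                                             # candidate bounded region [0, B]
print("fraction of time the queue length stays <= B:", np.mean(lengths <= B))
```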
- Distributional Robustness and Regularization in Reinforcement Learning [62.23012916708608]
We introduce a new regularizer for empirical value functions and show that it lower bounds the Wasserstein distributionally robust value function.
It suggests using regularization as a practical tool for dealing with $\textit{external uncertainty}$ in reinforcement learning.
arXiv Detail & Related papers (2020-03-05T19:56:23Z)