Coarse-Grained Smoothness for RL in Metric Spaces
- URL: http://arxiv.org/abs/2110.12276v1
- Date: Sat, 23 Oct 2021 18:53:56 GMT
- Title: Coarse-Grained Smoothness for RL in Metric Spaces
- Authors: Omer Gottesman, Kavosh Asadi, Cameron Allen, Sam Lobel, George
Konidaris, Michael Littman
- Abstract summary: A common approach is to assume Lipschitz continuity of the Q-function.
We show that, unfortunately, this property fails to hold in many typical domains.
We propose a new coarse-grained smoothness definition that generalizes the notion of Lipschitz continuity.
- Score: 13.837098609529257
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Principled decision-making in continuous state-action spaces is impossible
without some assumptions. A common approach is to assume Lipschitz continuity
of the Q-function. We show that, unfortunately, this property fails to hold in
many typical domains. We propose a new coarse-grained smoothness definition
that generalizes the notion of Lipschitz continuity, is more widely applicable,
and allows us to compute significantly tighter bounds on Q-functions, leading
to improved learning. We provide a theoretical analysis of our new smoothness
definition, and discuss its implications and impact on control and exploration
in continuous domains.
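To make the Lipschitz-continuity assumption concrete: if Q is L-Lipschitz with respect to a metric d on the joint state-action space, every evaluated pair (x_i, Q(x_i)) yields the bound Q(x) <= Q(x_i) + L * d(x, x_i). The sketch below illustrates this standard bound that the paper generalizes; the function names, the Euclidean metric, and the known constant L are illustrative assumptions, not the paper's coarse-grained method.

```python
import numpy as np

# A minimal sketch of the classical Lipschitz upper bound on a Q-function.
# All names and the choice of Euclidean metric are assumptions for
# illustration only, not the paper's coarse-grained smoothness approach.

def lipschitz_q_upper_bound(query_sa, known_sas, known_qs, lipschitz_const):
    """Upper-bound Q at a query (state, action) point.

    If Q is L-Lipschitz under metric d, each evaluated pair x_i gives
    Q(x) <= Q(x_i) + L * d(x, x_i); the minimum over i is the tightest bound.

    query_sa:        1-D array, concatenated (state, action) query point
    known_sas:       2-D array, one evaluated (state, action) pair per row
    known_qs:        1-D array, Q-value estimates at the evaluated pairs
    lipschitz_const: assumed Lipschitz constant L of the Q-function
    """
    # Euclidean metric on the joint state-action space (one common choice).
    dists = np.linalg.norm(known_sas - query_sa, axis=1)
    # Each evaluated point yields a valid upper bound; take the tightest.
    return np.min(known_qs + lipschitz_const * dists)

# Usage: bound Q at an unseen point from three evaluated (state, action) pairs.
known = np.array([[0.0, 0.0], [1.0, 0.5], [0.2, 0.9]])
q_vals = np.array([1.0, 0.4, 0.7])
print(lipschitz_q_upper_bound(np.array([0.5, 0.5]), known, q_vals, 2.0))
```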
Related papers
- Simultaneously Solving Infinitely Many LQ Mean Field Games In Hilbert Spaces: The Power of Neural Operators [11.756276506921976]
We train neural operators (NOs) to learn the rules-to-equilibrium map from the problem data ("rules": dynamics and cost functionals) of LQ MFGs defined on separable Hilbert spaces to the corresponding equilibrium strategy. An NO trained on a small number of randomly sampled rules reliably solves unseen LQ MFG variants, even in infinite-dimensional settings. Our guarantee follows from three results: (i) local-Lipschitz estimates for the highly nonlinear rules-to-equilibrium map; (ii) a universal approximation theorem using NOs with a prespecified Lipschitz regularity;
arXiv Detail & Related papers (2025-10-22T20:40:20Z) - New Tight Bounds for SGD without Variance Assumption: A Computer-Aided Lyapunov Analysis [0.3937354192623676]
This paper contributes to a recent line of work that attempts to provide guarantees without making any variance assumption. We prove new theoretical bounds derived from the monotonicity of a simple Lyapunov energy.
arXiv Detail & Related papers (2025-05-23T14:34:46Z) - Achieving Domain-Independent Certified Robustness via Knowledge Continuity [21.993471256103085]
We present knowledge continuity, a novel definition inspired by Lipschitz continuity.
Our proposed definition yields certification guarantees that depend only on the loss function and the intermediate learned metric spaces of the neural network.
We show that knowledge continuity can be used to localize vulnerable components of a neural network.
arXiv Detail & Related papers (2024-11-03T17:37:59Z) - Taming Nonconvex Stochastic Mirror Descent with General Bregman
Divergence [25.717501580080846]
This paper revisits the convergence of Stochastic Mirror Descent (SMD) in the contemporary nonconvex optimization setting.
We develop provably convergent algorithms for the problem of training linear neural networks.
arXiv Detail & Related papers (2024-02-27T17:56:49Z) - Wasserstein Actor-Critic: Directed Exploration via Optimism for
Continuous-Actions Control [41.7453231409493]
Wasserstein Actor-Critic (WAC) is an actor-critic architecture inspired by Wasserstein Q-Learning (WQL).
WAC enforces exploration in a principled way by guiding the policy learning process with the optimization of an upper bound of the Q-value estimates (see the sketch after this list).
arXiv Detail & Related papers (2023-03-04T10:52:20Z) - Learning to Optimize with Stochastic Dominance Constraints [103.26714928625582]
In this paper, we develop a simple yet efficient approach for the problem of comparing uncertain quantities.
We recast the inner optimization of the Lagrangian as a learning problem for surrogate approximation, which bypasses the apparent intractability.
The proposed light-SD demonstrates superior performance on several representative problems ranging from finance to supply chain management.
arXiv Detail & Related papers (2022-11-14T21:54:31Z) - Instance-Dependent Generalization Bounds via Optimal Transport [51.71650746285469]
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
We derive instance-dependent generalization bounds that depend on the local Lipschitz regularity of the learned prediction function in the data space.
We empirically analyze our generalization bounds for neural networks, showing that the bound values are meaningful and capture the effect of popular regularization methods during training.
arXiv Detail & Related papers (2022-11-02T16:39:42Z) - Lipschitz Continuity Retained Binary Neural Network [52.17734681659175]
We introduce Lipschitz continuity as a rigorous criterion for defining the robustness of binary neural networks (BNNs).
We then propose retaining Lipschitz continuity via a regularization term to improve model robustness.
Our experiments show that our BNN-specific regularization method effectively strengthens BNN robustness.
arXiv Detail & Related papers (2022-07-13T22:55:04Z) - Functional Generalized Empirical Likelihood Estimation for Conditional
Moment Restrictions [19.39005034948997]
We propose a new estimation method based on generalized empirical likelihood (GEL).
GEL provides a more general framework and has been shown to enjoy favorable small-sample properties compared to GMM-based estimators.
We provide kernel- and neural network-based implementations of the estimator, which achieve state-of-the-art empirical performance on two conditional moment restriction problems.
arXiv Detail & Related papers (2022-07-11T11:02:52Z) - Beyond the Edge of Stability via Two-step Gradient Updates [49.03389279816152]
Gradient Descent (GD) is a powerful workhorse of modern machine learning.
GD's ability to find local minimisers is only guaranteed for losses with Lipschitz gradients.
This work focuses on simple, yet representative, learning problems via analysis of two-step gradient updates.
arXiv Detail & Related papers (2022-06-08T21:32:50Z) - Sparsest Univariate Learning Models Under Lipschitz Constraint [31.28451181040038]
We propose continuous-domain formulations for one-dimensional regression problems.
We control the Lipschitz constant explicitly using a user-defined upper bound.
We show that both problems admit global minimizers that are continuous and piecewise-linear.
arXiv Detail & Related papers (2021-12-27T07:03:43Z) - Fine-Grained Analysis of Stability and Generalization for Stochastic
Gradient Descent [55.85456985750134]
We introduce a new stability measure called on-average model stability, for which we develop novel bounds controlled by the risks of SGD iterates.
This yields generalization bounds depending on the behavior of the best model, and leads to the first-ever-known fast bounds in the low-noise setting.
To the best of our knowledge, this gives the first-ever-known stability and generalization bounds for SGD with even non-differentiable loss functions.
arXiv Detail & Related papers (2020-06-15T06:30:19Z) - Exactly Computing the Local Lipschitz Constant of ReLU Networks [98.43114280459271]
The local Lipschitz constant of a neural network is a useful metric for robustness, generalization, and fairness evaluation.
We show strong inapproximability results for estimating Lipschitz constants of ReLU networks, but nevertheless give an algorithm for exactly computing the local Lipschitz constant.
We leverage this algorithm to evaluate the tightness of competing Lipschitz estimators and the effects of regularized training on the Lipschitz constant.
arXiv Detail & Related papers (2020-03-02T22:15:54Z)
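The Wasserstein Actor-Critic entry above drives exploration by optimizing an upper bound on the Q-value estimates. Below is a minimal, hypothetical sketch of that optimism principle for a discrete action set; the mean/standard-deviation decomposition and every name here are assumptions for illustration, not WAC's actual architecture.

```python
import numpy as np

# Hypothetical optimism-in-the-face-of-uncertainty action selection:
# prefer the action whose upper confidence bound on Q is largest.

def optimistic_action(q_mean, q_std, optimism=2.0):
    """Return the index of the action maximizing q_mean + optimism * q_std.

    q_mean:   per-action Q-value estimates (1-D array)
    q_std:    per-action uncertainty estimates (1-D array)
    optimism: weight on the uncertainty bonus (exploration strength)
    """
    upper_bound = q_mean + optimism * q_std  # optimistic Q estimates
    return int(np.argmax(upper_bound))

# Usage: action 1 has a lower mean but high uncertainty, so optimism
# selects it (0.6 + 2 * 0.40 = 1.4 beats 0.9 + 2 * 0.05 = 1.0).
means = np.array([0.9, 0.6, 0.3])
stds = np.array([0.05, 0.40, 0.10])
print(optimistic_action(means, stds))  # -> 1
```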
This list is automatically generated from the titles and abstracts of the papers on this site.