Boosting the Confidence of Generalization for $L_2$-Stable Randomized
Learning Algorithms
- URL: http://arxiv.org/abs/2206.03834v1
- Date: Wed, 8 Jun 2022 12:14:01 GMT
- Title: Boosting the Confidence of Generalization for $L_2$-Stable Randomized
Learning Algorithms
- Authors: Xiao-Tong Yuan and Ping Li
- Abstract summary: We show that a properly designed subbagging process leads to near-tight exponential generalization bounds over the randomness of both data and algorithm.
We further apply these generic results to stochastic gradient descent (SGD) to derive improved high-probability generalization bounds for convex or non-convex optimization problems with natural time-decaying learning rates.
- Score: 41.082982732100696
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Exponential generalization bounds with near-tight rates have recently been
established for uniformly stable learning algorithms. The notion of uniform
stability, however, is stringent in the sense that it is invariant to the
data-generating distribution. Under the weaker and distribution dependent
notions of stability such as hypothesis stability and $L_2$-stability, the
literature suggests that only polynomial generalization bounds are possible in
general cases. The present paper addresses this long-standing tension between
these two regimes of results and makes progress towards relaxing it within a
classic framework of confidence-boosting. To this end, we first establish an
in-expectation first moment generalization error bound for potentially
randomized learning algorithms with $L_2$-stability, based on which we then
show that a properly designed subbagging process leads to near-tight
exponential generalization bounds over the randomness of both data and
algorithm. We further instantiate these generic results for stochastic
gradient descent (SGD) to derive improved high-probability generalization
bounds for convex or non-convex optimization problems with natural
time-decaying learning rates, which have not been possible to prove with the
existing hypothesis stability or uniform stability based results.
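To make the subbagging-based confidence-boosting recipe more concrete, the following is a minimal sketch, assuming an SGD base learner on a least-squares objective with a time-decaying step size $\eta_t = c/(t+1)$ and aggregation of the bagged models by simple averaging; the subsample size, number of bags, objective, and aggregation rule are illustrative choices, not details taken from the paper.

```python
import numpy as np

def sgd_least_squares(X, y, epochs=5, c=0.1, rng=None):
    """SGD on a least-squares objective with the natural time-decaying
    learning rate eta_t = c / (t + 1) (an illustrative schedule)."""
    rng = rng if rng is not None else np.random.default_rng()
    w = np.zeros(X.shape[1])
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            eta = c / (t + 1)                    # time-decaying step size
            grad = (X[i] @ w - y[i]) * X[i]      # gradient of 0.5*(x_i^T w - y_i)^2
            w -= eta * grad
            t += 1
    return w

def subbag(X, y, n_bags=10, m=None, rng=None):
    """Run the base learner on n_bags random subsamples of size m and
    aggregate the resulting models by averaging (one simple choice)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    n = len(y)
    m = m if m is not None else n // 2
    models = [
        sgd_least_squares(X[idx], y[idx], rng=rng)
        for idx in (rng.choice(n, size=m, replace=False) for _ in range(n_bags))
    ]
    return np.mean(models, axis=0)

# Toy usage: recover a linear model from noisy data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=200)
w_hat = subbag(X, y)
```

The paper's contribution is an analysis of the generalization error of such subbagged estimators rather than a particular implementation; this snippet only illustrates the mechanics of subsampling and aggregation.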
Related papers
- High-Probability Bounds for Stochastic Optimization and Variational
Inequalities: the Case of Unbounded Variance [59.211456992422136]
We propose algorithms with high-probability convergence results under less restrictive assumptions.
These results justify the usage of the considered methods for solving problems that do not fit standard functional classes in optimization.
arXiv Detail & Related papers (2023-02-02T10:37:23Z) - Benign Underfitting of Stochastic Gradient Descent [72.38051710389732]
We study to what extent stochastic gradient descent (SGD) may be understood as a "conventional" learning rule that achieves generalization performance by obtaining a good fit to the training data.
We analyze the closely related with-replacement SGD, for which an analogous phenomenon does not occur, and prove that its population risk does in fact converge at the optimal rate.
arXiv Detail & Related papers (2022-02-27T13:25:01Z) - Private Robust Estimation by Stabilizing Convex Relaxations [22.513117502159922]
The paper concerns $(\epsilon, \delta)$-differentially private (DP) robust estimation.
arXiv Detail & Related papers (2021-12-07T07:47:37Z) - Multi-fidelity Stability for Graph Representation Learning [38.31487722188051]
We introduce a weaker notion of uniform generalization, termed multi-fidelity stability, and give an example.
We present lower bounds for the discrepancy between the two types of stability, which justify the multi-fidelity design.
arXiv Detail & Related papers (2021-11-25T01:33:41Z) - Regularization Guarantees Generalization in Bayesian Reinforcement
Learning through Algorithmic Stability [48.62272919754204]
We study generalization in Bayesian RL under the probably approximately correct (PAC) framework.
Our main contribution is showing that by adding regularization, the optimal policy becomes stable in an appropriate sense.
arXiv Detail & Related papers (2021-09-24T07:48:34Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide a small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with an unfavorable dependence on the confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
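As a rough illustration of the kind of gradient-clipping update such methods build on (the clipping threshold and step size below are placeholders, not the stepsize rules proposed in that paper):

```python
import numpy as np

def clipped_sgd_step(w, grad, eta, lam):
    """One SGD step with gradient clipping: the stochastic gradient is
    rescaled so its norm never exceeds the threshold lam, the mechanism
    that high-probability analyses under heavy-tailed noise rely on."""
    norm = np.linalg.norm(grad)
    if norm > lam:
        grad = grad * (lam / norm)
    return w - eta * grad
```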
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Toward Better Generalization Bounds with Locally Elastic Stability [41.7030651617752]
We argue that locally elastic stability implies tighter generalization bounds than those derived based on uniform stability.
We revisit the examples of bounded support vector machines, regularized least squares regression, and gradient descent.
arXiv Detail & Related papers (2020-10-27T02:04:53Z) - Statistical optimality and stability of tangent transform algorithms in
logit models [6.9827388859232045]
We provide conditions on the data-generating process to derive non-asymptotic upper bounds on the risk incurred by the logistic optima.
In particular, we establish local variation of the algorithm without any assumptions on the data-generating process.
We explore a special case involving a semi-orthogonal design under which a global convergence is obtained.
arXiv Detail & Related papers (2020-10-25T05:15:13Z) - Fine-Grained Analysis of Stability and Generalization for Stochastic
Gradient Descent [55.85456985750134]
We introduce a new stability measure called on-average model stability, for which we develop novel bounds controlled by the risks of SGD iterates.
This yields generalization bounds depending on the behavior of the best model, and leads to the first known fast bounds in the low-noise setting.
To the best of our knowledge, this gives the first known stability and generalization bounds for SGD with even non-differentiable loss functions.
arXiv Detail & Related papers (2020-06-15T06:30:19Z)