Stability and Sharper Risk Bounds with Convergence Rate $O(1/n^2)$
- URL: http://arxiv.org/abs/2410.09766v1
- Date: Sun, 13 Oct 2024 07:50:47 GMT
- Title: Stability and Sharper Risk Bounds with Convergence Rate $O(1/n^2)$
- Authors: Bowei Zhu, Shaojie Li, Yong Liu
- Abstract summary: The sharpest known high probability excess risk bounds are up to $O\left( 1/n \right)$ for empirical risk minimization and projected gradient descent via algorithmic stability.
- Score: 23.380477456114118
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The sharpest known high probability excess risk bounds are up to $O\left( 1/n \right)$ for empirical risk minimization and projected gradient descent via algorithmic stability (Klochkov \& Zhivotovskiy, 2021). In this paper, we show that high probability excess risk bounds of order up to $O\left( 1/n^2 \right)$ are possible. We discuss how high probability excess risk bounds reach $O\left( 1/n^2 \right)$ under strong convexity, smoothness, and Lipschitz continuity assumptions for empirical risk minimization, projected gradient descent, and stochastic gradient descent. In addition, to the best of our knowledge, our high probability results on the generalization gap measured by gradients for nonconvex problems are also the sharpest.
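As a concrete illustration of the algorithmic setting (a minimal sketch, not the paper's construction or proof technique): projected gradient descent on an $\ell_2$-regularized logistic empirical risk, which is strongly convex and smooth, and Lipschitz on a bounded feasible ball. The objective, ball radius, step size, and synthetic data below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def projected_gradient_descent(X, y, radius=1.0, reg=0.1, lr=0.1, steps=500):
    """Projected gradient descent for the L2-regularized logistic empirical risk,
    constrained to the Euclidean ball of the given radius (illustrative sketch)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        margins = y * (X @ w)                                    # y_i * <x_i, w>
        sig = 1.0 / (1.0 + np.exp(margins))                      # sigmoid(-margin)
        grad = -(X * (y * sig)[:, None]).mean(axis=0) + reg * w  # logistic + ridge gradient
        w = w - lr * grad                                        # gradient step
        norm = np.linalg.norm(w)
        if norm > radius:                                        # Euclidean projection onto the ball
            w = w * (radius / norm)
    return w

# Toy usage on synthetic data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sign(X @ rng.normal(size=5) + 0.1 * rng.normal(size=200))
w_hat = projected_gradient_descent(X, y)
```

The projection step keeps the iterates inside the bounded set on which the Lipschitz assumption holds; the excess risk of the returned iterate is the quantity that bounds of this kind control.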
Related papers
- Optimal Excess Risk Bounds for Empirical Risk Minimization on $p$-Norm Linear Regression [19.31269916674961]
We show that, in the realizable case, under no moment assumptions, $O(d)$ samples are enough to exactly recover the target.
We extend this result to the case $p \in (1, 2)$ under mild assumptions that guarantee the existence of the Hessian of the risk at its minimizer.
arXiv Detail & Related papers (2023-10-19T03:21:28Z) - Risk Estimation in a Markov Cost Process: Lower and Upper Bounds [3.1484174280822845]
We tackle the problem of estimating risk measures of the infinite-horizon discounted cost within a Markov cost process.
The risk measures we study include variance, Value-at-Risk (VaR), and Conditional Value-at-Risk (CVaR).
arXiv Detail & Related papers (2023-10-17T16:35:39Z) - Regret Distribution in Stochastic Bandits: Optimal Trade-off between
Expectation and Tail Risk [22.843623578307707]
We study the trade-off between expectation and tail risk for regret distribution in the multi-armed bandit problem.
We show how the order of expected regret exactly affects the decay rate of the regret tail probability for both the worst-case and instance-dependent scenarios.
arXiv Detail & Related papers (2023-04-10T01:00:18Z) - Estimating the minimizer and the minimum value of a regression function
under passive design [72.85024381807466]
We propose a new method for estimating the minimizer $\boldsymbol{x}^*$ and the minimum value $f^*$ of a smooth and strongly convex regression function $f$.
We derive non-asymptotic upper bounds for the quadratic risk and optimization error of $\boldsymbol{z}_n$, and for the risk of estimating $f^*$.
arXiv Detail & Related papers (2022-11-29T18:38:40Z) - High Probability Bounds for a Class of Nonconvex Algorithms with AdaGrad
Stepsize [55.0090961425708]
We propose a new, simplified high probability analysis of AdaGrad for smooth, non-convex problems.
We present our analysis in a modular way and obtain a complementary $\mathcal{O}(1/T)$ convergence rate in the deterministic setting.
To the best of our knowledge, this is the first high probability result for AdaGrad with a truly adaptive scheme, i.e., one completely oblivious to the knowledge of smoothness.
arXiv Detail & Related papers (2022-04-06T13:50:33Z) - High-probability Bounds for Non-Convex Stochastic Optimization with
Heavy Tails [55.561406656549686]
We consider non-convex stochastic optimization using first-order algorithms for which the gradient estimates may have heavy tails.
We show that a combination of gradient clipping, momentum, and normalized gradient descent converges to critical points in high probability with the best-known rates for smooth losses (a schematic sketch of such an update appears after this list).
arXiv Detail & Related papers (2021-06-28T00:17:01Z) - Stability and Deviation Optimal Risk Bounds with Convergence Rate
$O(1/n)$ [4.1499725848998965]
We show a high probability excess risk bound with the rate $O(\log n/n)$ for strongly convex and Lipschitz losses valid for any empirical risk minimization method.
We discuss how $O(\log n/n)$ high probability excess risk bounds are possible for projected gradient descent in the case of strongly convex and Lipschitz losses without the usual smoothness assumption.
arXiv Detail & Related papers (2021-03-22T17:28:40Z) - Sharp Statistical Guarantees for Adversarially Robust Gaussian
Classification [54.22421582955454]
We provide the first result of the optimal minimax guarantees for the excess risk for adversarially robust classification.
Results are stated in terms of the Adversarial Signal-to-Noise Ratio (AdvSNR), which generalizes a similar notion for standard linear classification to the adversarial setting.
arXiv Detail & Related papers (2020-06-29T21:06:52Z) - Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff
in Regret [115.85354306623368]
We study risk-sensitive reinforcement learning in episodic Markov decision processes with unknown transition kernels.
We propose two provably efficient model-free algorithms, Risk-Sensitive Value Iteration (RSVI) and Risk-Sensitive Q-learning (RSQ).
We prove that RSVI attains an $\tilde{O}\big(\lambda(|\beta| H^2) \cdot \sqrt{H^3 S^2 A T}\big)$ regret, while RSQ attains an $\tilde{O}\big(\lambda \ldots$
arXiv Detail & Related papers (2020-06-22T19:28:26Z) - A Brief Prehistory of Double Descent [75.37825440319975]
Belkin et al. illustrate and discuss the shape of risk curves in the context of modern high-complexity learners.
With increasing $N$, the risk initially decreases, attains a minimum, and then increases until $N$ equals $n$, where the training data is fitted perfectly.
arXiv Detail & Related papers (2020-04-07T09:41:24Z)
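As referenced in the heavy-tails entry above, the sketch below illustrates the kind of first-order update that combines momentum with gradient clipping/normalization. It is an illustrative assumption of this summary, not the cited paper's exact algorithm; the toy objective, noise model, and constants are placeholders.

```python
import numpy as np

def clipped_momentum_sgd(grad_fn, w0, lr=0.05, beta=0.9, clip=1.0, steps=1000, seed=0):
    """Momentum SGD whose update direction is rescaled (clipped) to a bounded norm,
    limiting the influence of heavy-tailed gradient noise on any single step."""
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    m = np.zeros_like(w)
    for _ in range(steps):
        g = grad_fn(w, rng)                               # stochastic gradient estimate
        m = beta * m + (1.0 - beta) * g                   # momentum (exponential average)
        norm = np.linalg.norm(m)
        step = m if norm <= clip else m * (clip / norm)   # clip/normalize the update direction
        w = w - lr * step
    return w

# Toy usage: a smooth non-convex objective with heavy-tailed (Student-t) gradient noise.
def noisy_grad(w, rng):
    clean = 4.0 * w * (np.sum(w**2) - 1.0)                # gradient of (||w||^2 - 1)^2
    return clean + rng.standard_t(df=2, size=w.shape)     # heavy-tailed noise term

w_hat = clipped_momentum_sgd(noisy_grad, w0=np.ones(3))
```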
This list is automatically generated from the titles and abstracts of the papers in this site.