Stability and Deviation Optimal Risk Bounds with Convergence Rate $O(1/n)$
- URL: http://arxiv.org/abs/2103.12024v1
- Date: Mon, 22 Mar 2021 17:28:40 GMT
- Title: Stability and Deviation Optimal Risk Bounds with Convergence Rate $O(1/n)$
- Authors: Yegor Klochkov and Nikita Zhivotovskiy
- Abstract summary: We show a high probability excess risk bound with the rate $O(\log n/n)$ for strongly convex and Lipschitz losses valid for \emph{any} empirical risk minimization method.
We discuss how $O(\log n/n)$ high probability excess risk bounds are possible for projected gradient descent in the case of strongly convex and Lipschitz losses without the usual smoothness assumption.
- Score: 4.1499725848998965
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The sharpest known high probability generalization bounds for uniformly
stable algorithms (Feldman, Vondr\'{a}k, 2018, 2019), (Bousquet, Klochkov,
Zhivotovskiy, 2020) contain a generally inevitable sampling error term of order
$\Theta(1/\sqrt{n})$. When applied to excess risk bounds, this leads to
suboptimal results in several standard stochastic convex optimization problems.
We show that if the so-called Bernstein condition is satisfied, the term
$\Theta(1/\sqrt{n})$ can be avoided, and high probability excess risk bounds of
order up to $O(1/n)$ are possible via uniform stability. Using this result, we
show a high probability excess risk bound with the rate $O(\log n/n)$ for
strongly convex and Lipschitz losses valid for \emph{any} empirical risk
minimization method. This resolves a question of Shalev-Shwartz, Shamir,
Srebro, and Sridharan (2009). We discuss how $O(\log n/n)$ high probability
excess risk bounds are possible for projected gradient descent in the case of
strongly convex and Lipschitz losses without the usual smoothness assumption.
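For orientation, a common form of the Bernstein condition reads as follows; this is a standard formulation, and the paper's exact constants and statement may differ:

$$\mathbb{E}\big(\ell(w, z) - \ell(w^{*}, z)\big)^{2} \;\le\; B\,\big(R(w) - R(w^{*})\big) \quad \text{for all } w,$$

where $R(w) = \mathbb{E}\,\ell(w, z)$ is the population risk and $w^{*}$ its minimizer. For an $L$-Lipschitz loss with a $\lambda$-strongly convex population risk, the condition holds with $B = 2L^{2}/\lambda$: Lipschitzness gives $|\ell(w,z) - \ell(w^{*},z)| \le L\|w - w^{*}\|$, while strong convexity gives $R(w) - R(w^{*}) \ge (\lambda/2)\|w - w^{*}\|^{2}$.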
Related papers
- Stability and Sharper Risk Bounds with Convergence Rate $O(1/n^2)$ [23.380477456114118]
The sharpest known high probability excess risk bounds are up to $O\left(1/n\right)$ for empirical risk minimization and projected gradient descent via algorithmic stability.
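For reference, a minimal sketch of projected (sub)gradient descent on a strongly convex, Lipschitz (possibly non-smooth) objective; the $1/(\lambda t)$ stepsize is a standard illustrative choice, not necessarily the schedule analyzed in the paper:

```python
import numpy as np

def projected_subgradient(subgrad, project, w0, lam, steps=2000):
    """Projected subgradient descent with stepsize 1/(lam * t), a standard
    schedule for a lam-strongly convex, Lipschitz objective."""
    w = w0.copy()
    for t in range(1, steps + 1):
        w = project(w - subgrad(w) / (lam * t))
    return w

# toy usage: lam-strongly convex regularized hinge loss (non-smooth),
# constrained to the unit euclidean ball
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = np.sign(X @ np.array([1.0, -1.0, 0.5]))
lam = 0.1
subgrad = lambda w: -(X.T @ (y * ((1 - y * (X @ w)) > 0))) / len(y) + lam * w
project = lambda w: w / max(1.0, float(np.linalg.norm(w)))
w_hat = projected_subgradient(subgrad, project, np.zeros(3), lam)
```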
arXiv Detail & Related papers (2024-10-13T07:50:47Z)
- Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound of the iteration cost function or an upper bound of the expected length for the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z)
- Optimal Excess Risk Bounds for Empirical Risk Minimization on $p$-Norm Linear Regression [19.31269916674961]
We show that, in the realizable case, under no moment assumptions, $O(d)$ samples are enough to exactly recover the target.
We extend this result to the case $p \in (1, 2)$ under mild assumptions that guarantee the existence of the Hessian of the risk at its minimizer.
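As a hedged toy illustration of the noiseless realizable case (the paper's estimator and assumptions may differ): when $y = Xw^{*}$ and the $d$ samples are linearly independent, any $p$-norm empirical risk minimizer attains zero loss and recovers the target exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
w_star = rng.normal(size=d)
X = rng.normal(size=(d, d))    # O(d) samples; here exactly d, a.s. invertible
y = X @ w_star                 # realizable case: no noise
w_hat = np.linalg.solve(X, y)  # the unique zero-loss point for any p-norm ERM
assert np.allclose(w_hat, w_star)
```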
arXiv Detail & Related papers (2023-10-19T03:21:28Z)
- Estimating Optimal Policy Value in General Linear Contextual Bandits [50.008542459050155]
In many bandit problems, the maximal reward achievable by a policy is often unknown in advance.
We consider the problem of estimating the optimal policy value in the sublinear data regime before the optimal policy is even learnable.
We present a more practical, computationally efficient algorithm that estimates a problem-dependent upper bound on $V^{*}$.
arXiv Detail & Related papers (2023-02-19T01:09:24Z)
- Private Stochastic Optimization With Large Worst-Case Lipschitz Parameter [14.040676498310198]
We study differentially private (DP) stochastic optimization (SO) with loss functions whose worst-case Lipschitz parameter over all data may be infinite.
For smooth loss functions, we provide linear-time algorithms with state-of-the-art excess risk.
We also provide the first algorithm to handle non-smooth convex loss functions.
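As a generic, hedged illustration of the kind of noisy first-order method used in DP stochastic optimization (this is standard DP-SGD-style clipping and noising, not necessarily the paper's algorithm; the clip norm and noise scale are illustrative):

```python
import numpy as np

def dp_sgd(grad_fn, data, w0, steps=200, lr=0.1, clip=1.0, sigma=1.0, seed=0):
    """DP-SGD-style sketch: clip each per-example gradient to norm <= clip,
    then add Gaussian noise. In a real implementation sigma is calibrated
    to an (epsilon, delta) budget via a privacy accountant."""
    rng = np.random.default_rng(seed)
    w = w0.copy()
    for _ in range(steps):
        g = grad_fn(w, data[rng.integers(len(data))])
        g = g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))
        w = w - lr * (g + sigma * clip * rng.normal(size=w.shape))
    return w

# toy usage: private mean estimation with squared loss, grad = w - x
data = np.random.default_rng(1).normal(3.0, 1.0, size=(500, 2))
w_hat = dp_sgd(lambda w, x: w - x, data, w0=np.zeros(2))
```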
arXiv Detail & Related papers (2022-09-15T16:03:23Z)
- Sharper Utility Bounds for Differentially Private Models [20.246768861331276]
The first $\mathcal{O}\big(\frac{\sqrt{p}}{n\epsilon}\big)$ high probability excess population risk bound for differentially private algorithms.
The newly proposed algorithm m-NGP improves the performance of differentially private models on real datasets.
arXiv Detail & Related papers (2022-04-22T07:03:13Z)
- High Probability Bounds for a Class of Nonconvex Algorithms with AdaGrad Stepsize [55.0090961425708]
We propose a new, simplified high probability analysis of AdaGrad for smooth, nonconvex problems.
We present our analysis in a modular way and obtain a complementary $\mathcal{O}(1/T)$ convergence rate in the deterministic setting.
To the best of our knowledge, this is the first high probability result for AdaGrad with a truly adaptive scheme, i.e., one completely oblivious to the knowledge of smoothness.
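For reference, a minimal sketch of the scalar AdaGrad(-norm) stepsize such analyses target; the variant and constants here are illustrative assumptions:

```python
import numpy as np

def adagrad_norm(grad_fn, x0, steps=1000, eta=1.0, eps=1e-8):
    """Scalar AdaGrad-norm: the stepsize eta / sqrt(sum of squared gradient
    norms) adapts on the fly, using no knowledge of the smoothness constant."""
    x = x0.copy()
    acc = 0.0  # running sum of squared gradient norms
    for _ in range(steps):
        g = grad_fn(x)
        acc += float(np.dot(g, g))
        x = x - (eta / np.sqrt(acc + eps)) * g
    return x

# toy usage: a smooth nonconvex objective f(x) = sum(x_i^2 + sin(x_i))
x_min = adagrad_norm(lambda x: 2 * x + np.cos(x), np.full(3, 2.0))
```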
arXiv Detail & Related papers (2022-04-06T13:50:33Z)
- Sharp Statistical Guarantees for Adversarially Robust Gaussian Classification [54.22421582955454]
We provide the first result on optimal minimax guarantees for the excess risk of adversarially robust classification.
Results are stated in terms of the Adversarial Signal-to-Noise Ratio (AdvSNR), which generalizes a similar notion for standard linear classification to the adversarial setting.
arXiv Detail & Related papers (2020-06-29T21:06:52Z)
- Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(\mathsf{OPT}^{1/2}) + \epsilon$.
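As a hedged sketch of the setup, gradient descent on the empirical square loss of a single ReLU neuron; the initialization and stepsize here are illustrative assumptions:

```python
import numpy as np

def train_single_relu(X, y, steps=500, lr=0.1, seed=0):
    """Gradient descent on the empirical square loss of a single ReLU
    neuron f(x) = max(0, <w, x>)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = rng.normal(scale=1.0 / np.sqrt(d), size=d)
    for _ in range(steps):
        pred = np.maximum(X @ w, 0.0)
        # (sub)gradient of the mean squared loss, with ReLU' = 1{pred > 0}
        w = w - lr * (X.T @ ((pred - y) * (pred > 0.0))) / n
    return w

# toy usage: labels from a planted teacher neuron (realizable case)
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 5))
y = np.maximum(X @ rng.normal(size=5), 0.0)
w_hat = train_single_relu(X, y)
```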
arXiv Detail & Related papers (2020-05-29T07:20:35Z)
- Curse of Dimensionality on Randomized Smoothing for Certifiable Robustness [151.67113334248464]
We show that extending the smoothing technique to defend against other attack models can be challenging.
We present experimental results on CIFAR to validate our theory.
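For context, a minimal sketch of the standard Gaussian randomized smoothing predictor (Cohen et al., 2019) that such analyses build on; the base classifier and noise level are illustrative:

```python
import numpy as np

def smoothed_predict(base_classifier, x, sigma=0.25, n_samples=1000, seed=0):
    """Majority vote of a base classifier over Gaussian perturbations of
    the input -- the standard randomized smoothing predictor."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(scale=sigma, size=(n_samples,) + x.shape)
    votes = np.array([base_classifier(x + eps) for eps in noise])
    classes, counts = np.unique(votes, return_counts=True)
    return classes[np.argmax(counts)]

# toy usage: a linear two-class base classifier on 3-dimensional inputs
base = lambda z: int(np.sum(z) > 0.0)
label = smoothed_predict(base, np.array([0.2, -0.1, 0.4]))
```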
arXiv Detail & Related papers (2020-02-08T22:02:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.