Black-Box Uniform Stability for Non-Euclidean Empirical Risk Minimization
- URL: http://arxiv.org/abs/2412.15956v1
- Date: Fri, 20 Dec 2024 14:50:47 GMT
- Title: Black-Box Uniform Stability for Non-Euclidean Empirical Risk Minimization
- Authors: Simon Vary, David Martínez-Rubio, Patrick Rebeschini
- Abstract summary: We study first-order algorithms that are uniformly stable for empirical risk minimization problems. We propose a black-box reduction method that turns an optimization algorithm for smooth convex losses into a uniformly stable learning algorithm.
- Score: 15.334392442475115
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study first-order algorithms that are uniformly stable for empirical risk minimization (ERM) problems that are convex and smooth with respect to $p$-norms, $p \geq 1$. We propose a black-box reduction method that, by employing properties of uniformly convex regularizers, turns an optimization algorithm for H\"older smooth convex losses into a uniformly stable learning algorithm with optimal statistical risk bounds on the excess risk, up to a constant factor depending on $p$. Achieving a black-box reduction for uniform stability was posed as an open question by (Attia and Koren, 2022), which had solved the Euclidean case $p=2$. We explore applications that leverage non-Euclidean geometry in addressing binary classification problems.
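For context, a standard formulation of uniform stability and the stability-to-generalization argument the abstract alludes to is sketched below; the exact constants, the treatment of Hölder smoothness, and the $p$-dependent factors in the paper may differ.

```latex
% Sketch of the standard definitions (not the paper's exact statements).
% An algorithm A mapping a sample S = (z_1, \dots, z_n) to a hypothesis
% A(S) is \varepsilon-uniformly stable if, for all samples S, S' that
% differ in a single example and all test points z,
\[
  \sup_{z}\,\bigl|\ell(A(S); z) - \ell(A(S'); z)\bigr| \;\le\; \varepsilon .
\]
% Uniform stability bounds the expected generalization gap,
% \mathbb{E}[F(A(S)) - F_S(A(S))] \le \varepsilon, where F is the population
% risk and F_S the empirical risk, so the excess risk splits into an
% optimization term plus a stability term:
\[
  \mathbb{E}\bigl[F(A(S))\bigr] - \min_{w} F(w)
  \;\le\;
  \mathbb{E}\Bigl[F_S(A(S)) - \min_{w} F_S(w)\Bigr] \;+\; \varepsilon .
\]
```

Balancing these two terms, an optimization error that shrinks with more iterations against a stability term that typically grows with them, is roughly what a black-box reduction of this kind has to automate.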
Related papers
- Trust-Region Sequential Quadratic Programming for Stochastic Optimization with Random Models [57.52124921268249]
We propose a Trust-Region Sequential Quadratic Programming method to find both first- and second-order stationary points.
To converge to first-order stationary points, our method computes a gradient step in each iteration, defined by minimizing a quadratic approximation of the objective subject to a trust-region constraint.
To converge to second-order stationary points, our method additionally computes an eigen step to explore the negative curvature of the reduced Hessian matrix.
arXiv Detail & Related papers (2024-09-24T04:39:47Z)
- Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity [50.25258834153574]
We focus on the class of (strongly) convex $(L_0,L_1)$-smooth functions and derive new convergence guarantees for several existing methods.
In particular, we derive improved convergence rates for Gradient Descent with Smoothed Gradient Clipping and for Gradient Descent with Polyak Stepsizes.
arXiv Detail & Related papers (2024-09-23T13:11:37Z)
- Riemannian stochastic optimization methods avoid strict saddle points [68.80251170757647]
We show that policies under study avoid strict saddle points / submanifolds with probability 1.
This result provides an important sanity check as it shows that, almost always, the limit state of an algorithm can only be a local minimizer.
arXiv Detail & Related papers (2023-11-04T11:12:24Z)
- Stability and Generalization for Markov Chain Stochastic Gradient Methods [49.981789906200035]
We provide a comprehensive generalization analysis of MC-SGMs for both minimization and minimax problems.
We establish the optimal excess population risk bounds for both smooth and non-smooth cases.
We develop the first nearly optimal convergence rates for convex-concave problems both in expectation and with high probability.
arXiv Detail & Related papers (2022-09-16T15:42:51Z)
- Uniform Stability for First-Order Empirical Risk Minimization [33.593203156666746]
We consider the problem of designing uniformly stable first-order optimization algorithms for empirical risk minimization.
For Euclidean geometry, we suggest a black-box conversion which, given a smooth optimization algorithm, produces a uniformly stable version of the algorithm while maintaining its convergence rate up to logarithmic factors.
For more general geometries, we develop a variant of Mirror Descent for smooth optimization with convergence rate $\widetilde{O}(1/T)$ and uniform stability $O(T/n)$, leaving open the question of devising a general conversion method as in the Euclidean case (see the mirror-descent sketch after this list).
arXiv Detail & Related papers (2022-07-17T18:53:50Z)
- Non-Euclidean Differentially Private Stochastic Convex Optimization [15.302167005107135]
We show that noisy stochastic gradient descent (SGD) algorithms attain the optimal excess risk in low-dimensional regimes.
Our work draws upon concepts from the geometry of normed spaces, such as the notions of regularity, uniform convexity, and uniform smoothness.
arXiv Detail & Related papers (2021-03-01T19:48:44Z)
- Large-Scale Methods for Distributionally Robust Optimization [53.98643772533416]
We prove that our algorithms require a number of gradient evaluations independent of the training set size and the number of parameters.
Experiments on MNIST and ImageNet confirm the theoretical scaling of our algorithms, which are 9--36 times more efficient than full-batch methods.
arXiv Detail & Related papers (2020-10-12T17:41:44Z)
- For interpolating kernel machines, minimizing the norm of the ERM solution minimizes stability [20.775719987269003]
We show that the interpolating solution with minimum norm minimizes a bound on $\mathrm{CV}_{\mathrm{loo}}$ stability.
Under the assumption of a random kernel, the corresponding test error should be expected to follow a double descent curve.
arXiv Detail & Related papers (2020-06-28T05:54:03Z)
- Optimal Epoch Stochastic Gradient Descent Ascent Methods for Min-Max Optimization [61.66927778694731]
The epoch gradient descent method (a.k.a. Epoch-GD) proposed by Hazan and Kale (2011) was deemed a breakthrough for stochastic strongly convex minimization.
Our result is the first one that shows Epoch-GDA can achieve the optimal rate of $O(1/T)$ for the duality gap of strongly-convex-strongly-concave (SCSC) min-max problems.
arXiv Detail & Related papers (2020-02-13T02:16:57Z)
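As an illustration of the non-Euclidean first-order machinery discussed above, the following is a minimal, unconstrained mirror-descent sketch in $\ell_p$ geometry ($1 < p \le 2$) with the mirror map $\psi(x) = \tfrac{1}{2}\|x\|_p^2$. It is not the algorithm of the papers listed here; the uniformly convex regularization and any stability-inducing modifications are omitted, and all function names, step sizes, and the toy objective are illustrative assumptions.

```python
# Minimal mirror-descent sketch in l_p geometry (1 < p <= 2), not the
# papers' algorithm. Mirror map: psi(x) = 0.5 * ||x||_p^2, which is
# (p - 1)-strongly convex with respect to ||.||_p.
import numpy as np

def grad_psi(x, p):
    """Gradient of psi(x) = 0.5 * ||x||_p^2 (primal -> dual map)."""
    norm = np.linalg.norm(x, ord=p)
    if norm == 0.0:
        return np.zeros_like(x)
    return norm ** (2 - p) * np.sign(x) * np.abs(x) ** (p - 1)

def grad_psi_star(theta, p):
    """Gradient of the conjugate psi*(theta) = 0.5 * ||theta||_q^2 (dual -> primal map)."""
    q = p / (p - 1)  # dual exponent, 1/p + 1/q = 1
    norm = np.linalg.norm(theta, ord=q)
    if norm == 0.0:
        return np.zeros_like(theta)
    return norm ** (2 - q) * np.sign(theta) * np.abs(theta) ** (q - 1)

def mirror_descent(grad_f, x0, p=1.5, step=0.1, iters=200):
    """Unconstrained mirror descent: dual gradient step, then map back; returns the averaged iterate."""
    x = x0.copy()
    avg = np.zeros_like(x0)
    for _ in range(iters):
        theta = grad_psi(x, p) - step * grad_f(x)  # gradient step in the dual space
        x = grad_psi_star(theta, p)                # map back to the primal space
        avg += x
    return avg / iters  # averaging is the usual device behind O(1/T)-type rates

if __name__ == "__main__":
    # Toy smooth convex objective f(x) = 0.5 * ||x - b||_2^2 (hypothetical example).
    b = np.array([1.0, -2.0, 0.5])
    x_hat = mirror_descent(lambda x: x - b, x0=np.zeros(3), p=1.5)
    print(x_hat)  # the averaged iterate should be close to the minimizer b
```

In stability-oriented analyses, the same skeleton is typically run on a regularized empirical risk; only the plain smooth objective is shown here.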
This list is automatically generated from the titles and abstracts of the papers on this site.