High-Probability Bounds for Stochastic Optimization and Variational
Inequalities: the Case of Unbounded Variance
- URL: http://arxiv.org/abs/2302.00999v2
- Date: Tue, 18 Jul 2023 14:19:07 GMT
- Title: High-Probability Bounds for Stochastic Optimization and Variational
Inequalities: the Case of Unbounded Variance
- Authors: Abdurakhmon Sadiev, Marina Danilova, Eduard Gorbunov, Samuel Horváth, Gauthier Gidel, Pavel Dvurechensky, Alexander Gasnikov, Peter Richtárik
- Abstract summary: We propose algorithms with high-probability convergence results under less restrictive assumptions.
These results justify the usage of the considered methods for solving problems that do not fit standard functional classes in optimization.
- Score: 59.211456992422136
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: During recent years the interest of optimization and machine learning
communities in high-probability convergence of stochastic optimization methods
has been growing. One of the main reasons for this is that high-probability
complexity bounds are more accurate and less studied than in-expectation ones.
However, SOTA high-probability non-asymptotic convergence results are derived
under strong assumptions such as the boundedness of the gradient noise variance
or of the objective's gradient itself. In this paper, we propose several
algorithms with high-probability convergence results under less restrictive
assumptions. In particular, we derive new high-probability convergence results
under the assumption that the gradient/operator noise has bounded central
$\alpha$-th moment for $\alpha \in (1,2]$ in the following setups: (i) smooth
non-convex / Polyak-Lojasiewicz / convex / strongly convex / quasi-strongly
convex minimization problems, (ii) Lipschitz / star-cocoercive and monotone /
quasi-strongly monotone variational inequalities. These results justify the
usage of the considered methods for solving problems that do not fit standard
functional classes studied in stochastic optimization.
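The methods analyzed here rely on gradient clipping (see also the clipping-based entries under Related papers). Below is a minimal sketch, not the authors' exact algorithm or parameter schedule: a generic clipped-SGD loop on an illustrative quadratic objective with Student-t gradient noise, whose variance is unbounded while its central $\alpha$-th moment stays finite for $\alpha < 1.5$. The objective, stepsize, clipping level, and noise model are assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)


def clip(g, lam):
    """Project the vector g onto the Euclidean ball of radius lam (gradient clipping)."""
    norm = np.linalg.norm(g)
    return g if norm <= lam else (lam / norm) * g


def clipped_sgd(grad_oracle, x0, stepsize=0.01, clip_level=1.0, n_iters=2000):
    """Generic clipped-SGD loop: x_{k+1} = x_k - stepsize * clip(g_k, clip_level)."""
    x = np.array(x0, dtype=float)
    for _ in range(n_iters):
        x = x - stepsize * clip(grad_oracle(x), clip_level)
    return x


def noisy_grad(x):
    # Gradient of f(x) = 0.5 * ||x||^2 plus Student-t noise with 1.5 degrees of freedom:
    # the noise has infinite variance, but its central alpha-th moment is finite for alpha < 1.5.
    return x + rng.standard_t(df=1.5, size=x.shape)


x_hat = clipped_sgd(noisy_grad, x0=np.ones(5))
print(np.linalg.norm(x_hat))  # distance to the minimizer (the origin)
```

In the paper, stepsizes and clipping levels are chosen as functions of the horizon, the confidence level, and problem constants; the constants above are placeholders for illustration only.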
Related papers
- Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity [50.25258834153574]
We focus on the class of (strongly) convex $(L_0,L_1)$-smooth functions and derive new convergence guarantees for several existing methods.
In particular, we derive improved convergence rates for Gradient Descent with Smoothed Gradient Clipping and for Gradient Descent with Polyak Stepsizes.
arXiv Detail & Related papers (2024-09-23T13:11:37Z) - High-Probability Convergence for Composite and Distributed Stochastic Minimization and Variational Inequalities with Heavy-Tailed Noise [96.80184504268593]
Gradient clipping is one of the key algorithmic ingredients for deriving good high-probability guarantees under heavy-tailed noise.
However, clipping can spoil the convergence of popular methods for composite and distributed optimization.
arXiv Detail & Related papers (2023-10-03T07:49:17Z) - Convex and Non-convex Optimization Under Generalized Smoothness [69.69521650503431]
Analyses of convex and non-convex optimization methods often require the Lipschitzness of the gradient, which limits the class of functions they cover.
Recent work generalizes this setting via a non-uniform smoothness condition.
arXiv Detail & Related papers (2023-06-02T04:21:59Z) - SGD with AdaGrad Stepsizes: Full Adaptivity with High Probability to
Unknown Parameters, Unbounded Gradients and Affine Variance [33.593203156666746]
We study Stochastic Gradient Descent (SGD) with AdaGrad stepsizes, a popular adaptive (self-tuning) method for first-order stochastic optimization.
We obtain sharp rates of convergence in both the low-noise and high-noise regimes.
arXiv Detail & Related papers (2023-02-17T09:46:08Z) - Clipped Stochastic Methods for Variational Inequalities with
Heavy-Tailed Noise [64.85879194013407]
We prove the first high-probability results with logarithmic dependence on the confidence level for methods for solving monotone and structured non-monotone VIPs.
Our results match the best-known ones in the light-tails case and are novel for structured non-monotone problems.
In addition, we numerically validate that the gradient noise of many practical formulations is heavy-tailed and show that clipping improves the performance of SEG/SGDA (a minimal sketch of a clipped SEG update appears after this list).
arXiv Detail & Related papers (2022-06-02T15:21:55Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide a small objective residual with high probability.
Existing methods for non-smooth stochastic convex optimization have complexity bounds with unfavorable dependence on the confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Halpern Iteration for Near-Optimal and Parameter-Free Monotone Inclusion
and Strong Solutions to Variational Inequalities [14.848525762485872]
We leverage the connections between nonexpansive maps, monotone Lipschitz operators, and proximal mappings to obtain near-optimal solutions to monotone inclusion problems.
These results translate into near-optimal guarantees for approximating strong solutions to variational inequality problems, approximating convex-concave min-max optimization problems, and minimizing the norm of the gradient in min-max optimization problems.
arXiv Detail & Related papers (2020-02-20T17:12:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.