Some Primal-Dual Theory for Subgradient Methods for Strongly Convex Optimization
- URL: http://arxiv.org/abs/2305.17323v4
- Date: Thu, 27 Jun 2024 02:53:30 GMT
- Title: Some Primal-Dual Theory for Subgradient Methods for Strongly Convex Optimization
- Authors: Benjamin Grimmer, Danlin Li,
- Abstract summary: We consider subgradient methods for strongly convex but potentially nonsmooth non-Lipschitz optimization.
We provide new equivalent dual descriptions for the classic subgradient method, the proximal subgradient method, and the switching subgradient method.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider (stochastic) subgradient methods for strongly convex but potentially nonsmooth non-Lipschitz optimization. We provide new equivalent dual descriptions (in the style of dual averaging) for the classic subgradient method, the proximal subgradient method, and the switching subgradient method. These equivalences enable $O(1/T)$ convergence guarantees in terms of both their classic primal gap and a not previously analyzed dual gap for strongly convex optimization. Consequently, our theory provides these classic methods with simple, optimal stopping criteria and optimality certificates at no added computational cost. Our results apply to a wide range of stepsize selections and of non-Lipschitz ill-conditioned problems where the early iterations of the subgradient method may diverge exponentially quickly (a phenomenon which, to the best of our knowledge, no prior works address). Even in the presence of such undesirable behaviors, our theory still ensures and bounds eventual convergence.
Related papers
- Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity [50.25258834153574]
We focus on the class of (strongly) convex $(L0)$-smooth functions and derive new convergence guarantees for several existing methods.
In particular, we derive improved convergence rates for Gradient Descent with smoothnessed Gradient Clipping and for Gradient Descent with Polyak Stepsizes.
arXiv Detail & Related papers (2024-09-23T13:11:37Z) - A Unified Analysis for the Subgradient Methods Minimizing Composite
Nonconvex, Nonsmooth and Non-Lipschitz Functions [8.960341489080609]
We present a novel convergence analysis in context of non-Lipschitz and nonsmooth optimization problems.
Under any of the subgradient upper bounding conditions to be introduced in the paper, we show that $O (1/stqrT)$ holds in terms of the square gradient of the envelope function, which further improves to be $O (1/T)$ if, in addition, the uniform KL condition with $1/2$ exponents holds.
arXiv Detail & Related papers (2023-08-30T23:34:11Z) - DRSOM: A Dimension Reduced Second-Order Method [13.778619250890406]
Under a trust-like framework, our method preserves the convergence of the second-order method while using only information in a few directions.
Theoretically, we show that the method has a local convergence and a global convergence rate of $O(epsilon-3/2)$ to satisfy the first-order and second-order conditions.
arXiv Detail & Related papers (2022-07-30T13:05:01Z) - Faster Algorithm and Sharper Analysis for Constrained Markov Decision
Process [56.55075925645864]
The problem of constrained decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints.
A new utilities-dual convex approach is proposed with novel integration of three ingredients: regularized policy, dual regularizer, and Nesterov's gradient descent dual.
This is the first demonstration that nonconcave CMDP problems can attain the lower bound of $mathcal O (1/epsilon)$ for all complexity optimization subject to convex constraints.
arXiv Detail & Related papers (2021-10-20T02:57:21Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Leveraging Non-uniformity in First-order Non-convex Optimization [93.6817946818977]
Non-uniform refinement of objective functions leads to emphNon-uniform Smoothness (NS) and emphNon-uniform Lojasiewicz inequality (NL)
New definitions inspire new geometry-aware first-order methods that converge to global optimality faster than the classical $Omega (1/t2)$ lower bounds.
arXiv Detail & Related papers (2021-05-13T04:23:07Z) - Recent Theoretical Advances in Non-Convex Optimization [56.88981258425256]
Motivated by recent increased interest in analysis of optimization algorithms for non- optimization in deep networks and other problems in data, we give an overview of recent results of theoretical optimization algorithms for non- optimization.
arXiv Detail & Related papers (2020-12-11T08:28:51Z) - Adaptive Gradient Methods Can Be Provably Faster than SGD after Finite
Epochs [25.158203665218164]
We show that adaptive gradient methods can be faster than random shuffling SGD after finite time.
To the best of our knowledge, it is the first to demonstrate that adaptive gradient methods can be faster than SGD after finite time.
arXiv Detail & Related papers (2020-06-12T09:39:47Z) - Fast Objective & Duality Gap Convergence for Non-Convex Strongly-Concave
Min-Max Problems with PL Condition [52.08417569774822]
This paper focuses on methods for solving smooth non-concave min-max problems, which have received increasing attention due to deep learning (e.g., deep AUC)
arXiv Detail & Related papers (2020-06-12T00:32:21Z) - A Unified Convergence Analysis for Shuffling-Type Gradient Methods [32.8097849940763]
We propose a unified convergence analysis for a generic gradient shufflingtype methods for solving finitesum problems.
Our results suggest some choices appropriate for training rates in certain neural shuffling variants.
arXiv Detail & Related papers (2020-02-19T15:45:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.