A Unified Analysis for the Subgradient Methods Minimizing Composite
Nonconvex, Nonsmooth and Non-Lipschitz Functions
- URL: http://arxiv.org/abs/2308.16362v1
- Date: Wed, 30 Aug 2023 23:34:11 GMT
- Title: A Unified Analysis for the Subgradient Methods Minimizing Composite
Nonconvex, Nonsmooth and Non-Lipschitz Functions
- Authors: Daoli Zhu and Lei Zhao and Shuzhong Zhang
- Abstract summary: We present a novel convergence analysis in the context of non-Lipschitz and nonsmooth optimization problems.
Under any of the subgradient upper bounding conditions to be introduced in the paper, we show that an $O(1/\sqrt{T})$ rate holds in terms of the squared gradient norm of the Moreau envelope function, which further improves to $O(1/T)$ if, in addition, the uniform KL condition with exponent $1/2$ holds.
- Score: 8.960341489080609
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we propose a proximal subgradient method (Prox-SubGrad) for
solving nonconvex and nonsmooth optimization problems without assuming
Lipschitz continuity conditions. A number of subgradient upper bounds and their
relationships are presented. By means of these upper bounding conditions, we
establish some uniform recursive relations for the Moreau envelopes for weakly
convex optimization. This uniform scheme simplifies and unifies the proof
schemes to establish the rate of convergence for Prox-SubGrad without assuming
Lipschitz continuity. We present a novel convergence analysis in this context.
Furthermore, we propose some new stochastic subgradient upper bounding
conditions and establish convergence and iteration complexity rates for the
stochastic subgradient method (Sto-SubGrad) to solve non-Lipschitz and
nonsmooth stochastic optimization problems. In particular, for both
deterministic and stochastic subgradient methods on weakly convex optimization
problems without Lipschitz continuity, under any of the subgradient upper
bounding conditions to be introduced in the paper, we show that an $O(1/\sqrt{T})$
convergence rate holds in terms of the squared norm of the gradient of the Moreau
envelope function, which further improves to $O(1/T)$ if, in addition, the uniform
KL condition with exponent $1/2$ holds.
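To make the iteration concrete, here is a minimal sketch (a toy instantiation assumed here, not the paper's exact scheme or step-size rule) of a proximal subgradient update $x_{k+1} = \mathrm{prox}_{\gamma_k r}(x_k - \gamma_k g_k)$ with $g_k \in \partial f(x_k)$, applied to $f(x) = \tfrac{1}{2}\|Ax-b\|^2$ plus an $\ell_1$ regularizer. The step-size scaling and the prox-gradient-mapping surrogate for stationarity are illustrative assumptions; the paper's guarantees are stated for the gradient of the Moreau envelope under its specific upper bounding conditions.

```python
import numpy as np

def soft_threshold(z, tau):
    """Prox of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def prox_subgrad(A, b, mu=0.1, T=2000, seed=0):
    """Toy proximal subgradient loop for phi(x) = 0.5*||Ax - b||^2 + mu*||x||_1.
    The smooth part has an unbounded gradient, so phi is not Lipschitz continuous."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[1])
    L_est = np.linalg.norm(A, 2) ** 2            # rough curvature estimate, used only to scale steps
    gamma = 1.0 / (L_est * np.sqrt(T))           # constant step size of order 1/sqrt(T)
    for _ in range(T):
        g = A.T @ (A @ x - b)                    # (sub)gradient of the smooth part at x_k
        x = soft_threshold(x - gamma * g, gamma * mu)   # proximal subgradient step
    # Stationarity surrogate reported here: norm of the prox-gradient mapping.
    # (The paper's rates are stated for the gradient of the Moreau envelope.)
    lam = 1.0 / L_est
    g = A.T @ (A @ x - b)
    mapping_norm = np.linalg.norm((x - soft_threshold(x - lam * g, lam * mu)) / lam)
    return x, mapping_norm

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A, b = rng.standard_normal((50, 20)), rng.standard_normal(50)
    x_hat, stat = prox_subgrad(A, b)
    print(f"prox-gradient mapping norm at output: {stat:.3e}")
```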
Related papers
- Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity [50.25258834153574]
We focus on the class of (strongly) convex $(L_0,L_1)$-smooth functions and derive new convergence guarantees for several existing methods.
In particular, we derive improved convergence rates for Gradient Descent with Smoothed Gradient Clipping and for Gradient Descent with Polyak Stepsizes.
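For intuition, here is a minimal sketch of the two step-size rules named in this entry, plain norm clipping and the Polyak step size, for gradient descent on a generic objective. The smoothed clipping operator and the $(L_0,L_1)$-smoothness analysis of the cited paper are not reproduced, and the toy quadratic example is an assumption made here.

```python
import numpy as np

def gd_clipped(grad_f, x0, eta=0.1, c=1.0, T=500):
    """Gradient descent with norm clipping: step = eta * min(1, c/||g||) * g."""
    x = np.array(x0, dtype=float)
    for _ in range(T):
        g = grad_f(x)
        x = x - eta * min(1.0, c / (np.linalg.norm(g) + 1e-12)) * g
    return x

def gd_polyak(f, grad_f, x0, f_star=0.0, T=500):
    """Gradient descent with the Polyak step size (f(x) - f*) / ||g||^2."""
    x = np.array(x0, dtype=float)
    for _ in range(T):
        g = grad_f(x)
        x = x - (f(x) - f_star) / (np.dot(g, g) + 1e-12) * g
    return x

# Toy example: f(x) = 0.5*||x||^2 with minimum value f* = 0.
x1 = gd_clipped(lambda x: x, np.ones(5))
x2 = gd_polyak(lambda x: 0.5 * x @ x, lambda x: x, np.ones(5))
```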
arXiv Detail & Related papers (2024-09-23T13:11:37Z)
- Convex and Non-convex Optimization Under Generalized Smoothness [69.69521650503431]
Classical analysis of convex and non-convex optimization methods often requires Lipschitz continuity of the gradient, which limits the analysis to functions bounded by quadratics.
Recent work relaxes this requirement via the non-uniform smoothness condition.
arXiv Detail & Related papers (2023-06-02T04:21:59Z)
- Revisiting Subgradient Method: Complexity and Convergence Beyond Lipschitz Continuity [24.45688490844496]
The subgradient method is one of the most fundamental algorithmic schemes for nonsmooth optimization.
In this work, we first extend the typical iteration complexity results for the subgradient method to cover non-Lipschitz convex and weakly convex minimization.
arXiv Detail & Related papers (2023-05-23T15:26:36Z)
- Projective Proximal Gradient Descent for A Class of Nonconvex Nonsmooth Optimization Problems: Fast Convergence Without Kurdyka-Lojasiewicz (KL) Property [19.988762532185884]
Nonconvex and nonsmooth optimization problems are important and challenging for machine learning.
This paper presents a new analysis showing fast convergence of PPGD.
arXiv Detail & Related papers (2023-04-20T17:39:24Z)
- High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance [59.211456992422136]
We propose algorithms with high-probability convergence results under less restrictive assumptions.
These results justify the usage of the considered methods for solving problems that do not fit standard functional classes in optimization.
arXiv Detail & Related papers (2023-02-02T10:37:23Z)
- Gradient-Free Methods for Deterministic and Stochastic Nonsmooth Nonconvex Optimization [94.19177623349947]
Nonsmooth nonconvex optimization problems emerge in machine learning and business decision making.
Two core challenges impede the development of efficient methods with finite-time convergence guarantees.
Two-phase versions of GFM and SGFM are also proposed and proven to achieve improved large-deviation results.
arXiv Detail & Related papers (2022-09-12T06:53:24Z)
- Randomized Coordinate Subgradient Method for Nonsmooth Composite Optimization [11.017632675093628]
Coordinate-type subgradient methods for addressing nonsmooth problems are relatively underexplored because of the Lipschitz-type assumptions that existing analyses typically require.
arXiv Detail & Related papers (2022-06-30T02:17:11Z)
- Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
We consider the smooth convex-concave bilinearly-coupled saddle-point problem, $\min_{\mathbf{x}}\max_{\mathbf{y}} F(\mathbf{x}) + H(\mathbf{x},\mathbf{y}) - G(\mathbf{y})$, where one has access to first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$.
We present an accelerated gradient-extragradient (AG-EG) descent-ascent algorithm that combines extragradient updates with acceleration.
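For intuition, here is a minimal sketch of a plain (non-accelerated) extragradient descent-ascent update for this objective with the bilinear coupling $H(\mathbf{x},\mathbf{y}) = \mathbf{x}^\top B \mathbf{y}$. The quadratic choices of $F$ and $G$ and the step size are illustrative assumptions; the cited AG-EG method additionally incorporates acceleration, which is not reproduced here.

```python
import numpy as np

def extragradient_saddle(B, grad_F, grad_G, x, y, eta=0.05, T=1000):
    """Plain extragradient on L(x, y) = F(x) + x^T B y - G(y): descent in x, ascent in y."""
    for _ in range(T):
        # extrapolation (half) step at the current point
        x_half = x - eta * (grad_F(x) + B @ y)
        y_half = y + eta * (B.T @ x - grad_G(y))
        # full step using gradients evaluated at the half point
        x = x - eta * (grad_F(x_half) + B @ y_half)
        y = y + eta * (B.T @ x_half - grad_G(y_half))
    return x, y

# Toy instance: F(x) = 0.5*||x||^2, G(y) = 0.5*||y||^2, small random coupling B.
rng = np.random.default_rng(0)
B = 0.1 * rng.standard_normal((10, 10))
x_star, y_star = extragradient_saddle(B, lambda x: x, lambda y: y, np.ones(10), np.ones(10))
```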
arXiv Detail & Related papers (2022-06-17T06:10:20Z)
- Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process [56.55075925645864]
The problem of the constrained Markov decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints.
A new dual convex approach is proposed with a novel integration of three ingredients: a regularized policy, a dual regularizer, and Nesterov's gradient descent on the dual.
This is the first demonstration that nonconcave CMDP problems can attain the complexity lower bound of $\mathcal{O}(1/\epsilon)$ for convex optimization subject to convex constraints.
arXiv Detail & Related papers (2021-10-20T02:57:21Z)
- Cyclic Coordinate Dual Averaging with Extrapolation [22.234715500748074]
We introduce a new block coordinate method that applies to the general class of variational inequality (VI) problems with monotone operators.
The resulting convergence bounds match the optimal convergence bounds of full gradient methods.
For $m$ coordinate blocks, the resulting gradient Lipschitz constant in our bounds is never larger than a factor $\sqrt{m}$ compared to the traditional Euclidean Lipschitz constant.
arXiv Detail & Related papers (2021-02-26T00:28:58Z)