Convergence Analysis of the PAGE Stochastic Algorithm for Weakly Convex Finite-Sum Optimization
- URL: http://arxiv.org/abs/2509.00737v2
- Date: Sat, 20 Sep 2025 10:41:35 GMT
- Title: Convergence Analysis of the PAGE Stochastic Algorithm for Weakly Convex Finite-Sum Optimization
- Authors: Laurent Condat, Peter Richtárik,
- Abstract summary: The algorithm was designed to find stationary points of averages of work of this type.<n>It provides a continuous non-smooth case ($tautauL$) which improves between the general framework ($tautauL$) and the rate of change.
- Score: 56.57092765118707
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: PAGE, a stochastic algorithm introduced by Li et al. [2021], was designed to find stationary points of averages of smooth nonconvex functions. In this work, we study PAGE in the broad framework of $\tau$-weakly convex functions, which provides a continuous interpolation between the general nonconvex $L$-smooth case ($\tau = L$) and the convex case ($\tau = 0$). We establish new convergence rates for PAGE, showing that its complexity improves as $\tau$ decreases.
Related papers
- Stochastic Momentum Methods for Non-smooth Non-Convex Finite-Sum Coupled Compositional Optimization [68.22688819802622]
We propose a new state-of-the-art complexity of $O(/epsilon)$ for finding an (nearly) $'level KKT solution.<n>By applying our hinge-of-the-art complexity of $O(/epsilon)$ for finding an (nearly) $'level KKT solution, we achieve a new state-of-the-art complexity of $O(/epsilon)$ for finding an (nearly) $'level KKT solution.
arXiv Detail & Related papers (2025-06-03T06:31:59Z) - Adaptive Gradient Normalization and Independent Sampling for (Stochastic) Generalized-Smooth Optimization [19.000530691874516]
We show that existing algorithms are not fully adapted to generalized nonsmooth geometry.<n>Experiments show the fast convergence of sampling problems with our algorithm.
arXiv Detail & Related papers (2024-10-17T21:52:00Z) - Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization [77.3396841985172]
We provide a unified analysis of two-timescale gradient ascent (TTGDA) for solving structured non minimax optimization problems.<n>Our contribution is to design TTGDA algorithms are effective beyond the setting.
arXiv Detail & Related papers (2024-08-21T20:14:54Z) - A Near-Optimal Single-Loop Stochastic Algorithm for Convex Finite-Sum Coupled Compositional Optimization [53.14532968909759]
We introduce an efficient single-loop primal-dual block-coordinate algorithm called ALEXR.<n>We establish the convergence rates of ALEXR in both convex and strongly convex cases under smoothness and non-smoothness conditions.<n> Experimental results on GDRO and partial Area Under the ROC Curve for cFCCO demonstrate the promising performance of our algorithm.
arXiv Detail & Related papers (2023-12-04T19:00:07Z) - Non-Smooth Weakly-Convex Finite-sum Coupled Compositional Optimization [42.861002114813864]
This paper investigates new families of compositional optimization problems, called $linebf n$on-underline underlinebf sakly underlinebf c$ompositional $underlineunderline.
arXiv Detail & Related papers (2023-10-05T01:01:09Z) - On Convergence of Incremental Gradient for Non-Convex Smooth Functions [63.51187646914962]
In machine learning and network optimization, algorithms like shuffle SGD are popular due to minimizing the number of misses and good cache.
This paper delves into the convergence properties SGD algorithms with arbitrary data ordering.
arXiv Detail & Related papers (2023-05-30T17:47:27Z) - Decentralized Inexact Proximal Gradient Method With Network-Independent
Stepsizes for Convex Composite Optimization [39.352542703876104]
This paper considers decentralized convex composite optimization over undirected and connected networks.
A novel CTA (Combine-Then-Adapt)-based decentralized algorithm is proposed under uncoordinated network-independent constant stepsizes.
arXiv Detail & Related papers (2023-02-07T03:50:38Z) - Adaptive Stochastic Optimisation of Nonconvex Composite Objectives [2.1700203922407493]
We propose and analyse a family of generalised composite mirror descent algorithms.
With adaptive step sizes, the proposed algorithms converge without requiring prior knowledge of the problem.
We exploit the low-dimensional structure of the decision sets for high-dimensional problems.
arXiv Detail & Related papers (2022-11-21T18:31:43Z) - Randomized Coordinate Subgradient Method for Nonsmooth Composite
Optimization [11.017632675093628]
Coordinate-type subgradient methods for addressing nonsmooth problems are relatively underexplored due to the set of properties of the Lipschitz-type assumption.
arXiv Detail & Related papers (2022-06-30T02:17:11Z) - Stochastic Zeroth order Descent with Structured Directions [10.604744518360464]
We introduce and analyze Structured Zeroth order Descent (SSZD), a finite difference approach that approximates a gradient on a set $lleq d directions, where $d is the dimension of the ambient space.
For convex convex we prove almost sure convergence of functions on $O( (d/l) k-c1/2$)$ for every $c1/2$, which is arbitrarily close to the one of the Gradient Descent (SGD) in terms of one number of iterations.
arXiv Detail & Related papers (2022-06-10T14:00:06Z) - Accelerated Primal-Dual Gradient Method for Smooth and Convex-Concave
Saddle-Point Problems with Bilinear Coupling [84.47780064014262]
We study a linear convex-concave saddle-point problem $min_xmax_y f(x) ytopmathbfA x - g(y)
arXiv Detail & Related papers (2021-12-30T20:31:46Z) - Optimal Rates for Random Order Online Optimization [60.011653053877126]
We study the citetgarber 2020online, where the loss functions may be chosen by an adversary, but are then presented online in a uniformly random order.
We show that citetgarber 2020online algorithms achieve the optimal bounds and significantly improve their stability.
arXiv Detail & Related papers (2021-06-29T09:48:46Z) - Lower Complexity Bounds of Finite-Sum Optimization Problems: The Results
and Construction [18.65143269806133]
We consider Proximal Incremental First-order (PIFO) algorithms which have access to gradient and proximal oracle for each individual component.
We develop a novel approach for constructing adversarial problems, which partitions the tridiagonal matrix of classical examples into $n$ groups.
arXiv Detail & Related papers (2021-03-15T11:20:31Z) - A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis
and Application to Actor-Critic [142.1492359556374]
Bilevel optimization is a class of problems which exhibit a two-level structure.
We propose a two-timescale approximation (TTSA) algorithm for tackling such a bilevel problem.
We show that a two-timescale natural actor-critic policy optimization algorithm can be viewed as a special case of our TTSA framework.
arXiv Detail & Related papers (2020-07-10T05:20:02Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $mathcaltilde O(t-1/4)$ rate of convergence for the norm of the gradient of Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.