Revisiting Extragradient-Type Methods -- Part 1: Generalizations and Sublinear Convergence Rates
- URL: http://arxiv.org/abs/2409.16859v1
- Date: Wed, 25 Sep 2024 12:14:05 GMT
- Title: Revisiting Extragradient-Type Methods -- Part 1: Generalizations and Sublinear Convergence Rates
- Authors: Quoc Tran-Dinh, Nghia Nguyen-Trung
- Abstract summary: This paper presents a comprehensive analysis of the well-known extragradient (EG) method for solving both equations and inclusions.
We analyze both sublinear ``best-iterate'' and ``last-iterate'' convergence rates for the entire class of algorithms.
We extend our EG framework above to ``monotone'' inclusions, introducing a new class of algorithms and its corresponding convergence results.
- Score: 6.78476672849813
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper presents a comprehensive analysis of the well-known extragradient (EG) method for solving both equations and inclusions. First, we unify and generalize EG for [non]linear equations to a wider class of algorithms, encompassing various existing schemes and potentially new variants. Next, we analyze both sublinear ``best-iterate'' and ``last-iterate'' convergence rates for the entire class of algorithms, and derive new convergence results for two well-known instances. Second, we extend our EG framework above to ``monotone'' inclusions, introducing a new class of algorithms and its corresponding convergence results. Third, we also unify and generalize Tseng's forward-backward-forward splitting (FBFS) method to a broader class of algorithms to solve [non]linear inclusions when a weak-Minty solution exists, and establish its ``best-iterate'' convergence rate. Fourth, to complete our picture, we also investigate sublinear rates of two other common variants of EG using our EG analysis framework developed here: the reflected forward-backward splitting and the golden ratio methods. Finally, we conduct an extensive numerical experiment to validate our theoretical findings. Our results demonstrate that several new variants of our proposed algorithms outperform existing schemes in the majority of examples.
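The classical EG scheme that the paper generalizes uses two operator evaluations per iteration: an extrapolation (prediction) step followed by a correction step. Below is a minimal Python sketch of this textbook update for an operator equation F(x) = 0, not the paper's generalized variants; the step size eta, the residual stopping rule, and the skew-symmetric test operator are illustrative assumptions.

```python
import numpy as np

def extragradient(F, x0, eta=0.1, max_iter=1000, tol=1e-8):
    """Classical extragradient iteration for an operator equation F(x) = 0.

    A minimal sketch (not the paper's generalized schemes):
        y_k     = x_k - eta * F(x_k)   # prediction / extrapolation step
        x_{k+1} = x_k - eta * F(y_k)   # correction step, re-evaluating F at y_k
    """
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        y = x - eta * F(x)                 # prediction step
        x = x - eta * F(y)                 # correction step
        if np.linalg.norm(F(x)) <= tol:    # residual-based stopping rule (an assumption)
            return x, k + 1
    return x, max_iter

# Toy monotone (skew-symmetric) operator F(x) = A x: plain forward steps
# x - eta * F(x) diverge here, while EG converges to the solution x* = 0.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
x_star, iters = extragradient(lambda x: A @ x, x0=[1.0, 1.0], eta=0.5)
print(f"converged to {x_star} after {iters} iterations")
```

On this rotation example, the plain forward step x_{k+1} = x_k - eta * F(x_k) diverges for any positive step size, while the extra correction step makes the iteration contract; this is the behavior that EG-type methods are designed to exploit.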
Related papers
- Sublinear Convergence Rates of Extragradient-Type Methods: A Survey on
Classical and Recent Developments [12.995632804090198]
The extragradient (EG) method is a well-known scheme for approximating solutions of saddle-point problems.
Recently, these methods have gained popularity due to new applications in machine learning and robust optimization.
We provide a unified convergence analysis for different classes of algorithms, with an emphasis on sublinear best-iterate and last-iterate convergence rates.
arXiv Detail & Related papers (2023-03-30T07:04:22Z) - Efficient Alternating Minimization Solvers for Wyner Multi-View
Unsupervised Learning [0.0]
We propose two novel formulations that enable the development of computationally efficient solvers based on the alternating minimization principle.
The proposed solvers offer computational efficiency, theoretical convergence guarantees, favorable complexity in the number of views, and exceptional accuracy compared with state-of-the-art techniques.
arXiv Detail & Related papers (2023-03-28T10:17:51Z) - Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling its differentiable and non-differentiable parts separately, linearizing only the smooth components.
arXiv Detail & Related papers (2023-02-24T18:41:48Z) - On the Convergence of Distributed Stochastic Bilevel Optimization
Algorithms over a Network [55.56019538079826]
Bilevel optimization has been applied to a wide variety of machine learning models.
Most existing algorithms are restricted to the single-machine setting and therefore cannot handle distributed data.
We develop novel decentralized bilevel optimization algorithms based on a gradient tracking communication mechanism and two different gradient estimators.
arXiv Detail & Related papers (2022-06-30T05:29:52Z) - First-Order Algorithms for Nonlinear Generalized Nash Equilibrium
Problems [88.58409977434269]
We consider the problem of computing an equilibrium in a class of nonlinear generalized Nash equilibrium problems (NGNEPs).
Our contribution is to provide two simple first-order algorithmic frameworks based on the quadratic penalty method and the augmented Lagrangian method.
We provide nonasymptotic theoretical guarantees for these algorithms.
arXiv Detail & Related papers (2022-04-07T00:11:05Z) - Amortized Implicit Differentiation for Stochastic Bilevel Optimization [53.12363770169761]
We study a class of algorithms for solving bilevel optimization problems in both deterministic and stochastic settings.
We exploit a warm-start strategy to amortize the estimation of the exact gradient.
By using this framework, our analysis shows that these algorithms match the computational complexity of methods that have access to an unbiased estimate of the gradient.
arXiv Detail & Related papers (2021-11-29T15:10:09Z) - Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose stoc-BiO, a novel stochastic bilevel optimization algorithm built on a sample-efficient gradient estimator.
arXiv Detail & Related papers (2020-10-15T18:09:48Z) - A Unified Analysis of Stochastic Gradient Methods for Nonconvex
Federated Optimization [16.714109768541785]
We provide a single analysis covering all SGD variants that satisfy our assumptions in the nonconvex regime.
We also provide a unified analysis for obtaining faster linear convergence in the nonconvex regime under the PL condition.
arXiv Detail & Related papers (2020-06-12T08:58:03Z) - Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
arXiv Detail & Related papers (2020-02-21T17:45:32Z) - A Unified Convergence Analysis for Shuffling-Type Gradient Methods [32.8097849940763]
We propose a unified convergence analysis for generic shuffling-type gradient methods for solving finite-sum problems.
Our results suggest appropriate choices of learning rates for training neural networks with certain shuffling variants.
arXiv Detail & Related papers (2020-02-19T15:45:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.