VFOG: Variance-Reduced Fast Optimistic Gradient Methods for a Class of Nonmonotone Generalized Equations
- URL: http://arxiv.org/abs/2508.16791v1
- Date: Fri, 22 Aug 2025 20:46:29 GMT
- Title: VFOG: Variance-Reduced Fast Optimistic Gradient Methods for a Class of Nonmonotone Generalized Equations
- Authors: Quoc Tran-Dinh, Nghia Nguyen-Trung
- Abstract summary: We develop a novel optimistic gradient-type algorithmic framework, combining both Nesterov's acceleration and variance-reduction techniques. We show that our method achieves $\mathcal{O}(1/k^2)$ convergence rates in expectation on the squared norm of the residual under Lipschitz continuity. We show that the sequence of iterates of our method almost surely converges to a solution of the underlying problem.
- Score: 3.6997773420183866
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We develop a novel optimistic gradient-type algorithmic framework, combining both Nesterov's acceleration and variance-reduction techniques, to solve a class of generalized equations involving possibly nonmonotone operators in data-driven applications. Our framework covers a wide class of stochastic variance-reduced schemes, including mini-batching as well as unbiased and biased control-variate estimators. We establish that our method achieves $\mathcal{O}(1/k^2)$ convergence rates in expectation on the squared norm of the residual under Lipschitz continuity and a ``co-hypomonotonicity-type'' assumption, improving upon non-accelerated counterparts by a factor of $1/k$. We also prove faster $o(1/k^2)$ convergence rates, both in expectation and almost surely. In addition, we show that the sequence of iterates of our method almost surely converges to a solution of the underlying problem. We demonstrate the applicability of our method using general error bound criteria, covering mini-batch stochastic estimators as well as three well-known control-variate estimators: loopless SVRG, SAGA, and loopless SARAH; these three variants attain significantly better oracle complexity than existing methods. We validate our framework and theoretical results through two numerical examples. The preliminary results illustrate promising performance of our accelerated method over its non-accelerated counterparts.
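The abstract names the ingredients of the framework (an optimistic-gradient step, Nesterov-type acceleration, and a variance-reduced operator estimator) rather than the exact update rule, which is given in the paper. The sketch below is only a schematic illustration of how one of the covered estimators, loopless SVRG, can be plugged into an optimistic-gradient iteration with a simple extrapolation term on a toy monotone finite-sum equation $G(x) = 0$; the toy operator, step size, momentum weight, and snapshot probability are assumptions for illustration and are not the paper's actual VFOG parameters or guarantees.

```python
import numpy as np

# Schematic sketch only: a loopless-SVRG operator estimator plugged into an
# optimistic-gradient step with a simple extrapolation term. The paper's actual
# VFOG update rule and parameter schedule differ; the toy operator, step size
# eta, momentum weight, and snapshot probability p below are stand-ins.

rng = np.random.default_rng(0)
n, d = 200, 10

# Toy finite-sum operator G(x) = (1/n) * sum_i (A_i x + b_i), with each
# A_i = I + (small skew-symmetric part), so the averaged operator is monotone.
A = rng.standard_normal((n, d, d))
A = np.eye(d) + 0.1 * (A - np.transpose(A, (0, 2, 1)))
b = rng.standard_normal((n, d))

def G_i(x, i):                       # one component operator G_i(x) = A_i x + b_i
    return A[i] @ x + b[i]

def G_full(x):                       # full operator, used for snapshots and monitoring
    return np.mean(A @ x + b, axis=0)

def vr_optimistic_gradient(x0, eta=0.2, p=0.2, iters=3000):
    x, x_prev = x0.copy(), x0.copy()
    w, Gw = x0.copy(), G_full(x0)    # loopless-SVRG snapshot point and its full value
    g_prev = Gw.copy()
    for k in range(iters):
        y = x + (k / (k + 3.0)) * (x - x_prev)       # extrapolation (schematic weight)
        i = rng.integers(n)
        g = G_i(y, i) - G_i(w, i) + Gw               # variance-reduced estimate of G(y)
        x_prev, x = x, y - eta * (2.0 * g - g_prev)  # optimistic ("past-gradient") step
        g_prev = g
        if rng.random() < p:                         # refresh snapshot with probability p
            w, Gw = x.copy(), G_full(x)
    return x

x_out = vr_optimistic_gradient(np.ones(d))
print("residual norm ||G(x)|| =", np.linalg.norm(G_full(x_out)))
```

Swapping the estimator line for a SAGA-style table of stored component values, or a loopless-SARAH recursion, would give the other estimator variants mentioned in the abstract.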
Related papers
- From Inexact Gradients to Byzantine Robustness: Acceleration and Optimization under Similarity [12.097833603814252]
We show that Byzantine-robust distributed optimization can be cast as a general optimization with inexact gradient oracles. We propose two optimization schemes to speed up the convergence.
arXiv Detail & Related papers (2026-02-03T09:56:23Z) - Closing the Approximation Gap of Partial AUC Optimization: A Tale of Two Formulations [121.39938773554523]
The Area Under the ROC Curve (AUC) is a pivotal evaluation metric in real-world scenarios with both class imbalance and decision constraints. We present two simple instance-wise minimax reformulations to close the approximation gap of PAUC optimization. The resulting algorithms enjoy a linear per-iteration computational complexity w.r.t. the sample size and a convergence rate of $O(-2/3)$ for typical one-way and two-way PAUCs.
arXiv Detail & Related papers (2025-12-01T02:52:33Z) - Revisiting Convergence: Shuffling Complexity Beyond Lipschitz Smoothness [50.78508362183774]
Shuffling-type gradient methods are favored in practice for their simplicity and rapid empirical performance. Most existing analyses require the Lipschitz smoothness condition, which is often not met in common machine learning schemes.
arXiv Detail & Related papers (2025-07-11T15:36:48Z) - Variance-Reduced Fast Operator Splitting Methods for Generalized Equations [8.0153031008486]
We develop two variance-reduced fast operator splitting methods to approximate solutions of a class of generalized equations. Our approach integrates recent advances in accelerated operator splitting and fixed-point methods, co-hypomonotonicity, and variance reduction.
arXiv Detail & Related papers (2025-04-17T16:02:20Z) - Achieving $\widetilde{\mathcal{O}}(\sqrt{T})$ Regret in Average-Reward POMDPs with Known Observation Models [56.92178753201331]
We tackle average-reward infinite-horizon POMDPs with an unknown transition model. We present a novel and simple estimator that overcomes this barrier.
arXiv Detail & Related papers (2025-01-30T22:29:41Z) - Accelerated Extragradient-Type Methods -- Part 2: Generalization and Sublinear Convergence Rates under Co-Hypomonotonicity [6.78476672849813]
This paper studies two types of extragradient-based methods: anchored extragradient and Nesterov's accelerated extragradient. We unify and generalize a class of anchored extragradient methods for monotone inclusions to a wider range of schemes. We propose another novel class of Nesterov's accelerated extragradient methods to solve inclusions.
arXiv Detail & Related papers (2025-01-08T16:06:15Z) - A Unified Theory of Stochastic Proximal Point Methods without Smoothness [52.30944052987393]
Proximal point methods have attracted considerable interest owing to their numerical stability and robustness against imperfect tuning.
This paper presents a comprehensive analysis of a broad range of variations of the stochastic proximal point method (SPPM).
arXiv Detail & Related papers (2024-05-24T21:09:19Z) - PROMISE: Preconditioned Stochastic Optimization Methods by Incorporating Scalable Curvature Estimates [17.777466668123886]
We introduce PROMISE (Preconditioned Optimization Methods by Incorporating Scalable Curvature Estimates), a suite of sketching-based preconditioned gradient algorithms.
PROMISE includes preconditioned versions of SVRG, SAGA, and Katyusha.
arXiv Detail & Related papers (2023-09-05T07:49:10Z) - A Primal-Dual Approach to Solving Variational Inequalities with General Constraints [54.62996442406718]
Yang et al. (2023) recently showed how to use first-order gradient methods to solve general variational inequalities.
We prove the convergence of this method and show that the gap function of the last iterate of the method decreases at a rate of $O(1/\sqrt{K})$ when the operator is $L$-Lipschitz and monotone.
arXiv Detail & Related papers (2022-10-27T17:59:09Z) - Faster One-Sample Stochastic Conditional Gradient Method for Composite Convex Minimization [61.26619639722804]
We propose a conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms.
The proposed method, equipped with a stochastic average gradient (SAG) estimator, requires only one sample per iteration. Nevertheless, it guarantees fast convergence rates on par with more sophisticated variance reduction techniques.
arXiv Detail & Related papers (2022-02-26T19:10:48Z) - On the Convergence of Stochastic Extragradient for Bilinear Games with
Restarted Iteration Averaging [96.13485146617322]
We present an analysis of the stochastic ExtraGradient (SEG) method with constant step size, and propose variations of the method that yield favorable convergence.
We prove that when augmented with iteration averaging, SEG provably converges to the Nash equilibrium, and that this convergence is provably accelerated by incorporating a scheduled restarting procedure (a minimal sketch of the averaging-and-restart scheme appears after this list).
arXiv Detail & Related papers (2021-06-30T17:51:36Z) - Linearly Converging Error Compensated SGD [11.436753102510647]
We propose a unified analysis of variants of distributed SGD with arbitrary compressions and delayed updates.
Our framework is general enough to cover different variants of quantized SGD, error-compensated SGD, and SGD with delayed updates.
We develop new variants of SGD that combine variance reduction or arbitrary sampling with error feedback and quantization (a minimal error-feedback sketch also appears after this list).
arXiv Detail & Related papers (2020-10-23T10:46:31Z)
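For the entry "On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging" above, the following is a minimal sketch of the mechanism its summary describes: stochastic extragradient steps on a toy bilinear game, a running iterate average, and scheduled restarts from that average. The game, noise model, step size, and restart period are illustrative assumptions, not the paper's analyzed schedule.

```python
import numpy as np

# Illustrative sketch: stochastic extragradient (SEG) with a constant step size,
# a running iterate average, and scheduled restarts from that average, on a toy
# bilinear game min_x max_y x^T B y (equilibrium at the origin). The noise model,
# step size, and restart period are assumptions, not the paper's schedule.

rng = np.random.default_rng(2)
d = 5
B = 2.0 * np.eye(d) + 0.5 * rng.standard_normal((d, d))   # coupling matrix of the game

def F(z, sigma=0.5):
    """Noisy game operator F(x, y) = (B y, -B^T x) + Gaussian noise."""
    x, y = z[:d], z[d:]
    return np.concatenate([B @ y, -B.T @ x]) + sigma * rng.standard_normal(2 * d)

z = np.concatenate([np.ones(d), -np.ones(d)])     # initial point (x0, y0)
eta, T, restart_every = 0.05, 4000, 500
z_avg, count = np.zeros(2 * d), 0

for t in range(1, T + 1):
    z_half = z - eta * F(z)                       # extrapolation (lookahead) step
    z = z - eta * F(z_half)                       # update using the lookahead gradient
    count += 1
    z_avg += (z - z_avg) / count                  # running average of the iterates
    if t % restart_every == 0:                    # scheduled restart from the average
        z, z_avg, count = z_avg.copy(), np.zeros(2 * d), 0

print("distance of final restart point to the equilibrium:", np.linalg.norm(z))
```

This mirrors the entry's claim that it is the averaged iterate, rather than the last iterate, that converges to the equilibrium, with the scheduled restarts used to speed that convergence up.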
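For the entry "Linearly Converging Error Compensated SGD", this is a minimal single-worker sketch of the error-feedback (error-compensation) mechanism with a top-$k$ compressor on a toy least-squares problem; the compressor, step size, and problem are illustrative assumptions, while the paper analyzes a general distributed family with quantization, delayed updates, and variance reduction.

```python
import numpy as np

# Illustrative single-worker sketch of error-compensated (error-feedback) SGD with
# a top-k compressor on a toy least-squares problem. The compressor, step size, and
# problem are assumptions; the paper covers a general distributed family with
# quantization, delayed updates, and variance reduction.

rng = np.random.default_rng(3)
n, d, k = 500, 20, 2
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.01 * rng.standard_normal(n)

def top_k(v, k):
    """Keep the k largest-magnitude coordinates of v, zero out the rest."""
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-k:]
    out[keep] = v[keep]
    return out

w = np.zeros(d)
e = np.zeros(d)                                   # error memory: what compression dropped
eta, batch = 0.05, 32

for _ in range(5000):
    idx = rng.choice(n, size=batch, replace=False)
    grad = X[idx].T @ (X[idx] @ w - y[idx]) / batch   # mini-batch least-squares gradient
    p = eta * grad + e                            # add back the previously dropped error
    msg = top_k(p, k)                             # transmit only a compressed update
    e = p - msg                                   # store the new compression residual
    w -= msg                                      # apply the compressed update

print("parameter error ||w - w_true|| =", np.linalg.norm(w - w_true))
```

In a distributed run, each worker would keep its own error vector and only the compressed messages would cross the network; the framework in the entry additionally layers variance reduction or arbitrary sampling on top of this loop.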