Last-Iterate Convergence of Adaptive Riemannian Gradient Descent for Equilibrium Computation
- URL: http://arxiv.org/abs/2306.16617v2
- Date: Mon, 10 Nov 2025 01:07:30 GMT
- Title: Last-Iterate Convergence of Adaptive Riemannian Gradient Descent for Equilibrium Computation
- Authors: Yang Cai, Michael I. Jordan, Tianyi Lin, Argyris Oikonomou, Emmanouil-Vasileios Vlatakis-Gkaragkounis,
- Abstract summary: This paper establishes new convergence results for \textit{geodesic strongly monotone} games. Our key result shows that RGD attains last-iterate linear convergence in a \textit{geometry-agnostic} fashion. Overall, this paper presents the first geometry-agnostic last-iterate convergence analysis for games beyond the Euclidean settings.
- Score: 52.73824786627612
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Equilibrium computation on Riemannian manifolds provides a unifying framework for numerous problems in machine learning and data analytics. One of the simplest yet most fundamental methods is Riemannian gradient descent (RGD). While its Euclidean counterpart has been extensively studied, it remains unclear how the manifold curvature affects RGD in game-theoretic settings. This paper addresses this gap by establishing new convergence results for \textit{geodesic strongly monotone} games. Our key result shows that RGD attains last-iterate linear convergence in a \textit{geometry-agnostic} fashion, a key property for applications in machine learning. We extend this guarantee to stochastic and adaptive variants -- SRGD and FARGD -- and establish that: (i) the sample complexity of SRGD is geometry-agnostic and optimal with respect to noise; (ii) FARGD matches the convergence rate of its non-adaptive counterpart up to constant factors, while avoiding reliance on the condition number. Overall, this paper presents the first geometry-agnostic last-iterate convergence analysis for games beyond the Euclidean settings, underscoring the surprising power of RGD -- despite its simplicity -- in solving a wide spectrum of machine learning problems.
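Since the central object of the paper is the plain RGD update x_{k+1} = Exp_{x_k}(-eta * grad f(x_k)), a minimal sketch of that update on the unit sphere is given below, applied to a toy Rayleigh-quotient objective. Everything in the sketch (the sphere as the manifold, the step size, the toy objective, and the function names) is an illustrative assumption rather than a detail from the paper; in the game setting, each player would apply this update to its own loss.

```python
import numpy as np

def riemannian_grad_sphere(x, euclid_grad):
    """Riemannian gradient on the unit sphere: project the Euclidean
    gradient onto the tangent space at x."""
    return euclid_grad - np.dot(x, euclid_grad) * x

def exp_map_sphere(x, v):
    """Exponential map on the unit sphere: follow the geodesic from x
    in tangent direction v for arclength ||v||."""
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return x
    return np.cos(nv) * x + np.sin(nv) * (v / nv)

def rgd(euclid_grad_fn, x0, step=0.01, n_iters=5000):
    """Plain Riemannian gradient descent: x_{k+1} = Exp_{x_k}(-step * grad f(x_k))."""
    x = x0 / np.linalg.norm(x0)
    for _ in range(n_iters):
        g = riemannian_grad_sphere(x, euclid_grad_fn(x))
        x = exp_map_sphere(x, -step * g)
    return x

# Toy objective f(x) = x^T A x on the sphere; its minimizer is the eigenvector
# of A associated with the smallest eigenvalue.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
A = A @ A.T
x_min = rgd(lambda x: 2.0 * A @ x, rng.standard_normal(5))
print("f(x_min) =", x_min @ A @ x_min, "| smallest eigenvalue =", np.linalg.eigvalsh(A)[0])
```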
Related papers
- Guaranteed Noisy CP Tensor Recovery via Riemannian Optimization on the Segre Manifold [9.804487437104289]
We exploit the intrinsic geometry of rank-one tensors by casting the recovery task as an optimization problem over the Segre manifold. We prove that RGD converges at a local linear rate, while RGN exhibits an initial local quadratic convergence phase that transitions to a linear rate as the iterates approach the statistical noise floor.
arXiv Detail & Related papers (2025-10-01T06:44:52Z) - Euclidean Distance Matrix Completion via Asymmetric Projected Gradient Descent [13.27202712518471]
This paper proposes and analyzes a gradient-type algorithm based on Burer-Monteiro factorization, called Asymmetric Projected Gradient Descent (APGD).
arXiv Detail & Related papers (2025-04-28T07:13:23Z) - Riemannian Federated Learning via Averaging Gradient Stream [8.75592575216789]
This paper develops and analyzes an efficient Riemannian Federated Averaging Gradient Stream (RFedAGS) algorithm.
Numerical simulations conducted on synthetic and real-world data demonstrate the performance of the proposed RFedAGS.
arXiv Detail & Related papers (2024-09-11T12:28:42Z) - Kernel-Based Differentiable Learning of Non-Parametric Directed Acyclic Graphical Models [17.52142371968811]
Causal discovery amounts to learning a directed acyclic graph (DAG) that encodes a causal model.
Recent research has sought to bypass the search by reformulating causal discovery as a continuous optimization problem.
arXiv Detail & Related papers (2024-08-20T16:09:40Z) - Convergence and Complexity Guarantee for Inexact First-order Riemannian Optimization Algorithms [18.425648833592312]
We show that tBMM converges to an $\epsilon$-stationary point within $O(\epsilon^{-2})$ iterations. Under a mild assumption, the results still hold when the subproblem is solved inexactly in each iteration, provided the total optimality gap is bounded.
arXiv Detail & Related papers (2024-05-05T22:53:14Z) - Intrinsic Bayesian Cramér-Rao Bound with an Application to Covariance Matrix Estimation [49.67011673289242]
This paper presents a new performance bound for estimation problems where the parameter to estimate lies in a smooth manifold.
It induces a geometry for the parameter manifold, as well as an intrinsic notion of estimation error.
arXiv Detail & Related papers (2023-11-08T15:17:13Z) - Riemannian stochastic optimization methods avoid strict saddle points [68.80251170757647]
We show that the Riemannian methods under study avoid strict saddle points / submanifolds with probability 1.
This result provides an important sanity check as it shows that, almost always, the limit state of an algorithm can only be a local minimizer.
arXiv Detail & Related papers (2023-11-04T11:12:24Z) - CoLiDE: Concomitant Linear DAG Estimation [12.415463205960156]
We deal with the problem of learning a directed acyclic graph (DAG) structure from observational data generated by a linear structural equation model.
We propose a new convex score function for sparsity-aware learning of DAGs.
arXiv Detail & Related papers (2023-10-04T15:32:27Z) - The Dynamics of Riemannian Robbins-Monro Algorithms [101.29301565229265]
We propose a family of Riemannian algorithms generalizing and extending the seminal approximation framework of Robbins and Monro.
Compared to their Euclidean counterparts, Riemannian algorithms are much less understood due to the lack of a global linear structure on the manifold.
We provide a general template of almost sure convergence results that mirrors and extends the existing theory for Euclidean Robbins-Monro schemes.
arXiv Detail & Related papers (2022-06-14T12:30:11Z) - First-Order Algorithms for Min-Max Optimization in Geodesic Metric Spaces [93.35384756718868]
Min-max algorithms have been analyzed in the Euclidean setting.
We prove that the Riemannian corrected extragradient (RCEG) method achieves last-iterate convergence at a linear rate.
arXiv Detail & Related papers (2022-06-04T18:53:44Z) - Differentially private Riemannian optimization [40.23168342389821]
We introduce a framework for differentially private Riemannian optimization, addressing the empirical risk problem where the parameter is constrained to a Riemannian manifold.
We show the efficacy of the proposed framework in several applications.
arXiv Detail & Related papers (2022-05-19T12:04:15Z) - Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the Stochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
arXiv Detail & Related papers (2021-12-29T18:46:52Z) - Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity [49.66890309455787]
We introduce the expected co-coercivity condition, explain its benefits, and provide the first last-iterate convergence guarantees of SGDA and SCO.
We prove linear convergence of both methods to a neighborhood of the solution when they use constant step-size.
Our convergence guarantees hold under the arbitrary sampling paradigm, and we give insights into the complexity of minibatching.
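To make the "convergence to a neighborhood under a constant step size" statement concrete, here is a minimal SGDA sketch on a toy strongly-convex-strongly-concave quadratic saddle problem with simulated gradient noise. The objective, noise model, step size, and dimensions are assumptions of this example, not details from the paper.

```python
import numpy as np

# Toy saddle problem f(x, y) = (mu/2)||x||^2 + x^T B y - (mu/2)||y||^2,
# which is strongly-convex-strongly-concave with unique saddle point (0, 0).
rng = np.random.default_rng(1)
d, mu, sigma, step = 4, 1.0, 0.1, 0.02
B = rng.standard_normal((d, d))
x, y = rng.standard_normal(d), rng.standard_normal(d)

for _ in range(5000):
    gx = mu * x + B @ y + sigma * rng.standard_normal(d)    # noisy grad_x f
    gy = B.T @ x - mu * y + sigma * rng.standard_normal(d)  # noisy grad_y f
    x, y = x - step * gx, y + step * gy                     # descent in x, ascent in y

# With a constant step size the last iterate settles in a noise-dependent
# neighborhood of the saddle point rather than converging exactly.
print("distance to saddle point:", np.linalg.norm(np.concatenate([x, y])))
```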
arXiv Detail & Related papers (2021-06-30T18:32:46Z) - On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging [96.13485146617322]
We present an analysis of the Stochastic ExtraGradient (SEG) method with constant step size, along with variations of the method that yield favorable convergence.
We prove that when augmented with averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure.
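A minimal sketch of stochastic extragradient with iterate averaging on a bilinear game follows, to illustrate the behavior described above. The restarting schedule from the paper is omitted, and the game, noise level, and step size are illustrative assumptions.

```python
import numpy as np

# Bilinear game f(x, y) = x^T B y with unique equilibrium at the origin;
# additive Gaussian noise stands in for the stochastic gradient oracle.
rng = np.random.default_rng(2)
d, sigma, step, T = 4, 0.1, 0.1, 20000
B = rng.standard_normal((d, d))
x, y = rng.standard_normal(d), rng.standard_normal(d)
x_avg, y_avg = np.zeros(d), np.zeros(d)

for t in range(1, T + 1):
    # Extrapolation (look-ahead) step with one noisy gradient sample.
    gx = B @ y + sigma * rng.standard_normal(d)
    gy = B.T @ x + sigma * rng.standard_normal(d)
    x_half, y_half = x - step * gx, y + step * gy
    # Update step evaluated at the extrapolated point, with a fresh sample.
    gx = B @ y_half + sigma * rng.standard_normal(d)
    gy = B.T @ x_half + sigma * rng.standard_normal(d)
    x, y = x - step * gx, y + step * gy
    # Running average of the iterates; averaging damps the gradient noise.
    x_avg += (x - x_avg) / t
    y_avg += (y - y_avg) / t

print("last iterate    :", np.linalg.norm(np.concatenate([x, y])))
print("averaged iterate:", np.linalg.norm(np.concatenate([x_avg, y_avg])))
```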
arXiv Detail & Related papers (2021-06-30T17:51:36Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central to preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - Stochastic Zeroth-order Riemannian Derivative Estimation and Optimization [15.78743548731191]
We propose an oracle version of the Gaussian smoothing function to overcome the difficulty posed by the non-linearity of the manifold.
We demonstrate the applicability of our algorithms through numerical results and real-world applications, including black-box stiffness control for robotics and black-box attacks on neural networks.
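As a rough illustration of zeroth-order derivative estimation, the sketch below builds a Gaussian-smoothing gradient estimate on the unit sphere using only function evaluations. It is not the paper's exact oracle construction: the sphere, the retraction, the sample size, and the smoothing parameter are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(3)

def zo_grad_sphere(f, x, mu=1e-3, n_samples=2000):
    """Zeroth-order estimate of the Riemannian gradient of f at a point x on the
    unit sphere, via Gaussian smoothing restricted to the tangent space (a crude
    sketch, not the paper's oracle construction)."""
    d = x.shape[0]
    fx = f(x)
    g = np.zeros(d)
    for _ in range(n_samples):
        u = rng.standard_normal(d)
        u -= np.dot(x, u) * x             # project the noise onto the tangent space
        x_pert = x + mu * u
        x_pert /= np.linalg.norm(x_pert)  # retract the perturbed point back to the sphere
        g += (f(x_pert) - fx) / mu * u    # forward finite difference along u
    return g / n_samples

# Sanity check against the analytic Riemannian gradient of f(x) = x^T A x.
d = 5
A = rng.standard_normal((d, d))
A = A @ A.T
x = rng.standard_normal(d)
x /= np.linalg.norm(x)
g_est = zo_grad_sphere(lambda z: z @ A @ z, x)
g_true = 2 * A @ x - 2 * (x @ A @ x) * x
print("relative error:", np.linalg.norm(g_est - g_true) / np.linalg.norm(g_true))
```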
arXiv Detail & Related papers (2020-03-25T06:58:19Z) - From Nesterov's Estimate Sequence to Riemannian Acceleration [52.99237600875452]
We develop an alternative analysis for Nesterov's estimate sequence technique that may also be of independent interest.
Then, we extend this analysis to the Riemannian setting, localizing the key difficulty due to the non-Euclidean structure into a certain "metric distortion" term.
arXiv Detail & Related papers (2020-01-24T04:17:14Z)