Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games
- URL: http://arxiv.org/abs/2311.00676v2
- Date: Tue, 04 Mar 2025 18:13:47 GMT
- Title: Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games
- Authors: Yang Cai, Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, Haipeng Luo, Weiqiang Zheng,
- Abstract summary: We study last-iterate convergence properties of algorithms for solving two-player zero-sum games based on Regret$+$ (RM$+$)<n>Our analysis builds on a new characterization of the geometric structure of the limit points of our algorithms, marking a significant departure from most of the literature on last-iterate convergence.
- Score: 71.73971094342349
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study last-iterate convergence properties of algorithms for solving two-player zero-sum games based on Regret Matching$^+$ (RM$^+$). Despite their widespread use for solving real games, virtually nothing is known about their last-iterate convergence. A major obstacle to analyzing RM-type dynamics is that their regret operators lack Lipschitzness and (pseudo)monotonicity. We start by showing numerically that several variants used in practice, such as RM$^+$, predictive RM$^+$ and alternating RM$^+$, all lack last-iterate convergence guarantees even on a simple $3\times 3$ matrix game. We then prove that recent variants of these algorithms based on a smoothing technique, extragradient RM$^{+}$ and smooth Predictive RM$^+$, enjoy asymptotic last-iterate convergence (without a rate), $1/\sqrt{t}$ best-iterate convergence, and when combined with restarting, linear-rate last-iterate convergence. Our analysis builds on a new characterization of the geometric structure of the limit points of our algorithms, marking a significant departure from most of the literature on last-iterate convergence. We believe that our analysis may be of independent interest and offers a fresh perspective for studying last-iterate convergence in algorithms based on non-monotone operators.
Related papers
- On Separation Between Best-Iterate, Random-Iterate, and Last-Iterate Convergence of Learning in Games [71.73971094342349]
Non-ergodic convergence of learning dynamics in games is widely studied because of its importance in both theory and practice.
Recent work showed that a broad class of learning dynamics, including Optimistic Multiplicative Weights Update, can exhibit arbitrarily slow last-iterate convergence.
We show that OMWU achieves an $O(T-1/6)$ best-iterate convergence rate, in stark contrast to its slow last-iterate convergence in the same class of games.
arXiv Detail & Related papers (2025-03-04T17:49:24Z) - Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms [71.73971094342349]
Self-play via online learning is one of the premier ways to solve large-scale two-player zero-sum games.
We show that OMWU offers several advantages including logarithmic dependence on the size of the payoff matrix.
We prove that a broad class of algorithms that do not forget the past quickly all suffer the same issue.
arXiv Detail & Related papers (2024-06-15T13:26:17Z) - Revisiting the Last-Iterate Convergence of Stochastic Gradient Methods [25.831462008050387]
The Gradient Descent (SGD) algorithm has triggered people's interest due to its good performance in practice but lack of theoretical understanding.
It still remains unclear whether the lastiterate convergence can be provably extended to wider composite optimization and non-Euclidean norms.
arXiv Detail & Related papers (2023-12-13T21:41:06Z) - Adaptive, Doubly Optimal No-Regret Learning in Strongly Monotone and Exp-Concave Games with Gradient Feedback [75.29048190099523]
Online gradient descent (OGD) is well known to be doubly optimal under strong convexity or monotonicity assumptions.
In this paper, we design a fully adaptive OGD algorithm, textsfAdaOGD, that does not require a priori knowledge of these parameters.
arXiv Detail & Related papers (2023-10-21T18:38:13Z) - Regret Matching+: (In)Stability and Fast Convergence in Games [68.13214224119024]
We show that RM+ and its predictive version can be unstable, which might cause other players to suffer large regret.
We show that these fixes are sufficient to get $O(T1/4)$ individual regret and $O(1)$ social regret in normal-form games via RM+ with predictions.
arXiv Detail & Related papers (2023-05-24T04:26:21Z) - Doubly Optimal No-Regret Learning in Monotone Games [10.760195409078591]
We propose the first doubly optimal no-regret learning algorithm for smooth monotone games.
Our algorithm achieves both (i) the optimal $O(sqrtT)$ regret in the adversarial setting under smooth and convex loss functions and (ii) the optimal $O(frac1T)$ last-iterate convergence rate to a Nash equilibrium.
arXiv Detail & Related papers (2023-01-30T17:55:53Z) - Policy Optimization for Markov Games: Unified Framework and Faster
Convergence [81.3266426402464]
We show that the state-wise average policy of this algorithm converges to an approximate Nash equilibrium (NE) of the game.
We extend this algorithm to multi-player general Markov Games and show a $mathcalwidetildeO(T-1/2)$ convergence rate to Correlated Equilibria (CCE)
arXiv Detail & Related papers (2022-06-06T14:23:13Z) - Near-Optimal No-Regret Learning for Correlated Equilibria in
Multi-Player General-Sum Games [104.74734408204749]
We show that if all agents in a multi-player general-sum normal-form game employ Optimistic Multiplicative Weights Update (OMWU), the external regret of every player is $O(textrmpolylog(T))$ after $T$ repetitions of the game.
We extend their result from external regret to internal regret and swap regret, thereby establishing uncoupled learning dynamics that converge to an approximate correlated equilibrium.
arXiv Detail & Related papers (2021-11-11T01:19:53Z) - Last-iterate Convergence in Extensive-Form Games [49.31256241275577]
We study last-iterate convergence of optimistic algorithms in sequential games.
We show that all of these algorithms enjoy last-iterate convergence, with some of them even converging exponentially fast.
arXiv Detail & Related papers (2021-06-27T22:02:26Z) - On the Last Iterate Convergence of Momentum Methods [32.60494006038902]
We prove for the first time that for any constant momentum factor, there exists a Lipschitz and convex function for which the last iterate suffers from an error.
We show that in the setting with convex and smooth functions, our new SGDM algorithm automatically converges at a rate of $O(fraclog TT)$.
arXiv Detail & Related papers (2021-02-13T21:16:16Z) - Tight last-iterate convergence rates for no-regret learning in
multi-player games [31.604950465209335]
We show that the optimistic gradient (OG) algorithm with a constant step-size, which is no-regret, achieves a last-iterate rate of $O (1/sqrtT)$ in smooth monotone games.
We also show that the $O (1/sqrtT)$ rate is tight for all $p$-SCLI algorithms, which includes OG as a special case.
arXiv Detail & Related papers (2020-10-26T17:06:19Z) - A Simple Convergence Proof of Adam and Adagrad [74.24716715922759]
We show a proof of convergence between the Adam Adagrad and $O(d(N)/st)$ algorithms.
Adam converges with the same convergence $O(d(N)/st)$ when used with the default parameters.
arXiv Detail & Related papers (2020-03-05T01:56:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.