Convergence of Regret Matching in Potential Games and Constrained Optimization
- URL: http://arxiv.org/abs/2510.17067v1
- Date: Mon, 20 Oct 2025 00:45:47 GMT
- Title: Convergence of Regret Matching in Potential Games and Constrained Optimization
- Authors: Ioannis Anagnostides, Emanuel Tewolde, Brian Hu Zhang, Ioannis Panageas, Vincent Conitzer, Tuomas Sandholm,
- Abstract summary: We show that alternating RM$^+$ converges to an $\epsilon$-KKT point after $O_\epsilon(1/\epsilon^4)$ iterations, establishing for the first time that it is a sound and fast first-order optimizer. Our lower bound shows that convergence to coarse correlated equilibria in potential games is exponentially faster than convergence to Nash equilibria.
- Score: 85.55969013318627
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Regret matching (RM) -- and its modern variants -- is a foundational online algorithm that has been at the heart of many AI breakthrough results in solving benchmark zero-sum games, such as poker. Yet, surprisingly little is known so far in theory about its convergence beyond two-player zero-sum games. For example, whether regret matching converges to Nash equilibria in potential games has been an open problem for two decades. Even beyond games, one could try to use RM variants for general constrained optimization problems. Recent empirical evidence suggests that they -- particularly regret matching$^+$ (RM$^+$) -- attain strong performance on benchmark constrained optimization problems, outperforming traditional gradient descent-type algorithms. We show that alternating RM$^+$ converges to an $\epsilon$-KKT point after $O_\epsilon(1/\epsilon^4)$ iterations, establishing for the first time that it is a sound and fast first-order optimizer. Our argument relates the KKT gap to the accumulated regret, two quantities that are entirely disparate in general but interact in an intriguing way in our setting, so much so that when regrets are bounded, our complexity bound improves all the way to $O_\epsilon(1/\epsilon^2)$. From a technical standpoint, while RM$^+$ does not have the usual one-step improvement property in general, we show that it does in a certain region that the algorithm will quickly reach and remain in thereafter. In sharp contrast, our second main result establishes a lower bound: RM, with or without alternation, can take an exponential number of iterations to reach a crude approximate solution even in two-player potential games. This represents the first worst-case separation between RM and RM$^+$. Our lower bound shows that convergence to coarse correlated equilibria in potential games is exponentially faster than convergence to Nash equilibria.
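To make the abstract's setting concrete, here is a minimal sketch of the standard RM$^+$ update used as a first-order optimizer over the probability simplex. This is an illustrative toy, not the paper's alternating scheme or its analysis: the function names, the quadratic toy objective, and the iteration budget are all our own choices. The update clips cumulative regrets at zero and plays the strategy proportional to the positive regrets, with the negative gradient serving as the payoff vector.

```python
import numpy as np

def rm_plus_simplex(grad, n, iters=50000):
    """Sketch of Regret Matching+ for minimizing a smooth convex f over the
    probability simplex. `grad(x)` returns the gradient of f at x; the
    'payoff' of coordinate i is the negative partial derivative."""
    R = np.zeros(n)            # clipped cumulative regrets
    x = np.full(n, 1.0 / n)    # current strategy (a simplex point)
    avg = np.zeros(n)
    for _ in range(iters):
        u = -grad(x)                         # payoffs = negative gradient
        R = np.maximum(0.0, R + u - u @ x)   # RM+ update: clip regrets at 0
        s = R.sum()
        x = R / s if s > 0 else np.full(n, 1.0 / n)
        avg += x
    return avg / iters                       # average iterate

# Toy problem: minimize ||x - c||^2 over the simplex, with c in the simplex,
# so the minimizer is c itself.
c = np.array([0.5, 0.3, 0.2])
x = rm_plus_simplex(lambda x: 2.0 * (x - c), 3)
```

Since RM$^+$ is a no-regret algorithm and the losses here are gradients of a convex function, the standard online-to-batch argument makes the average iterate approximately optimal; the paper's results concern the harder question of last-iterate convergence to KKT points in general constrained problems.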
Related papers
- Scale-Invariant Regret Matching and Online Learning with Optimal Convergence: Bridging Theory and Practice in Zero-Sum Games [60.871651115241406]
A considerable chasm has been looming for decades between theory and practice in zero-sum game solving through first-order methods. We propose a new scale-invariant and parameter-free variant of PRM$^+$, which we call IREG-PRM$^+$. We show that it achieves $T^{-1/2}$ best-iterate and $T^{-1}$ optimal convergence guarantees, while also being on par with PRM$^+$ on benchmark games.
arXiv Detail & Related papers (2025-10-06T00:33:20Z) - Instance-Dependent Regret Bounds for Learning Two-Player Zero-Sum Games with Bandit Feedback [60.610120215789976]
We show that when a pure strategy Nash equilibrium exists, $c$ becomes zero, leading to an optimal instance-dependent regret bound. Our algorithm also enjoys last-iterate convergence and can identify the pure strategy Nash equilibrium with near-optimal sample complexity.
arXiv Detail & Related papers (2025-02-24T20:20:06Z) - Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms [71.73971094342349]
Self-play via online learning is one of the premier ways to solve large-scale two-player zero-sum games. We show that OMWU offers several advantages including logarithmic dependence on the size of the payoff matrix. We prove that a broad class of algorithms that do not forget the past quickly all suffer the same issue.
arXiv Detail & Related papers (2024-06-15T13:26:17Z) - Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games [71.73971094342349]
We study last-iterate convergence properties of algorithms for solving two-player zero-sum games based on Regret Matching$^+$ (RM$^+$). Our analysis builds on a new characterization of the geometric structure of the limit points of our algorithms, marking a significant departure from most of the literature on last-iterate convergence.
arXiv Detail & Related papers (2023-11-01T17:34:58Z) - Regret Matching+: (In)Stability and Fast Convergence in Games [68.13214224119024]
We show that RM$^+$ and its predictive version can be unstable, which might cause other players to suffer large regret.
We show that these fixes are sufficient to get $O(T^{1/4})$ individual regret and $O(1)$ social regret in normal-form games via RM$^+$ with predictions.
arXiv Detail & Related papers (2023-05-24T04:26:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.