Regret Matching+: (In)Stability and Fast Convergence in Games
- URL: http://arxiv.org/abs/2305.14709v1
- Date: Wed, 24 May 2023 04:26:21 GMT
- Title: Regret Matching+: (In)Stability and Fast Convergence in Games
- Authors: Gabriele Farina, Julien Grand-Clément, Christian Kroer, Chung-Wei Lee, and Haipeng Luo
- Abstract summary: We show that RM+ and its predictive version can be unstable, which might cause other players to suffer large regret.
We show that these fixes are sufficient to get $O(T^{1/4})$ individual regret and $O(1)$ social regret in normal-form games via RM+ with predictions.
- Score: 68.13214224119024
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Regret Matching+ (RM+) and its variants are important algorithms for solving
large-scale games. However, a theoretical understanding of their success in
practice is still a mystery. Moreover, recent advances on fast convergence in
games are limited to no-regret algorithms such as online mirror descent, which
satisfy stability. In this paper, we first give counterexamples showing that
RM+ and its predictive version can be unstable, which might cause other players
to suffer large regret. We then provide two fixes: restarting and chopping off
the positive orthant that RM+ works in. We show that these fixes are sufficient
to get $O(T^{1/4})$ individual regret and $O(1)$ social regret in normal-form
games via RM+ with predictions. We also apply our stabilizing techniques to
clairvoyant updates in the uncoupled learning setting for RM+ and prove
desirable results akin to recent works for Clairvoyant online mirror descent.
Our experiments show the advantages of our algorithms over vanilla RM+-based
algorithms in matrix and extensive-form games.
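For reference, below is a minimal sketch of the vanilla RM+ update that the fixes above stabilize, written as a simultaneous self-play loop on a toy zero-sum game. The game, loop structure, and variable names are illustrative assumptions; the restarting and chopping fixes themselves are described in the paper and are not reproduced here.

```python
import numpy as np

def rm_plus_strategy(Q):
    """Map the nonnegative aggregate-regret vector Q to a strategy:
    proportional to Q, uniform when Q is all zeros."""
    s = Q.sum()
    return Q / s if s > 0 else np.full(len(Q), 1.0 / len(Q))

def rm_plus_step(Q, x, utility):
    """One RM+ update: add the instantaneous regrets, then threshold at
    zero (the projection onto the nonnegative orthant that distinguishes
    RM+ from vanilla regret matching)."""
    instant_regret = utility - x @ utility  # per-action regret vs. the mixture played
    return np.maximum(Q + instant_regret, 0.0)

# Illustrative self-play loop on rock-paper-scissors (assumed example).
A = np.array([[ 0.0, -1.0,  1.0],
              [ 1.0,  0.0, -1.0],
              [-1.0,  1.0,  0.0]])  # payoffs to player 1
Qx, Qy = np.zeros(3), np.zeros(3)
for t in range(1000):
    x, y = rm_plus_strategy(Qx), rm_plus_strategy(Qy)
    Qx = rm_plus_step(Qx, x, A @ y)        # player 1 maximizes A
    Qy = rm_plus_step(Qy, y, -(A.T @ x))   # player 2 maximizes -A
```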
Related papers
- Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms [71.73971094342349]
Self-play via online learning is one of the premier ways to solve large-scale two-player zero-sum games.
We show that OMWU offers several advantages, including logarithmic dependence on the size of the payoff matrix; its update rule is sketched after this entry.
We prove that a broad class of algorithms that fail to forget the past quickly all suffer from the same issue.
arXiv Detail & Related papers (2024-06-15T13:26:17Z)
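As a point of reference for the entry above, here is a minimal sketch of the OMWU update, which counts the latest utility twice as an optimistic prediction of the next round's utility; the step size and the gains convention are illustrative assumptions.

```python
import numpy as np

def omwu_step(x, u_now, u_prev, eta=0.1):
    """One Optimistic MWU (OMWU) step: a multiplicative-weights update on
    2*u_now - u_prev, i.e. the latest utility counted twice minus the
    previous one, with u_now serving as the prediction of the next utility."""
    logits = np.log(x) + eta * (2.0 * u_now - u_prev)
    w = np.exp(logits - logits.max())  # shift for numerical stability
    return w / w.sum()

# Start from the uniform strategy; u_prev is taken to be zero on round one.
x = np.full(3, 1.0 / 3.0)
```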
- Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games [72.43065138581589]
We study the last-iterate convergence properties of various popular variants of RM+.
We show numerically that several practical variants, such as simultaneous RM+, alternating RM+, and predictive RM+, all lack last-iterate convergence guarantees.
We then prove that recent variants of these algorithms based on a smoothing technique do enjoy last-iterate convergence.
arXiv Detail & Related papers (2023-11-01T17:34:58Z)
- Efficient Last-iterate Convergence Algorithms in Solving Games [20.00785679098103]
No-regret algorithms are popular for learning a Nash equilibrium in two-player zero-sum normal-form games (NFGs) and extensive-form games (EFGs).
Recent works propose a Reward Transformation (RT) framework for MWU, which removes the uniqueness condition and achieves competitive performance with OMWU.
We show that the bottleneck of RT-based algorithms is the speed of solving convex-concave optimization problems (SCCPs).
arXiv Detail & Related papers (2023-08-22T07:59:49Z)
- No-Regret Learning in Time-Varying Zero-Sum Games [99.86860277006318]
Learning from repeated play in a fixed zero-sum game is a classic problem in game theory and online learning.
We develop a single parameter-free algorithm that simultaneously enjoys favorable guarantees under three performance measures.
Our algorithm is based on a two-layer structure with a meta-algorithm learning over a group of black-box base-learners satisfying a certain property; a generic sketch of this structure follows after this entry.
arXiv Detail & Related papers (2022-01-30T06:10:04Z)
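As promised above, a generic sketch of such a two-layer structure, with a Hedge-style meta-algorithm aggregating black-box base learners; the interface, step size, and loss convention are assumptions for illustration, not the paper's exact algorithm.

```python
import numpy as np

def meta_combine(weights, base_strategies):
    """Meta layer: play the weighted mixture of the base learners'
    strategies (one strategy per base learner)."""
    return weights @ np.stack(base_strategies)

def meta_update(weights, base_losses, eta=0.1):
    """Hedge-style multiplicative update over the base learners,
    down-weighting base learners that incurred larger loss."""
    w = weights * np.exp(-eta * np.asarray(base_losses))
    return w / w.sum()
```

Each base learner would additionally run its own no-regret update on the losses it observes.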
- Last-iterate Convergence in Extensive-Form Games [49.31256241275577]
We study last-iterate convergence of optimistic algorithms in sequential games.
We show that all of these algorithms enjoy last-iterate convergence, with some of them even converging exponentially fast.
arXiv Detail & Related papers (2021-06-27T22:02:26Z)
- Faster Game Solving via Predictive Blackwell Approachability: Connecting Regret Matching and Mirror Descent [119.5481797273995]
Follow-the-regularized-leader (FTRL) and online mirror descent (OMD) are the most prevalent regret minimizers in online convex optimization.
We show that RM and RM+ are the algorithms that result from running FTRL and OMD, respectively, to select the halfspace to force at all times in the underlying Blackwell approachability game.
In experiments across 18 common zero-sum extensive-form benchmark games, we show that predictive RM+ coupled with counterfactual regret minimization converges vastly faster than the fastest prior algorithms; a sketch of the predictive RM+ update follows after this entry.
arXiv Detail & Related papers (2020-07-28T16:49:55Z)
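As referenced in the entry above, here is a minimal sketch of the predictive RM+ update, using the previous instantaneous regret as the prediction (a standard choice); the function interface and return convention are illustrative assumptions.

```python
import numpy as np

def normalize_or_uniform(v):
    """Normalize a nonnegative vector into a strategy; uniform fallback."""
    s = v.sum()
    return v / s if s > 0 else np.full(len(v), 1.0 / len(v))

def predictive_rm_plus_step(R, prediction, utility):
    """One predictive RM+ step: play proportionally to the thresholded
    aggregate regrets plus a prediction of the coming regret, then fold
    the observed instantaneous regret back into R.  In self-play, the
    strategy x is committed before the utility vector is revealed."""
    x = normalize_or_uniform(np.maximum(R + prediction, 0.0))
    r = utility - x @ utility            # observed instantaneous regrets
    R_next = np.maximum(R + r, 0.0)
    return R_next, r, x                  # r doubles as the next prediction
```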
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.