Last-iterate Convergence in Extensive-Form Games
- URL: http://arxiv.org/abs/2106.14326v1
- Date: Sun, 27 Jun 2021 22:02:26 GMT
- Title: Last-iterate Convergence in Extensive-Form Games
- Authors: Chung-Wei Lee, Christian Kroer, Haipeng Luo
- Abstract summary: We study last-iterate convergence of optimistic algorithms in sequential games.
We show that all of these algorithms enjoy last-iterate convergence, with some of them even converging exponentially fast.
- Score: 49.31256241275577
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Regret-based algorithms are highly efficient at finding approximate Nash
equilibria in sequential games such as poker games. However, most regret-based
algorithms, including counterfactual regret minimization (CFR) and its
variants, rely on iterate averaging to achieve convergence. Inspired by recent
advances on last-iterate convergence of optimistic algorithms in zero-sum
normal-form games, we study this phenomenon in sequential games, and provide a
comprehensive study of last-iterate convergence for zero-sum extensive-form
games with perfect recall (EFGs), using various optimistic regret-minimization
algorithms over treeplexes. This includes algorithms using the vanilla entropy
or squared Euclidean norm regularizers, as well as their dilated versions which
admit more efficient implementation. In contrast to CFR, we show that all of
these algorithms enjoy last-iterate convergence, with some of them even
converging exponentially fast. We also provide experiments to further support
our theoretical results.
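As a rough illustration of the phenomenon in the normal-form special case (not the paper's treeplex algorithms), below is a minimal sketch of optimistic multiplicative weights (OMWU), i.e., optimistic FTRL with the entropy regularizer, in self-play on a zero-sum matrix game; the payoff matrix, step size, and horizon are illustrative choices.

```python
import numpy as np

# Minimal OMWU sketch on matching pennies (a toy stand-in for the EFG
# setting); eta and the horizon are illustrative choices.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])   # x (row) minimizes x^T A y, y maximizes
eta = 0.1
x = np.array([0.8, 0.2])
y = np.array([0.3, 0.7])
gx_prev, gy_prev = A @ y, A.T @ x          # gradients at the previous iterate

for t in range(2000):
    gx, gy = A @ y, A.T @ x                # simultaneous gradient evaluation
    x = x * np.exp(-eta * (2 * gx - gx_prev))  # optimistic step: 2*current - previous
    x /= x.sum()
    y = y * np.exp(eta * (2 * gy - gy_prev))
    y /= y.sum()
    gx_prev, gy_prev = gx, gy

print(x, y)  # the last iterates approach the unique equilibrium (0.5, 0.5)
```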
Related papers
- Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games [72.43065138581589]
We study the last-iterate convergence properties of various popular variants of RM$^+$.
We show numerically that several practical variants such as simultaneous RM$^+$, alternating RM$^+$, and predictive RM$^+$ all lack last-iterate convergence guarantees.
We then prove that recent variants of these algorithms based on a smoothing technique do enjoy last-iterate convergence.
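For intuition, here is a minimal sketch of simultaneous RM$^+$ self-play on matching pennies (a hypothetical toy setup, not the paper's experiments): the uniform average of the iterates converges, while the last iterate need not settle.

```python
import numpy as np

# Simultaneous RM+ self-play on matching pennies (toy illustration).
A = np.array([[1.0, -1.0], [-1.0, 1.0]])   # row maximizes x^T A y, column minimizes

def rm_plus_step(Q, payoff, strat):
    r = payoff - strat @ payoff            # instantaneous regret per action
    Q = np.maximum(Q + r, 0.0)             # RM+ thresholds cumulative regret at 0
    s = Q.sum()
    return Q, (Q / s if s > 0 else np.ones_like(Q) / len(Q))

Qx, Qy = np.zeros(2), np.zeros(2)
x, y = np.ones(2) / 2, np.array([0.2, 0.8])
x_avg, T = np.zeros(2), 10000
for t in range(T):
    x_avg += x
    Qx, x_next = rm_plus_step(Qx, A @ y, x)
    Qy, y_next = rm_plus_step(Qy, -(A.T @ x), y)
    x, y = x_next, y_next

print(x_avg / T)  # the average approaches (0.5, 0.5) ...
print(x)          # ... while the last iterate may keep oscillating,
                  # the failure mode this paper documents numerically
```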
arXiv Detail & Related papers (2023-11-01T17:34:58Z)
- Adaptive, Doubly Optimal No-Regret Learning in Strongly Monotone and Exp-Concave Games with Gradient Feedback [75.29048190099523]
Online gradient descent (OGD) is well known to be doubly optimal under strong convexity or monotonicity assumptions.
In this paper, we design a fully adaptive OGD algorithm, \textsf{AdaOGD}, that does not require a priori knowledge of these parameters.
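For context, here is a sketch of the non-adaptive baseline: OGD with the classical $1/(\mu t)$ step size on a $\mu$-strongly convex loss under noisy gradient feedback. All numbers are illustrative; \textsf{AdaOGD}'s contribution (not shown) is matching this without knowing $\mu$ in advance.

```python
import numpy as np

# Non-adaptive OGD baseline with step size 1/(mu*t) on a mu-strongly
# convex loss with noisy gradient feedback (illustrative numbers only).
rng = np.random.default_rng(1)
mu, x = 2.0, 5.0
for t in range(1, 5001):
    grad = mu * (x - 3.0) + rng.normal(scale=0.1)  # noisy gradient of (mu/2)(x-3)^2
    x -= grad / (mu * t)                           # classical step schedule
print(x)  # -> close to the optimum 3.0
```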
arXiv Detail & Related papers (2023-10-21T18:38:13Z)
- Efficient Last-iterate Convergence Algorithms in Solving Games [20.00785679098103]
No-regret algorithms are popular for learning Nash equilibria in two-player zero-sum normal-form games (NFGs) and extensive-form games (EFGs).
Recent works propose a Reward Transformation (RT) framework for MWU, which removes the uniqueness condition and achieves competitive performance with OMWU.
We show that the bottleneck of RT-based algorithms is the speed of solving convex-concave optimization problems (SCCPs).
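To convey the flavor of reward transformation (an illustrative guess at one instance, not this paper's exact construction), below is vanilla MWU on entropy-perturbed payoffs: the perturbation makes the last iterate settle instead of cycling. The values of tau and eta are hypothetical.

```python
import numpy as np

# Vanilla MWU on entropy-perturbed payoffs (illustrative instance of
# the reward-transformation idea; tau and eta are hypothetical).
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
eta, tau = 0.05, 0.2
x, y = np.array([0.8, 0.2]), np.array([0.3, 0.7])
for t in range(5000):
    gx = A @ y + tau * np.log(x)       # loss gradient of x^T A y + tau*sum(x log x)
    gy = A.T @ x - tau * np.log(y)     # gain gradient of x^T A y - tau*sum(y log y)
    x = x * np.exp(-eta * gx); x /= x.sum()
    y = y * np.exp(eta * gy);  y /= y.sum()
print(x, y)  # last iterate settles near the perturbed equilibrium (0.5, 0.5)
```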
arXiv Detail & Related papers (2023-08-22T07:59:49Z)
- The Power of Regularization in Solving Extensive-Form Games [22.557806157585834]
We propose a series of new algorithms based on regularizing the payoff functions of the game.
In particular, we first show that dilated optimistic mirror descent (DOMD) can achieve a fast $\tilde O(1/T)$ last-iterate convergence.
We also show that Reg-CFR, with a variant of the optimistic mirror descent algorithm as its regret minimizer, can achieve $O(1/T^{1/4})$ best-iterate and $O(1/T^{3/4})$ average-iterate convergence rates.
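A toy unconstrained analogue of the payoff-regularization effect (not DOMD itself, and all constants are illustrative): adding strongly convex/concave terms to a bilinear objective makes even plain simultaneous gradient descent-ascent converge linearly.

```python
import numpy as np

# Regularized saddle point problem
#   min_x max_y  x^T A y + (tau/2)||x||^2 - (tau/2)||y||^2,
# solved with plain simultaneous gradient descent-ascent.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
tau, eta = 0.5, 0.1
x, y = np.array([1.0, 1.0]), np.array([-1.0, 2.0])
for t in range(500):
    gx = A @ y + tau * x          # descent direction for x
    gy = A.T @ x - tau * y        # ascent direction for y
    x, y = x - eta * gx, y + eta * gy
print(x, y)  # -> both approach 0, the regularized saddle point
```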
arXiv Detail & Related papers (2022-06-19T22:10:38Z)
- HessianFR: An Efficient Hessian-based Follow-the-Ridge Algorithm for Minimax Optimization [18.61046317693516]
HessianFR is an efficient Hessian-based Follow-the-Ridge algorithm with theoretical guarantees.
Experiments of training generative adversarial networks (GANs) have been conducted on both synthetic and real-world large-scale image datasets.
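As a rough sketch of the Follow-the-Ridge template that HessianFR builds on, here it is on a toy quadratic minimax problem, with the constant Hessian blocks hard-coded; the actual algorithm's contribution is computing these Hessian corrections efficiently at scale.

```python
import numpy as np

# Follow-the-Ridge on min_x max_y f(x, y) = x^2 - y^2 + 4xy:
#   x <- x - eta * df/dx
#   y <- y + eta * df/dy + eta * H_yy^{-1} H_yx * df/dx   (ridge correction)
def grads(x, y):
    return 2*x + 4*y, -2*y + 4*x     # (df/dx, df/dy)

eta = 0.05
x, y = 1.0, 1.0
H_yy, H_yx = -2.0, 4.0               # constant Hessian blocks of this f
for t in range(500):
    gx, gy = grads(x, y)
    x_new = x - eta * gx
    # follower correction: compensate for the leader's move along the ridge
    y_new = y + eta * gy + eta * (H_yx / H_yy) * gx
    x, y = x_new, y_new
print(x, y)   # -> approaches the minimax point (0, 0)
```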
arXiv Detail & Related papers (2022-05-23T04:28:52Z)
- Faster Game Solving via Predictive Blackwell Approachability: Connecting Regret Matching and Mirror Descent [119.5481797273995]
Follow-the-regularized-leader (FTRL) and online mirror descent (OMD) are the most prevalent regret minimizers in online convex optimization.
We show that RM and RM$^+$ are the algorithms that result from running FTRL and OMD, respectively, to select the halfspace to force at all times in the underlying Blackwell approachability game.
In experiments across 18 common zero-sum extensive-form benchmark games, we show that predictive RM$^+$ coupled with counterfactual regret minimization converges vastly faster than the fastest prior algorithms.
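Below is a minimal sketch of predictive RM$^+$ in one common form, where the prediction is the most recent instantaneous regret vector (a toy 2x2 game, not the paper's EFG experiments).

```python
import numpy as np

# Predictive RM+ (prediction = last instantaneous regret) in self-play
# on a toy 2x2 zero-sum game with unique mixed equilibrium (1/3, 2/3).
A = np.array([[3.0, -1.0], [-1.0, 1.0]])   # row maximizes x^T A y

def normalize(q):
    s = q.sum()
    return q / s if s > 0 else np.ones_like(q) / len(q)

Rx, Ry = np.zeros(2), np.zeros(2)          # thresholded cumulative regrets
mx, my = np.zeros(2), np.zeros(2)          # predictions of the next regret
x_avg, y_avg, T = np.zeros(2), np.zeros(2), 5000
for t in range(T):
    x = normalize(np.maximum(Rx + mx, 0.0))  # act on predicted regret
    y = normalize(np.maximum(Ry + my, 0.0))
    x_avg += x; y_avg += y
    ux, uy = A @ y, -(A.T @ x)
    rx, ry = ux - x @ ux, uy - y @ uy        # realized instantaneous regrets
    Rx = np.maximum(Rx + rx, 0.0)
    Ry = np.maximum(Ry + ry, 0.0)
    mx, my = rx, ry
print(x_avg / T, y_avg / T)  # both averages approach (1/3, 2/3)
```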
arXiv Detail & Related papers (2020-07-28T16:49:55Z)
- Finite-Time Last-Iterate Convergence for Multi-Agent Learning in Games [116.0771177871705]
We characterize the finite-time last-iterate convergence rate for joint OGD learning on $\lambda$-cocoercive games.
We show, via a novel double-stopping-time technique, that this adaptive algorithm achieves the same finite-time last-iterate convergence rate as its non-adaptive counterpart.
arXiv Detail & Related papers (2020-02-23T01:46:34Z)
- Stochastic Regret Minimization in Extensive-Form Games [109.43344748069933]
Monte-Carlo counterfactual regret minimization (MCCFR) is the state-of-the-art algorithm for solving sequential games that are too large for full tree traversals.
We develop a new framework for constructing stochastic regret minimization methods.
We present extensive experiments on three games, where some variants of our methods outperform MCCFR.
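The core stochastic ingredient, sketched abstractly below: an importance-weighted loss estimate that can be fed to any regret minimizer in place of the exact loss a full tree traversal would compute. The function name and numbers are hypothetical.

```python
import numpy as np

# Importance-weighted (outcome-sampling style) unbiased loss estimate:
# sample one action, reweight its loss by the sampling probability.
rng = np.random.default_rng(0)

def sampled_loss_estimate(true_loss, sampling_probs):
    a = rng.choice(len(true_loss), p=sampling_probs)
    est = np.zeros_like(true_loss)
    est[a] = true_loss[a] / sampling_probs[a]   # unbiased: E[est] = true_loss
    return est

# usage: plug `est` into any regret minimizer (e.g., regret matching)
loss = np.array([0.3, 0.7, 0.1])
probs = np.ones(3) / 3
print(sampled_loss_estimate(loss, probs))
```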
arXiv Detail & Related papers (2020-02-19T23:05:41Z)