The Power of Regularization in Solving Extensive-Form Games
- URL: http://arxiv.org/abs/2206.09495v1
- Date: Sun, 19 Jun 2022 22:10:38 GMT
- Title: The Power of Regularization in Solving Extensive-Form Games
- Authors: Mingyang Liu, Asuman Ozdaglar, Tiancheng Yu, Kaiqing Zhang
- Abstract summary: We propose a series of new algorithms based on regularizing the payoff functions of the game.
In particular, we first show that dilated optimistic mirror descent (DOMD) with adaptive regularization can achieve a fast $\tilde O(1/T)$ last-iterate convergence.
We also show that Reg-CFR, with a variant of the optimistic mirror descent algorithm as the regret minimizer, can achieve $O(1/T^{1/4})$ best-iterate and $O(1/T^{3/4})$ average-iterate convergence rates.
- Score: 22.557806157585834
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we investigate the power of regularization, a common technique
in reinforcement learning and optimization, in solving extensive-form games
(EFGs). We propose a series of new algorithms based on regularizing the payoff
functions of the game, and establish a set of convergence results that strictly
improve over the existing ones, with either weaker assumptions or stronger
convergence guarantees. In particular, we first show that dilated optimistic
mirror descent (DOMD), an efficient variant of OMD for solving EFGs, with
adaptive regularization can achieve a fast $\tilde O(1/T)$ last-iterate
convergence in terms of duality gap without the uniqueness assumption of the
Nash equilibrium (NE). Moreover, regularized dilated optimistic multiplicative
weights update (Reg-DOMWU), an instance of Reg-DOMD, further enjoys the $\tilde
O(1/T)$ last-iterate convergence rate of the distance to the set of NE. This
addresses an open question on whether iterate convergence can be obtained for
OMWU algorithms without the uniqueness assumption in both the EFG and
normal-form game literature. Second, we show that regularized counterfactual
regret minimization (Reg-CFR), with a variant of optimistic mirror descent
algorithm as regret-minimizer, can achieve $O(1/T^{1/4})$ best-iterate, and
$O(1/T^{3/4})$ average-iterate convergence rate for finding NE in EFGs.
Finally, we show that Reg-CFR can achieve asymptotic last-iterate convergence,
and optimal $O(1/T)$ average-iterate convergence rate, for finding the NE of
perturbed EFGs, which is useful for finding approximate extensive-form perfect
equilibria (EFPE). To the best of our knowledge, they constitute the first
last-iterate convergence results for CFR-type algorithms, while matching the
SOTA average-iterate convergence rate in finding NE for non-perturbed EFGs. We
also provide numerical results to corroborate the advantages of our algorithms.
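To make the payoff-regularization idea concrete, the following is a minimal, hypothetical sketch in the spirit of the paper's approach, reduced to a normal-form zero-sum matrix game rather than the paper's extensive-form setting (so no dilated regularizers or treeplex structure). The payoff matrix, the regularization weight tau, the step size eta, and the horizon T are all illustrative choices, not values from the paper.

```python
# A minimal illustrative sketch (not the paper's Reg-DOMWU): optimistic
# multiplicative weights update on an entropy-regularized zero-sum matrix
# game, i.e., min_x max_y  x^T A y + tau*phi(x) - tau*phi(y), where phi is
# the negative entropy. tau, eta, and T are illustrative, not from the paper.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))   # payoff matrix of a random zero-sum game
tau, eta, T = 0.05, 0.1, 2000     # regularization weight, step size, iterations

x = np.ones(3) / 3                # row player's (minimizer's) mixed strategy
y = np.ones(3) / 3                # column player's (maximizer's) mixed strategy
gx_prev, gy_prev = np.zeros(3), np.zeros(3)

def omwu_step(p, g, g_prev):
    # Optimistic MWU: count the newest gradient twice, cancel the previous one.
    logits = np.log(p) - eta * (2.0 * g - g_prev)
    w = np.exp(logits - logits.max())          # numerically stable softmax
    return w / w.sum()

for t in range(T):
    # Gradients of the regularized payoff; the tau*(log p + 1) terms come
    # from differentiating the negative-entropy regularizer.
    gx = A @ y + tau * (np.log(x) + 1.0)
    gy = -A.T @ x + tau * (np.log(y) + 1.0)
    x, y = omwu_step(x, gx, gx_prev), omwu_step(y, gy, gy_prev)
    gx_prev, gy_prev = gx, gy

# Duality gap of the *unregularized* game at the last iterate; with a fixed
# tau the gap plateaus at an O(tau) level rather than vanishing.
gap = (A.T @ x).max() - (A @ y).min()
print(f"last-iterate duality gap ~ {gap:.4f}")
```

The paper's algorithms shrink the regularization weight adaptively so that the last iterate approaches an exact NE at the $\tilde O(1/T)$ rate; with the fixed tau above, the iterates instead converge to the equilibrium of the regularized game, whose duality gap in the original game is of order tau.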
Related papers
- Convergence Rate Analysis of LION [54.28350823319057]
LION converges at a rate of $\mathcal{O}(\sqrt{d}K^{-1/4})$ measured by the gradient norm at Karush-Kuhn-Tucker (KKT) points.
We show that LION can achieve lower loss and higher performance compared to standard SGD.
arXiv Detail & Related papers (2024-11-12T11:30:53Z) - A Fresh Look at Generalized Category Discovery through Non-negative Matrix Factorization [83.12938977698988]
Generalized Category Discovery (GCD) aims to classify both base and novel images using labeled base data.
Current approaches inadequately address the intrinsic optimization of the co-occurrence matrix $\bar{A}$ based on cosine similarity.
We propose a Non-Negative Generalized Category Discovery (NN-GCD) framework to address these deficiencies.
arXiv Detail & Related papers (2024-10-29T07:24:11Z) - Revisiting the Last-Iterate Convergence of Stochastic Gradient Methods [25.831462008050387]
The Stochastic Gradient Descent (SGD) algorithm has attracted interest due to its good performance in practice despite a lack of theoretical understanding.
It remains unclear whether last-iterate convergence can be provably extended to wider composite optimization problems and non-Euclidean norms.
arXiv Detail & Related papers (2023-12-13T21:41:06Z) - Adaptive, Doubly Optimal No-Regret Learning in Strongly Monotone and Exp-Concave Games with Gradient Feedback [75.29048190099523]
Online gradient descent (OGD) is well known to be doubly optimal under strong convexity or monotonicity assumptions.
In this paper, we design a fully adaptive OGD algorithm, AdaOGD, that does not require a priori knowledge of these parameters.
arXiv Detail & Related papers (2023-10-21T18:38:13Z) - Efficient Last-iterate Convergence Algorithms in Solving Games [20.00785679098103]
No-regret algorithms are popular for learning Nash equilibrium in two-player zero-sum normal-form games (NFGs) and extensive-form games (EFGs).
Recent works propose a Reward Transformation (RT) framework for MWU, which removes the uniqueness condition and achieves competitive performance with OMWU.
We show that the bottleneck of RT-based algorithms is the speed of solving convex-concave optimization problems (SCCPs).
arXiv Detail & Related papers (2023-08-22T07:59:49Z) - Asymptotically Unbiased Instance-wise Regularized Partial AUC Optimization: Theory and Algorithm [101.44676036551537]
One-way Partial AUC (OPAUC) and Two-way Partial AUC (TPAUC) measure the average performance of a binary classifier.
Most of the existing methods could only optimize PAUC approximately, leading to inevitable biases that are not controllable.
We present a simpler reformulation of the PAUC problem via distributionally robust optimization (DRO).
arXiv Detail & Related papers (2022-10-08T08:26:22Z) - Last-iterate Convergence in Extensive-Form Games [49.31256241275577]
We study last-iterate convergence of optimistic algorithms in sequential games.
We show that all of these algorithms enjoy last-iterate convergence, with some of them even converging exponentially fast.
arXiv Detail & Related papers (2021-06-27T22:02:26Z) - Linear Last-iterate Convergence in Constrained Saddle-point Optimization [48.44657553192801]
We significantly expand the understanding of last-iterate convergence for Optimistic Gradient Descent Ascent (OGDA) and Optimistic Multiplicative Weights Update (OMWU).
We show that when the equilibrium is unique, linear last-iterate convergence is achieved with a learning rate set to a universal constant.
We show that bilinear games over any polytope satisfy this condition and OGDA converges exponentially fast even without the unique equilibrium assumption; a toy sketch of this OGDA behavior follows the list below.
arXiv Detail & Related papers (2020-06-16T20:53:04Z)
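To illustrate the OGDA behavior cited in the last entry above, here is a toy, hypothetical sketch: OGDA on the unconstrained bilinear game $\min_x \max_y x^\top A y$, whose unique saddle point is the origin when $A$ is full rank. Plain gradient descent ascent spirals outward on this problem; the optimistic correction makes the iterates contract geometrically. The matrix and step size are illustrative choices, not taken from the cited paper.

```python
# Toy sketch: Optimistic Gradient Descent Ascent (OGDA) on the unconstrained
# bilinear game min_x max_y x^T A y. The saddle point is (0, 0); plain GDA
# diverges here, while OGDA's "2*g_t - g_{t-1}" correction contracts linearly.
import numpy as np

A = np.array([[1.0, 0.5],
              [-0.5, 1.0]])                  # well-conditioned illustrative matrix
eta = 0.2                                    # illustrative step size
x = np.array([1.0, -1.0])
y = np.array([0.5, 2.0])
gx_prev, gy_prev = A @ y, A.T @ x            # seed so the first step is plain GDA

for t in range(500):
    gx, gy = A @ y, A.T @ x                  # grad_x f and grad_y f for f = x^T A y
    x = x - eta * (2.0 * gx - gx_prev)       # descent step for the minimizer
    y = y + eta * (2.0 * gy - gy_prev)       # ascent step for the maximizer
    gx_prev, gy_prev = gx, gy

print(np.linalg.norm(x), np.linalg.norm(y))  # both norms decay geometrically
```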