The Power of Regularization in Solving Extensive-Form Games
- URL: http://arxiv.org/abs/2206.09495v1
- Date: Sun, 19 Jun 2022 22:10:38 GMT
- Title: The Power of Regularization in Solving Extensive-Form Games
- Authors: Mingyang Liu, Asuman Ozdaglar, Tiancheng Yu, Kaiqing Zhang
- Abstract summary: We propose a series of new algorithms based on regularizing the payoff functions of the game.
In particular, we first show that dilated optimistic mirror descent (DOMD) can achieve a fast $\tilde O(1/T)$ last-iterate convergence.
We also show that Reg-CFR, with a variant of optimistic mirror descent algorithm as regret-minimizer, can achieve $O(1/T^{1/4})$ best-iterate and $O(1/T^{3/4})$ average-iterate convergence rates.
- Score: 22.557806157585834
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we investigate the power of regularization, a common technique
in reinforcement learning and optimization, in solving extensive-form games
(EFGs). We propose a series of new algorithms based on regularizing the payoff
functions of the game, and establish a set of convergence results that strictly
improve over the existing ones, with either weaker assumptions or stronger
convergence guarantees. In particular, we first show that dilated optimistic
mirror descent (DOMD), an efficient variant of OMD for solving EFGs, with
adaptive regularization can achieve a fast $\tilde O(1/T)$ last-iterate
convergence in terms of duality gap without the uniqueness assumption of the
Nash equilibrium (NE). Moreover, regularized dilated optimistic multiplicative
weights update (Reg-DOMWU), an instance of Reg-DOMD, further enjoys the $\tilde
O(1/T)$ last-iterate convergence rate of the distance to the set of NE. This
addresses an open question on whether iterate convergence can be obtained for
OMWU algorithms without the uniqueness assumption in both the EFG and
normal-form game literature. Second, we show that regularized counterfactual
regret minimization (Reg-CFR), with a variant of optimistic mirror descent
algorithm as regret-minimizer, can achieve $O(1/T^{1/4})$ best-iterate, and
$O(1/T^{3/4})$ average-iterate convergence rate for finding NE in EFGs.
Finally, we show that Reg-CFR can achieve asymptotic last-iterate convergence,
and optimal $O(1/T)$ average-iterate convergence rate, for finding the NE of
perturbed EFGs, which is useful for finding approximate extensive-form perfect
equilibria (EFPE). To the best of our knowledge, they constitute the first
last-iterate convergence results for CFR-type algorithms, while matching the
SOTA average-iterate convergence rate in finding NE for non-perturbed EFGs. We
also provide numerical results to corroborate the advantages of our algorithms.
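The core idea of the abstract, regularizing the payoff functions and running an optimistic no-regret update, can be sketched in a normal-form stand-in. The snippet below entropy-regularizes each player's payoff and runs optimistic multiplicative weights update, a simplex analogue of the paper's dilated/treeplex updates; function names, the step size, the regularization weight, and the 2x2 game are all illustrative choices, not the authors' code or parameters.

```python
import numpy as np

def mwu_step(p, grad, eta):
    # Multiplicative-weights step: p_{t+1} proportional to p_t * exp(eta * grad)
    z = np.log(p) + eta * grad
    z -= z.max()                      # numerical stability before exponentiating
    e = np.exp(z)
    return e / e.sum()

def reg_omwu(A, tau=0.05, eta=0.02, T=5000):
    """Optimistic MWU on an entropy-regularized zero-sum matrix game.

    Normal-form sketch of the regularization idea: the max-player's payoff
    is x^T A y - tau * sum_i x_i log x_i, the min-player's is symmetric.
    """
    m, n = A.shape
    x = np.full(m, 1.0 / m)           # max-player strategy, uniform start
    y = np.full(n, 1.0 / n)           # min-player strategy
    gx_prev = np.zeros(m)
    gy_prev = np.zeros(n)
    for _ in range(T):
        # gradients of the regularized payoffs at the current iterates
        gx = A @ y - tau * (np.log(x) + 1.0)
        gy = -(A.T @ x) - tau * (np.log(y) + 1.0)
        # optimistic extrapolation: use 2*g_t - g_{t-1} instead of g_t
        x = mwu_step(x, 2 * gx - gx_prev, eta)
        y = mwu_step(y, 2 * gy - gy_prev, eta)
        gx_prev, gy_prev = gx, gy
    return x, y

def duality_gap(A, x, y):
    # Exploitability of (x, y) in the *unregularized* game; always >= 0.
    return (A @ y).max() - (A.T @ x).min()
```

Because the regularized game is strongly convex-concave, the last iterate settles near the regularized equilibrium, whose duality gap in the original game scales with the regularization weight tau.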
Related papers
- Revisiting the Last-Iterate Convergence of Stochastic Gradient Methods [25.831462008050387]
The Stochastic Gradient Descent (SGD) algorithm has attracted interest due to its good performance in practice but lack of theoretical understanding.
It remains unclear whether last-iterate convergence can be provably extended to wider composite optimization and non-Euclidean norms.
arXiv Detail & Related papers (2023-12-13T21:41:06Z) - Adaptive, Doubly Optimal No-Regret Learning in Strongly Monotone and Exp-Concave Games with Gradient Feedback [75.29048190099523]
Online gradient descent (OGD) is well known to be doubly optimal under strong convexity or monotonicity assumptions.
In this paper, we design a fully adaptive OGD algorithm, AdaOGD, that does not require a priori knowledge of these parameters.
arXiv Detail & Related papers (2023-10-21T18:38:13Z) - Distributionally Robust Optimization with Bias and Variance Reduction [9.341215359733601]
We show that Prospect, a gradient-based algorithm, enjoys linear convergence for smooth regularized losses.
We also show that Prospect can converge 2-3$\times$ faster than baselines such as gradient-based methods.
arXiv Detail & Related papers (2023-10-21T00:03:54Z) - Efficient Last-iterate Convergence Algorithms in Solving Games [20.00785679098103]
No-regret algorithms are popular for learning Nash equilibrium in two-player zero-sum normal-form games (NFGs) and extensive-form games (EFGs).
Recent works propose a Reward Transformation (RT) framework for MWU, which removes the uniqueness condition and achieves competitive performance with OMWU.
We show that the bottleneck of RT-based algorithms is the speed of solving convex-concave optimization problems (SCCPs).
arXiv Detail & Related papers (2023-08-22T07:59:49Z) - Asymptotically Unbiased Instance-wise Regularized Partial AUC Optimization: Theory and Algorithm [101.44676036551537]
One-way Partial AUC (OPAUC) and Two-way Partial AUC (TPAUC) measure the average performance of a binary classifier.
Most of the existing methods could only optimize PAUC approximately, leading to inevitable biases that are not controllable.
We present a simpler reformulation of the PAUC problem via distributionally robust optimization (DRO).
arXiv Detail & Related papers (2022-10-08T08:26:22Z) - Regret Bounds for Expected Improvement Algorithms in Gaussian Process Bandit Optimization [63.8557841188626]
The expected improvement (EI) algorithm is one of the most popular strategies for optimization under uncertainty.
We propose a variant of EI with a standard incumbent defined via the GP predictive mean.
We show that our algorithm converges and achieves a cumulative regret bound of $\mathcal{O}(\gamma_T\sqrt{T})$.
arXiv Detail & Related papers (2022-03-15T13:17:53Z) - Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study optimization of areas under precision-recall curves (AUPRC), which is widely used for imbalanced tasks.
We develop novel momentum methods with a better iteration complexity of $O(1/\epsilon^4)$ for finding an $\epsilon$-stationary solution.
We also design a novel family of adaptive methods with the same complexity of $O(1/\epsilon^4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z) - Last-iterate Convergence in Extensive-Form Games [49.31256241275577]
We study last-iterate convergence of optimistic algorithms in sequential games.
We show that all of these algorithms enjoy last-iterate convergence, with some of them even converging exponentially fast.
arXiv Detail & Related papers (2021-06-27T22:02:26Z) - Linear Last-iterate Convergence in Constrained Saddle-point Optimization [48.44657553192801]
We significantly expand the understanding of last-iterate convergence for Optimistic Gradient Descent Ascent (OGDA) and Optimistic Multiplicative Weights Update (OMWU).
We show that when the equilibrium is unique, linear last-iterate convergence is achieved with a learning rate whose value is set to a universal constant.
We show that bilinear games over any polytope satisfy this condition and OGDA converges exponentially fast even without the unique equilibrium assumption.
arXiv Detail & Related papers (2020-06-16T20:53:04Z)
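The last-iterate behavior of OGDA noted in the entry above is visible even in the simplest unconstrained bilinear saddle $f(x, y) = xy$. The sketch below (step size, horizon, and starting point are illustrative choices, not taken from the cited paper) contrasts plain gradient descent ascent, whose iterates spiral away from the saddle, with the optimistic correction, whose last iterate contracts toward it.

```python
def run(optimistic, eta=0.1, T=500):
    """Distance to the saddle (0, 0) of f(x, y) = x * y after T steps.

    x descends and y ascends on the bilinear objective. Plain GDA
    spirals outward here; replacing the gradient with the optimistic
    correction 2*g_t - g_{t-1} makes the last iterate contract.
    """
    x, y = 1.0, 1.0
    gx_prev, gy_prev = 0.0, 0.0
    for _ in range(T):
        gx, gy = y, x                        # grad_x f = y, grad_y f = x
        if optimistic:
            dx, dy = 2 * gx - gx_prev, 2 * gy - gy_prev
        else:
            dx, dy = gx, gy
        x, y = x - eta * dx, y + eta * dy    # simultaneous update
        gx_prev, gy_prev = gx, gy
    return (x * x + y * y) ** 0.5
```

With these settings, `run(optimistic=False)` returns a distance much larger than the starting one, while `run(optimistic=True)` returns a distance close to zero, matching the linear last-iterate convergence of optimistic methods on bilinear games.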