Convergence analysis and acceleration of the smoothing methods for
solving extensive-form games
- URL: http://arxiv.org/abs/2303.11046v1
- Date: Mon, 20 Mar 2023 11:57:13 GMT
- Title: Convergence analysis and acceleration of the smoothing methods for
solving extensive-form games
- Authors: Keigo Habara, Ellen Hidemi Fukuda, Nobuo Yamashita
- Abstract summary: We consider a two-player zero-sum extensive-form game, i.e., one in which the sum of the players' payoffs is always zero.
In such games, the problem of finding the optimal strategy can be formulated as a bilinear saddle-point problem.
To solve such large-scale bilinear saddle-point problems, the excessive gap technique (EGT), a smoothing method, has been studied.
Our goal is to improve the smoothing method for solving extensive-form games so that it can be applied to large-scale games.
- Score: 0.6875312133832078
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The extensive-form game has been studied considerably in recent years. It can
represent games with multiple decision points and incomplete information, and
hence it is helpful in formulating games with uncertain inputs, such as poker.
We consider a two-player zero-sum extensive-form game, i.e., one in which the sum
of the players' payoffs is always zero. In such games, the problem of finding the
optimal strategy can be formulated as a bilinear saddle-point problem. This
formulation grows rapidly with the size of the game, since it has
variables representing the strategies at all decision points for each player.
To solve such large-scale bilinear saddle-point problems, the excessive gap
technique (EGT), a smoothing method, has been studied. This method generates a
sequence of approximate solutions whose error is guaranteed to converge at
$\mathcal{O}(1/k)$, where $k$ is the number of iterations. However, its
theoretical error bounds scale poorly with the game size, which makes it
inapplicable to large games.
Our goal is to improve the smoothing method for solving extensive-form games
so that it can be applied to large-scale games. To this end, we make two
contributions in this work. First, we slightly modify the strongly convex
function used in the smoothing method in order to improve the theoretical
bounds related to the game size. Second, we propose a heuristic called the
centering trick, which allows the smoothing method to be combined with other
methods and consequently accelerates the convergence in practice. As a result,
we combine EGT with CFR+, a state-of-the-art method for extensive-form games,
to achieve good performance in games where conventional smoothing methods do
not perform well. The proposed smoothing method is shown to have the potential
to solve large games in practice.
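
As a rough illustration of the saddle-point view and of smoothing, the sketch below handles only the simplest special case, a matrix game over probability simplices, rather than the extensive-form setting of the paper. The function names `saddle_gap` and `smoothed_max` are hypothetical, and the entropy-based smoothing used here is a standard textbook choice, not the modified strongly convex function the authors propose.

```python
import math


def saddle_gap(A, x, y):
    """Duality gap of (x, y) for min_x max_y x^T A y over probability simplices."""
    rows, cols = len(A), len(A[0])
    # Value of each pure column strategy against x (the maximizer's options).
    col_values = [sum(x[i] * A[i][j] for i in range(rows)) for j in range(cols)]
    # Value of each pure row strategy against y (the minimizer's options).
    row_values = [sum(A[i][j] * y[j] for j in range(cols)) for i in range(rows)]
    # The gap is zero exactly when (x, y) is a saddle point.
    return max(col_values) - min(row_values)


def smoothed_max(values, mu):
    """Entropy-smoothed maximum: max over the simplex of <v, y> - mu * d(y),
    where d(y) = sum_j y_j log y_j + log n (assumed smoothing function)."""
    n = len(values)
    m = max(values)
    # Closed form mu * log(sum exp(v / mu)), shifted by m for numerical stability.
    lse = m + mu * math.log(sum(math.exp((v - m) / mu) for v in values))
    return lse - mu * math.log(n)


# Matching pennies: the uniform strategies form a saddle point.
A = [[1, -1], [-1, 1]]
print(saddle_gap(A, [0.5, 0.5], [0.5, 0.5]))
print(smoothed_max([1.0, -1.0], mu=0.1))
```

With this choice of `d`, the smoothed value lies between `max(values) - mu * log(n)` and `max(values)`, so a smaller `mu` tracks the true maximum more closely at the cost of a less smooth function.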
Related papers
- Function Approximation for Solving Stackelberg Equilibrium in Large
Perfect Information Games [115.77438739169155]
We propose learning the Enforceable Payoff Frontier (EPF) -- a generalization of the state value function for general-sum games.
This is the first method that applies FA to the Stackelberg setting, allowing us to scale to much larger games.
arXiv Detail & Related papers (2022-12-29T19:05:50Z)
- Safe Subgame Resolving for Extensive Form Correlated Equilibrium [47.155175336085364]
Correlated Equilibrium is a solution concept that is more general than Nash Equilibrium (NE) and can lead to better social welfare.
We apply subgame resolving, a technique extremely successful in finding NE in zero-sum games, to solving general-sum EFCEs.
Subgame resolving refines a correlation plan in an online manner: instead of solving for the full game upfront, it only solves for strategies in subgames that are reached in actual play.
arXiv Detail & Related papers (2022-12-29T14:20:48Z)
- Representation Learning for General-sum Low-rank Markov Games [63.119870889883224]
We study multi-agent general-sum Markov games with nonlinear function approximation.
We focus on low-rank Markov games whose transition matrix admits a hidden low-rank structure on top of an unknown non-linear representation.
arXiv Detail & Related papers (2022-10-30T22:58:22Z)
- HSVI can solve zero-sum Partially Observable Stochastic Games [7.293053431456775]
State-of-the-art methods for solving 2-player zero-sum imperfect-information games rely on linear programming or dynamic regret minimization.
We propose a novel family of promising approaches complementing those relying on linear programming or iterative methods.
arXiv Detail & Related papers (2022-10-26T11:41:57Z)
- Learning in Multi-Player Stochastic Games [1.0878040851638]
We consider the problem of simultaneous learning in games with many players in the finite-horizon setting.
While the typical target solution for a game is a Nash equilibrium, this is intractable with many players.
We turn to a different target: algorithms which generate an equilibrium when they are used by all players.
arXiv Detail & Related papers (2022-10-25T19:02:03Z)
- No-Regret Dynamics in the Fenchel Game: A Unified Framework for
Algorithmic Convex Optimization [20.718016474717196]
We develop an algorithmic framework for solving convex optimization problems using no-regret game dynamics.
Common choices for these strategies are so-called no-regret learning algorithms.
We show that many classical first-order methods for convex optimization can be interpreted as special cases of our framework.
arXiv Detail & Related papers (2021-11-22T16:10:18Z)
- Provably Efficient Policy Gradient Methods for Two-Player Zero-Sum
Markov Games [95.70078702838654]
This paper studies natural extensions of the Natural Policy Gradient algorithm for solving two-player zero-sum games.
We thoroughly characterize the algorithms' performance in terms of the number of samples, number of iterations, concentrability coefficients, and approximation error.
arXiv Detail & Related papers (2021-02-17T17:49:57Z)
- Faster Algorithms for Optimal Ex-Ante Coordinated Collusive Strategies
in Extensive-Form Zero-Sum Games [123.76716667704625]
We focus on the problem of finding an optimal strategy for a team of two players that faces an opponent in an imperfect-information zero-sum extensive-form game.
In that setting, it is known that the best the team can do is sample a profile of potentially randomized strategies (one per player) from a joint (a.k.a. correlated) probability distribution at the beginning of the game.
We provide an algorithm that computes such an optimal distribution by only using profiles where only one of the team members gets to randomize in each profile.
arXiv Detail & Related papers (2020-09-21T17:51:57Z)
- Learning Zero-Sum Simultaneous-Move Markov Games Using Function
Approximation and Correlated Equilibrium [116.56359444619441]
We develop provably efficient reinforcement learning algorithms for two-player zero-sum finite-horizon Markov games.
In the offline setting, we control both players and aim to find the Nash Equilibrium by minimizing the duality gap.
In the online setting, we control a single player playing against an arbitrary opponent and aim to minimize the regret.
arXiv Detail & Related papers (2020-02-17T17:04:16Z)
- Accelerating Smooth Games by Manipulating Spectral Shapes [51.366219027745174]
We use matrix iteration theory to characterize acceleration in smooth games.
We describe gradient-based methods, such as extragradient, as transformations on the spectral shape.
We propose an optimal algorithm for bilinear games.
arXiv Detail & Related papers (2020-01-02T19:21:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.