From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization
- URL: http://arxiv.org/abs/2002.08456v1
- Date: Wed, 19 Feb 2020 21:36:58 GMT
- Title: From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization
- Authors: Julien Perolat, Remi Munos, Jean-Baptiste Lespiau, Shayegan
Omidshafiei, Mark Rowland, Pedro Ortega, Neil Burch, Thomas Anthony, David
Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls
- Abstract summary: We show how adapting the reward can give strong convergence guarantees in monotone games.
We also show how this reward adaptation technique can be leveraged to build algorithms that converge exactly to the Nash equilibrium.
- Score: 49.368421783733815
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we investigate the Follow the Regularized Leader dynamics in
sequential imperfect information games (IIG). We generalize existing results of
Poincaré recurrence from normal-form games to zero-sum two-player imperfect
information games and other sequential game settings. We then investigate how
adapting the reward (by adding a regularization term) of the game can give
strong convergence guarantees in monotone games. We continue by showing how
this reward adaptation technique can be leveraged to build algorithms that
converge exactly to the Nash equilibrium. Finally, we show how these insights
can be directly used to build state-of-the-art model-free algorithms for
zero-sum two-player Imperfect Information Games (IIG).
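To make the reward-adaptation idea concrete, the following is a minimal sketch (not the paper's algorithm; the matching-pennies payoffs, step size eta, and regularization weight tau are assumptions for illustration) of discrete-time FTRL with an entropy regularizer on a zero-sum matrix game: with tau = 0 the iterates cycle around the mixed Nash equilibrium, while adding the entropic term -tau * log(policy) to each player's reward pulls the dynamics towards the regularized equilibrium, which lies close to the Nash equilibrium for small tau.
```python
import numpy as np

# Matching pennies: row player's payoffs; the column player receives -A.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def ftrl(tau, eta=0.05, steps=5000):
    """Discrete-time FTRL with entropy regularizer; tau > 0 adds the reward-adaptation term."""
    y1 = np.zeros(2)  # row player's accumulated (adapted) payoffs
    y2 = np.zeros(2)  # column player's accumulated (adapted) payoffs
    for _ in range(steps):
        x1, x2 = softmax(y1), softmax(y2)
        # Adapted reward: original payoff minus tau * log(policy).
        # (Shifts that are uniform across actions do not affect the softmax.)
        r1 = A @ x2 - tau * np.log(x1)
        r2 = -A.T @ x1 - tau * np.log(x2)
        y1 += eta * r1
        y2 += eta * r2
    return softmax(y1), softmax(y2)

print("tau = 0.0:", ftrl(tau=0.0))  # last iterate keeps cycling; need not be near (0.5, 0.5)
print("tau = 0.1:", ftrl(tau=0.1))  # last iterate should settle near the uniform equilibrium
```
As tau is driven towards zero, the regularized solutions approach the Nash equilibrium; the paper's exact-convergence algorithms build on this kind of reward adaptation rather than on the specific updates sketched here.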
Related papers
- Neural Population Learning beyond Symmetric Zero-sum Games [52.20454809055356]
We introduce NeuPL-JPSRO, a neural population learning algorithm that benefits from transfer learning of skills and converges to a Coarse Correlated Equilibrium (CCE) of the game.
Our work shows that equilibrium-convergent population learning can be implemented at scale and in generality.
arXiv Detail & Related papers (2024-01-10T12:56:24Z)
- On the Convergence of No-Regret Learning Dynamics in Time-Varying Games [89.96815099996132]
We characterize the convergence of optimistic gradient descent (OGD) in time-varying games.
Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games.
We also provide new insights on dynamic regret guarantees in static games.
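For intuition about the OGD result summarized above, here is a minimal toy sketch (a scalar bilinear saddle f(x, y) = x * y with a fixed step size, i.e. a static game rather than the paper's time-varying setting): plain simultaneous gradient descent/ascent spirals away from the equilibrium at the origin, while the optimistic update, which uses the extrapolated gradient 2*g_t - g_{t-1}, drives the equilibrium gap of the last iterate towards zero.
```python
import numpy as np

# Toy saddle point problem: min_x max_y f(x, y) = x * y, unique equilibrium at (0, 0).
def run(optimistic, eta=0.1, steps=2000):
    x, y = 1.0, 1.0
    gx_prev, gy_prev = y, -x          # grad of f w.r.t. x is y; grad of -f w.r.t. y is -x
    for _ in range(steps):
        gx, gy = y, -x
        if optimistic:
            dx, dy = 2 * gx - gx_prev, 2 * gy - gy_prev   # optimistic (extrapolated) gradient
        else:
            dx, dy = gx, gy                               # plain gradient descent/ascent
        x, y = x - eta * dx, y - eta * dy
        gx_prev, gy_prev = gx, gy
    return np.hypot(x, y)             # distance to the equilibrium, a proxy for the equilibrium gap

print("GDA gap:", run(optimistic=False))   # grows with the number of steps
print("OGD gap:", run(optimistic=True))    # shrinks towards 0
```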
arXiv Detail & Related papers (2023-01-26T17:25:45Z)
- Abstracting Imperfect Information Away from Two-Player Zero-Sum Games [85.27865680662973]
Nayyar et al. (2013) showed that imperfect information can be abstracted away from common-payoff games by having players publicly announce their policies as they play.
This work shows that certain regularized equilibria do not possess the aforementioned non-correspondence problem.
Because these regularized equilibria can be made arbitrarily close to Nash equilibria, our result opens the door to a new perspective to solving two-player zero-sum games.
arXiv Detail & Related papers (2023-01-22T16:54:06Z)
- Last-iterate Convergence in Extensive-Form Games [49.31256241275577]
We study last-iterate convergence of optimistic algorithms in sequential games.
We show that all of these algorithms enjoy last-iterate convergence, with some of them even converging exponentially fast.
arXiv Detail & Related papers (2021-06-27T22:02:26Z)
- Computing Nash Equilibria in Multiplayer DAG-Structured Stochastic Games with Persistent Imperfect Information [1.7132914341329848]
We present an algorithm for approximating Nash equilibrium in multiplayer general-sum games with persistent imperfect information.
Using a new procedure, we demonstrate that our algorithm computes a strategy that closely approximates Nash equilibrium in this game.
arXiv Detail & Related papers (2020-10-26T19:27:26Z)
- Exponential Convergence of Gradient Methods in Concave Network Zero-sum Games [6.129776019898013]
We study the computation of Nash equilibrium in concave network zero-sum games (NZSGs).
We show that various game theoretic properties of convex-concave two-player zero-sum games are preserved in this generalization.
arXiv Detail & Related papers (2020-07-10T16:56:56Z)
- Chaos, Extremism and Optimism: Volume Analysis of Learning in Games [55.24050445142637]
We present volume analyses of Multiplicative Weights Updates (MWU) and Optimistic Multiplicative Weights Updates (OMWU) in zero-sum as well as coordination games.
We show that OMWU contracts volume, providing an alternative understanding for its known convergent behavior.
We also prove a no-free-lunch type of theorem, in the sense that when examining coordination games the roles are reversed: OMWU expands volume exponentially fast, whereas MWU contracts.
arXiv Detail & Related papers (2020-05-28T13:47:09Z)
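The MWU/OMWU contrast described in the last entry can be reproduced on a small example. A minimal sketch (matching-pennies payoffs, fixed step size, and starting points are assumptions; it does not reproduce the paper's volume analysis): the last iterate of MWU drifts away from the interior Nash equilibrium towards the boundary of the simplex, whereas the last iterate of OMWU approaches the equilibrium.
```python
import numpy as np

# Matching pennies: row player's payoffs; the column player receives -A.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

def mwu(optimistic, eta=0.1, steps=3000):
    x = np.array([0.7, 0.3])          # row player's mixed strategy
    y = np.array([0.6, 0.4])          # column player's mixed strategy
    ux_prev, uy_prev = A @ y, -A.T @ x
    for _ in range(steps):
        ux, uy = A @ y, -A.T @ x      # current payoff vectors
        if optimistic:
            gx, gy = 2 * ux - ux_prev, 2 * uy - uy_prev   # OMWU: extrapolated payoffs
        else:
            gx, gy = ux, uy                               # MWU: current payoffs
        x = x * np.exp(eta * gx)
        x /= x.sum()
        y = y * np.exp(eta * gy)
        y /= y.sum()
        ux_prev, uy_prev = ux, uy
    return x, y

print("MWU :", mwu(optimistic=False))   # drifts towards the boundary, away from (0.5, 0.5)
print("OMWU:", mwu(optimistic=True))    # should end close to the uniform Nash equilibrium
```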
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.