Chaos, Extremism and Optimism: Volume Analysis of Learning in Games
- URL: http://arxiv.org/abs/2005.13996v1
- Date: Thu, 28 May 2020 13:47:09 GMT
- Title: Chaos, Extremism and Optimism: Volume Analysis of Learning in Games
- Authors: Yun Kuen Cheung and Georgios Piliouras
- Abstract summary: We present volume analyses of Multiplicative Weights Updates (MWU) and Optimistic Multiplicative Weights Updates (OMWU) in zero-sum as well as coordination games.
We show that OMWU contracts volume, providing an alternative understanding for its known convergent behavior.
We also prove a no-free-lunch type of theorem, in the sense that when examining coordination games the roles are reversed: OMWU expands volume exponentially fast, whereas MWU contracts.
- Score: 55.24050445142637
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present volume analyses of Multiplicative Weights Updates (MWU) and
Optimistic Multiplicative Weights Updates (OMWU) in zero-sum as well as
coordination games. Such analyses provide new insights into these game
dynamical systems, which seem hard to achieve via the classical techniques
within Computer Science and Machine Learning.
The first step is to examine these dynamics not in their original space
(simplex of actions) but in a dual space (aggregate payoff space of actions).
The second step is to explore how the volume of a set of initial conditions
evolves over time when it is pushed forward according to the algorithm. This is
reminiscent of approaches in Evolutionary Game Theory where replicator
dynamics, the continuous-time analogue of MWU, is known to always preserve
volume in all games. Interestingly, when we examine discrete-time dynamics,
both the choice of the game and the choice of the algorithm play a critical
role. So whereas MWU expands volume in zero-sum games and is thus Lyapunov
chaotic, we show that OMWU contracts volume, providing an alternative
understanding for its known convergent behavior. However, we also prove a
no-free-lunch type of theorem, in the sense that when examining coordination
games the roles are reversed: OMWU expands volume exponentially fast, whereas
MWU contracts.
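As a quick numerical sanity check of these volume claims, one can differentiate a single MWU step in the dual (cumulative-payoff) space by finite differences and inspect its Jacobian determinant: it should exceed 1 in a zero-sum game (local volume expansion) and fall below 1 in the coordination game built from the same matrix (contraction). The sketch below is an illustration of this idea, not code or proofs from the paper; OMWU is omitted because its update carries the previous payoff vector as extra state, and the matrix A, the step size eta, and the test point are arbitrary choices.
```python
# Rough numerical check (illustration only) of the volume claims for MWU:
# the Jacobian determinant of one dual-space MWU step should be > 1 in a
# zero-sum game (volume expansion) and < 1 in a coordination game (volume
# contraction). A, eta, and the test point are arbitrary choices.
import numpy as np

def softmax(y):
    e = np.exp(y - y.max())
    return e / e.sum()

def mwu_step(state, A, sign, eta=0.1):
    """One MWU step on stacked dual variables; sign=-1: zero-sum, +1: coordination."""
    y1, y2 = state[:2], state[2:]
    return np.concatenate([y1 + eta * A @ softmax(y2),
                           y2 + eta * sign * A.T @ softmax(y1)])

def jacobian_det(f, x, h=1e-6):
    """Determinant of the central finite-difference Jacobian of f at x."""
    n = x.size
    J = np.empty((n, n))
    for j in range(n):
        dx = np.zeros(n); dx[j] = h
        J[:, j] = (f(x + dx) - f(x - dx)) / (2 * h)
    return np.linalg.det(J)

A = np.array([[1.0, -1.0], [-1.0, 1.0]])
state = np.array([0.3, -0.2, 0.1, 0.4])          # arbitrary interior point
for name, sign in [("zero-sum", -1.0), ("coordination", +1.0)]:
    det = jacobian_det(lambda s: mwu_step(s, A, sign), state)
    print(f"{name:>12}: one-step Jacobian det = {det:.6f}")
```
For a small step size the determinant sits just above 1 in the zero-sum case and just below 1 in the coordination case, consistent with the expansion-versus-contraction reversal described above.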
Using these tools, we prove two novel, rather negative properties of MWU in
zero-sum games: (1) Extremism: even in games with unique fully mixed Nash
equilibrium, the system recurrently gets stuck near pure-strategy profiles,
despite them being clearly unstable from a game-theoretic perspective. (2)
Unavoidability: given any set of good points (with your own interpretation of
"good"), the system cannot avoid bad points indefinitely.
Related papers
- Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms [71.73971094342349]
Self-play via online learning is one of the premier ways to solve large-scale two-player zero-sum games.
We show that OMWU offers several advantages including logarithmic dependence on the size of the payoff matrix.
We prove that a broad class of algorithms that do not forget the past quickly all suffer the same issue.
arXiv Detail & Related papers (2024-06-15T13:26:17Z)
- On the Convergence of No-Regret Learning Dynamics in Time-Varying Games [89.96815099996132]
We characterize the convergence of optimistic gradient descent (OGD) in time-varying games.
Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games.
We also provide new insights on dynamic regret guarantees in static games.
arXiv Detail & Related papers (2023-01-26T17:25:45Z)
- Finding mixed-strategy equilibria of continuous-action games without gradients using randomized policy networks [83.28949556413717]
We study the problem of computing an approximate Nash equilibrium of a continuous-action game without access to gradients.
We model players' strategies using artificial neural networks.
This paper is the first to solve general continuous-action games with unrestricted mixed strategies and without any gradient information.
arXiv Detail & Related papers (2022-11-29T05:16:41Z)
- Last-Iterate Convergence with Full and Noisy Feedback in Two-Player Zero-Sum Games [8.037452358542465]
We show that M2WU converges to an exact Nash equilibrium by iteratively adapting the mutation term.
We empirically confirm that M2WU outperforms MWU and OMWU in exploitability and convergence rates.
arXiv Detail & Related papers (2022-08-21T09:36:21Z)
- Optimal No-Regret Learning in General Games: Bounded Regret with Unbounded Step-Sizes via Clairvoyant MWU [27.438327151960657]
We provide an algorithm that achieves constant regret with fixed step-sizes.
The cumulative regret of our algorithm provably decreases linearly as the step-size increases.
arXiv Detail & Related papers (2021-11-29T17:42:24Z)
- Towards convergence to Nash equilibria in two-team zero-sum games [17.4461045395989]
Two-team zero-sum games are defined as multi-player games where players are split into two competing sets of agents.
We focus on the solution concept of Nash equilibria (NE).
We show that computing NE for this class of games is hard for the complexity class CLS.
arXiv Detail & Related papers (2021-11-07T21:15:35Z)
- Consensus Multiplicative Weights Update: Learning to Learn using Projector-based Game Signatures [8.08640000394814]
We introduce the second such algorithm, Consensus MWU, for which we prove local convergence and show empirically that it enjoys faster and more robust convergence than OMWU.
Our algorithm shows the importance of a new object, the simplex Hessian, as well as of the interaction of the game with the (eigen)space of vectors summing to zero.
arXiv Detail & Related papers (2021-06-04T17:26:54Z)
- From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization [49.368421783733815]
We show how adapting the reward can give strong convergence guarantees in monotone games.
We also show how this reward adaptation technique can be leveraged to build algorithms that converge exactly to the Nash equilibrium.
arXiv Detail & Related papers (2020-02-19T21:36:58Z)
- Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium [116.56359444619441]
We develop provably efficient reinforcement learning algorithms for two-player zero-sum finite-horizon Markov games.
In the offline setting, we control both players and aim to find the Nash Equilibrium by minimizing the duality gap.
In the online setting, we control a single player playing against an arbitrary opponent and aim to minimize the regret.
arXiv Detail & Related papers (2020-02-17T17:04:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.