Online Learning in Periodic Zero-Sum Games
- URL: http://arxiv.org/abs/2111.03377v1
- Date: Fri, 5 Nov 2021 10:36:16 GMT
- Title: Online Learning in Periodic Zero-Sum Games
- Authors: Tanner Fiez, Ryann Sim, Stratis Skoulakis, Georgios Piliouras, Lillian
Ratliff
- Abstract summary: We show that Poincaré recurrence provably generalizes despite the complex, non-autonomous nature of these dynamical systems.
- Score: 27.510231246176033
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A seminal result in game theory is von Neumann's minmax theorem, which states
that zero-sum games admit an essentially unique equilibrium solution. Classical
learning results build on this theorem to show that online no-regret dynamics
converge to an equilibrium in a time-average sense in zero-sum games. In the
past several years, a key research direction has focused on characterizing the
day-to-day behavior of such dynamics. General results in this direction show
that broad classes of online learning dynamics are cyclic, and formally
Poincaré recurrent, in zero-sum games. We analyze the robustness of these
online learning behaviors in the case of periodic zero-sum games with a
time-invariant equilibrium. This model generalizes the usual repeated game
formulation while also being a realistic and natural model of a repeated
competition between players that depends on exogenous environmental variations
such as time-of-day effects, week-to-week trends, and seasonality.
Interestingly, time-average convergence may fail even in the simplest such
settings, in spite of the equilibrium being fixed. In contrast, using novel
analysis methods, we show that Poincaré recurrence provably generalizes
despite the complex, non-autonomous nature of these dynamical systems.
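The day-to-day behavior discussed above can be illustrated with a short simulation. The sketch below is not the paper's construction: it assumes multiplicative weights updates (MWU) as the no-regret dynamic and a matching-pennies game whose payoff matrix is rescaled periodically, so the mixed equilibrium (0.5, 0.5) stays fixed while the game itself varies with time. Running it shows the last iterate cycling around the equilibrium; the running time average can then be compared against the fixed equilibrium.
```python
# Minimal sketch (not the paper's construction): MWU for both players in a
# 2x2 zero-sum game whose payoff matrix is rescaled periodically, so the
# mixed equilibrium (0.5, 0.5) stays fixed while the game varies with time.
import numpy as np

T = 5000                      # number of rounds
eta = 0.05                    # learning rate
period = 50                   # length of one payoff cycle
A0 = np.array([[1.0, -1.0],
               [-1.0, 1.0]])  # matching pennies (row player's payoffs)

x = np.array([0.7, 0.3])      # row player's mixed strategy
y = np.array([0.4, 0.6])      # column player's mixed strategy
x_avg = np.zeros(2)
y_avg = np.zeros(2)

for t in range(T):
    scale = 1.0 + 0.5 * np.sin(2 * np.pi * t / period)  # periodic, sign-preserving
    A = scale * A0
    u_x = A @ y               # payoff of each pure row strategy vs. current y
    u_y = -A.T @ x            # payoff of each pure column strategy vs. current x
    x = x * np.exp(eta * u_x); x /= x.sum()
    y = y * np.exp(eta * u_y); y /= y.sum()
    x_avg += x; y_avg += y

print("last iterate:", x, y)                 # keeps cycling around (0.5, 0.5)
print("time average:", x_avg / T, y_avg / T) # compare with the fixed equilibrium
```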
Related papers
- Swim till You Sink: Computing the Limit of a Game [26.785274326413585]
We study the problem of computing the behavior of a class of natural dynamics called the noisy replicator dynamics.
We show through experiments that the limit distribution of reasonably large games can be estimated quite accurately through sampling and simulation.
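As a rough, hypothetical illustration of the simulate-and-sample idea (not the paper's exact noisy replicator dynamics), the sketch below runs a noise-perturbed Euler discretization of the replicator equation on a 2x2 coordination game from many random starts and tallies which pure profile each run settles near.
```python
# Hedged sketch: estimate where a noise-perturbed, discrete-time replicator
# update settles in a 2x2 coordination game by running many independent
# trajectories from random starts. The noise model and step size are
# illustrative assumptions, not the paper's exact dynamic.
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])     # both players get A[i, j]: a coordination game

def run_once(steps=2000, dt=0.05, noise=0.02):
    x, y = rng.random(), rng.random()    # prob. of action 0 for each player
    for _ in range(steps):
        py = np.array([y, 1 - y]); px = np.array([x, 1 - x])
        fx = A @ py; fy = A.T @ px       # expected payoff of each pure action
        x += dt * x * (1 - x) * (fx[0] - fx[1]) + noise * rng.normal()
        y += dt * y * (1 - y) * (fy[0] - fy[1]) + noise * rng.normal()
        x, y = np.clip(x, 0, 1), np.clip(y, 0, 1)
    return int(x < 0.5), int(y < 0.5)    # which pure action each player leans to

counts = np.zeros((2, 2))
for _ in range(500):
    i, j = run_once()
    counts[i, j] += 1
print("estimated limit distribution over pure profiles:\n", counts / counts.sum())
```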
arXiv Detail & Related papers (2024-08-20T19:09:21Z)
- A geometric decomposition of finite games: Convergence vs. recurrence under exponential weights [24.800126996235512]
We decompose games into simpler components where the dynamics' long-run behavior is well understood.
In particular, the dynamics of exponential / multiplicative weights (EW) schemes are not compatible with the Euclidean underpinnings of Helmholtz's theorem.
We establish a deep connection with a well-known decomposition of games into a potential and harmonic component.
arXiv Detail & Related papers (2024-05-12T08:58:35Z)
- On the Convergence of No-Regret Learning Dynamics in Time-Varying Games [89.96815099996132]
We characterize the convergence of optimistic gradient descent (OGD) in time-varying games.
Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games.
We also provide new insights on dynamic regret guarantees in static games.
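A minimal sketch of the OGD update itself, on the unconstrained bilinear toy game min_x max_y xy rather than the paper's time-varying setting: the optimistic correction (using 2*g_t - g_{t-1}) is what separates it from plain gradient descent-ascent, whose iterates spiral away from the equilibrium (0, 0).
```python
# Toy sketch of optimistic gradient descent/ascent (OGD) on min_x max_y x*y,
# whose unique equilibrium is (0, 0). Plain simultaneous gradient
# descent/ascent spirals outward; the optimistic correction pulls the
# last iterate toward equilibrium. Static toy example, not the paper's setting.
eta = 0.1

def plain_gda(x, y, steps=200):
    for _ in range(steps):
        x, y = x - eta * y, y + eta * x   # simultaneous update
    return x, y

def optimistic_gda(x, y, steps=200):
    gx_prev, gy_prev = y, x               # previous gradients (bootstrap)
    for _ in range(steps):
        gx, gy = y, x                     # current gradients of f(x, y) = x*y
        x = x - eta * (2 * gx - gx_prev)  # descent step for the minimizer
        y = y + eta * (2 * gy - gy_prev)  # ascent step for the maximizer
        gx_prev, gy_prev = gx, gy
    return x, y

print("plain GDA:     ", plain_gda(1.0, 1.0))       # moves away from (0, 0)
print("optimistic GDA:", optimistic_gda(1.0, 1.0))  # approaches (0, 0)
```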
arXiv Detail & Related papers (2023-01-26T17:25:45Z)
- Finding mixed-strategy equilibria of continuous-action games without gradients using randomized policy networks [83.28949556413717]
We study the problem of computing an approximate Nash equilibrium of a continuous-action game without access to gradients.
We model players' strategies using artificial neural networks.
This paper is the first to solve general continuous-action games with unrestricted mixed strategies and without any gradient information.
arXiv Detail & Related papers (2022-11-29T05:16:41Z)
- No-Regret Learning in Games is Turing Complete [33.03065693224728]
We prove Turing completeness of the replicator dynamic on matrix games, one of the simplest possible settings.
Our results imply the undecidability of reachability problems for learning algorithms in games, a special case of which is determining equilibrium convergence.
arXiv Detail & Related papers (2022-02-24T02:37:50Z)
- Simple Uncoupled No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium [65.64512759706271]
We study the existence of simple, uncoupled no-regret dynamics that converge to correlated equilibria in normal-form games.
We introduce a notion of trigger regret in extensive-form games, which extends that of internal regret in normal-form games.
We give an efficient no-regret algorithm which guarantees with high probability that trigger regrets grow sublinearly in the number of iterations.
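For reference, the classical normal-form baseline that trigger regret generalizes can be sketched with Hart and Mas-Colell's regret matching; this is not the paper's extensive-form algorithm, and the game (chicken) and parameters below are illustrative.
```python
# Hedged illustration: NOT the paper's trigger-regret algorithm, but the
# classical regret-matching dynamic of Hart and Mas-Colell, whose empirical
# distribution of joint play converges to the set of correlated equilibria
# in normal-form games.
import numpy as np

rng = np.random.default_rng(1)
# Game of chicken: U[p][i, j] = payoff to player p when p plays i, opponent plays j
U = [np.array([[0.0, 7.0], [2.0, 6.0]]),   # row player
     np.array([[0.0, 7.0], [2.0, 6.0]])]   # column player (symmetric game)
n_actions, mu, T = 2, 20.0, 20000

regret = [np.zeros((n_actions, n_actions)) for _ in range(2)]  # R[p][i, j]
last = [0, 0]
joint_counts = np.zeros((n_actions, n_actions))

for t in range(T):
    actions = []
    for p in range(2):
        i = last[p]
        probs = np.maximum(regret[p][i] / (t + 1), 0.0) / mu  # switch prob. to each j
        probs[i] = 0.0
        probs[i] = 1.0 - probs.sum()                           # stay with the rest
        actions.append(rng.choice(n_actions, p=probs))
    a, b = actions
    joint_counts[a, b] += 1
    # update conditional regrets: "had I played j whenever I actually played i"
    opp = [b, a]
    for p in range(2):
        i = actions[p]
        for j in range(n_actions):
            regret[p][i][j] += U[p][j, opp[p]] - U[p][i, opp[p]]
    last = actions

print("empirical joint distribution:\n", joint_counts / T)  # approx. a correlated eq.
```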
arXiv Detail & Related papers (2021-04-04T02:26:26Z)
- Follow-the-Regularized-Leader Routes to Chaos in Routing Games [23.497377573947382]
We study the emergence of chaotic behavior of Follow-the-Regularized Leader (FoReL) dynamics in games.
We show the existence of novel non-standard phenomena such as the coexistence of stable Nash equilibria and chaos in the same game.
Although FoReL dynamics can be strange and non-equilibrating, we prove that the time average still converges to an exact equilibrium for any choice of learning rate and any scale of costs.
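A minimal sketch of that time-average phenomenon, under assumptions of convenience (a nonatomic two-link routing game with linear costs c1(x) = 2x and c2(1 - x) = 1 - x, so the equilibrium split is x* = 1/3, and exponential weights as the FoReL instance): with a large learning rate the day-to-day flow oscillates, yet the running average approaches the equilibrium split.
```python
# Minimal sketch, not the paper's construction: a two-link routing game with
# linear costs, updated by exponential weights (FoReL with an entropic
# regularizer). The day-to-day flow oscillates for large step sizes, but the
# time-averaged flow approaches the equilibrium split x* = 1/3.
import numpy as np

eta = 8.0          # deliberately large step size: day-to-day play oscillates
T = 20000
w = np.zeros(2)    # cumulative (negative) costs of the two links
x = 0.5            # fraction of traffic on link 1
x_hist = []

for t in range(T):
    costs = np.array([2.0 * x, 1.0 - x])   # load-dependent link costs
    w -= eta * costs                        # FoReL: accumulate (negated) costs
    expw = np.exp(w - w.max())              # softmax, numerically stable
    x = (expw / expw.sum())[0]              # tomorrow's split of traffic
    x_hist.append(x)

x_hist = np.array(x_hist)
print("last-day flow on link 1:    ", x_hist[-1])
print("time-averaged flow on link 1:", x_hist.mean(), "(equilibrium: 1/3)")
```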
arXiv Detail & Related papers (2021-02-16T06:40:31Z)
- Hindsight and Sequential Rationality of Correlated Play [18.176128899338433]
We look at algorithms that ensure strong performance in hindsight relative to what could have been achieved with modified behavior.
We develop and advocate for this hindsight framing of learning in general sequential decision-making settings.
We present examples illustrating the distinct strengths and weaknesses of each type of equilibrium in the literature.
arXiv Detail & Related papers (2020-12-10T18:30:21Z)
- No-regret learning and mixed Nash equilibria: They do not mix [64.37511607254115]
We study the dynamics of "follow-the-regularized-leader" (FTRL) in games.
We show that any Nash equilibrium which is not strict cannot be stable and attracting under FTRL.
This result has significant implications for predicting the outcome of a learning process.
arXiv Detail & Related papers (2020-10-19T13:49:06Z)
- Chaos, Extremism and Optimism: Volume Analysis of Learning in Games [55.24050445142637]
We present volume analyses of Multiplicative Weights Updates (MWU) and Optimistic Multiplicative Weights Updates (OMWU) in zero-sum as well as coordination games.
We show that OMWU contracts volume, providing an alternative understanding for its known convergent behavior.
We also prove a no-free-lunch type of theorem, in the sense that when examining coordination games the roles are reversed: OMWU expands volume exponentially fast, whereas MWU contracts.
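The contrast can be seen in a short run (an illustrative side-by-side only, not the paper's volume analysis; the step size and game are assumptions): MWU's day-to-day play in matching pennies drifts away from the interior equilibrium, while optimistic MWU's last iterate approaches it.
```python
# Hedged side-by-side sketch: MWU vs. optimistic MWU (OMWU) in matching
# pennies. MWU's day-to-day play drifts away from the interior equilibrium
# (0.5, 0.5); OMWU's last iterate approaches it.
import numpy as np

A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])   # row player's payoffs; column player gets -A
eta, T = 0.1, 5000

def mwu_step(p, u):
    p = p * np.exp(eta * u)
    return p / p.sum()

# --- plain MWU ---
x = np.array([0.6, 0.4]); y = np.array([0.6, 0.4])
for t in range(T):
    x, y = mwu_step(x, A @ y), mwu_step(y, -A.T @ x)   # simultaneous update
print("MWU  last iterate:", x, y)   # cycles with growing amplitude, away from (0.5, 0.5)

# --- optimistic MWU: use 2*u_t - u_{t-1} as the payoff signal ---
x = np.array([0.6, 0.4]); y = np.array([0.6, 0.4])
ux_prev, uy_prev = A @ y, -A.T @ x
for t in range(T):
    ux, uy = A @ y, -A.T @ x
    x, y = mwu_step(x, 2 * ux - ux_prev), mwu_step(y, 2 * uy - uy_prev)
    ux_prev, uy_prev = ux, uy
print("OMWU last iterate:", x, y)   # approaches (0.5, 0.5)
```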
arXiv Detail & Related papers (2020-05-28T13:47:09Z)
- No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium [76.78447814623665]
We give the first uncoupled no-regret dynamics that converge to correlated equilibria in normal-form games.
We introduce a notion of trigger regret in extensive-form games, which extends that of internal regret in normal-form games.
Our algorithm decomposes trigger regret into local subproblems at each decision point for the player, and constructs a global strategy of the player from the local solutions.
arXiv Detail & Related papers (2020-04-01T17:39:00Z)