Evolutionary Game Theory Squared: Evolving Agents in Endogenously
Evolving Zero-Sum Games
- URL: http://arxiv.org/abs/2012.08382v1
- Date: Tue, 15 Dec 2020 15:54:46 GMT
- Title: Evolutionary Game Theory Squared: Evolving Agents in Endogenously
Evolving Zero-Sum Games
- Authors: Stratis Skoulakis, Tanner Fiez, Ryann Sim, Georgios Piliouras, Lillian
Ratliff
- Abstract summary: This paper introduces and analyze a class of competitive settings where both the agents and the games they play evolve strategically over time.
Populations of agents compete against each other in a zero-sum competition that itself evolves adversarially to the current population mixture.
Remarkably, despite the chaotic coevolution of agents and games, we prove that the system exhibits a number of regularities.
- Score: 27.510231246176033
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The predominant paradigm in evolutionary game theory and more generally
online learning in games is based on a clear distinction between a population
of dynamic agents that interact given a fixed, static game. In this paper, we
move away from the artificial divide between dynamic agents and static games,
to introduce and analyze a large class of competitive settings where both the
agents and the games they play evolve strategically over time. We focus on
arguably the most archetypal game-theoretic setting -- zero-sum games (as well
as network generalizations) -- and the most studied evolutionary learning
dynamic -- replicator, the continuous-time analogue of multiplicative weights.
Populations of agents compete against each other in a zero-sum competition that
itself evolves adversarially to the current population mixture. Remarkably,
despite the chaotic coevolution of agents and games, we prove that the system
exhibits a number of regularities. First, the system has conservation laws of
an information-theoretic flavor that couple the behavior of all agents and
games. Secondly, the system is Poincar\'{e} recurrent, with effectively all
possible initializations of agents and games lying on recurrent orbits that
come arbitrarily close to their initial conditions infinitely often. Thirdly,
the time-average agent behavior and utility converge to the Nash equilibrium
values of the time-average game. Finally, we provide a polynomial time
algorithm to efficiently predict this time-average behavior for any such
coevolving network game.
Related papers
- Linear Convergence of Independent Natural Policy Gradient in Games with Entropy Regularization [12.612009339150504]
This work focuses on the entropy-regularized independent natural policy gradient (NPG) algorithm in multi-agent reinforcement learning.
We show that, under sufficient entropy regularization, the dynamics of this system converge at a linear rate to the quantal response equilibrium (QRE)
arXiv Detail & Related papers (2024-05-04T22:48:53Z) - Neural Population Learning beyond Symmetric Zero-sum Games [52.20454809055356]
We introduce NeuPL-JPSRO, a neural population learning algorithm that benefits from transfer learning of skills and converges to a Coarse Correlated (CCE) of the game.
Our work shows that equilibrium convergent population learning can be implemented at scale and in generality.
arXiv Detail & Related papers (2024-01-10T12:56:24Z) - On the Convergence of No-Regret Learning Dynamics in Time-Varying Games [89.96815099996132]
We characterize the convergence of optimistic gradient descent (OGD) in time-varying games.
Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games.
We also provide new insights on dynamic regret guarantees in static games.
arXiv Detail & Related papers (2023-01-26T17:25:45Z) - Finding mixed-strategy equilibria of continuous-action games without
gradients using randomized policy networks [83.28949556413717]
We study the problem of computing an approximate Nash equilibrium of continuous-action game without access to gradients.
We model players' strategies using artificial neural networks.
This paper is the first to solve general continuous-action games with unrestricted mixed strategies and without any gradient information.
arXiv Detail & Related papers (2022-11-29T05:16:41Z) - How Bad is Selfish Driving? Bounding the Inefficiency of Equilibria in
Urban Driving Games [64.71476526716668]
We study the (in)efficiency of any equilibrium players might agree to play.
We obtain guarantees that refine existing bounds on the Price of Anarchy.
Although the obtained guarantees concern open-loop trajectories, we observe efficient equilibria even when agents employ closed-loop policies.
arXiv Detail & Related papers (2022-10-24T09:32:40Z) - Independent Learning in Stochastic Games [16.505046191280634]
We present the model of games for multi-agent learning in dynamic environments.
We focus on the development of simple and independent learning dynamics for games.
We present our recently proposed simple and independent learning dynamics that guarantee convergence in zero-sum games.
arXiv Detail & Related papers (2021-11-23T09:27:20Z) - Towards convergence to Nash equilibria in two-team zero-sum games [17.4461045395989]
Two-team zero-sum games are defined as multi-player games where players are split into two competing sets of agents.
We focus on the solution concept of Nash equilibria (NE)
We show that computing NE for this class of games is $textithard$ for the complexity class $mathrm$.
arXiv Detail & Related papers (2021-11-07T21:15:35Z) - Hindsight and Sequential Rationality of Correlated Play [18.176128899338433]
We look at algorithms that ensure strong performance in hindsight relative to what could have been achieved with modified behavior.
We develop and advocate for this hindsight framing of learning in general sequential decision-making settings.
We present examples illustrating the distinct strengths and weaknesses of each type of equilibrium in the literature.
arXiv Detail & Related papers (2020-12-10T18:30:21Z) - Chaos, Extremism and Optimism: Volume Analysis of Learning in Games [55.24050445142637]
We present volume analyses of Multiplicative Weights Updates (MWU) and Optimistic Multiplicative Weights Updates (OMWU) in zero-sum as well as coordination games.
We show that OMWU contracts volume, providing an alternative understanding for its known convergent behavior.
We also prove a no-free-lunch type of theorem, in the sense that when examining coordination games the roles are reversed: OMWU expands volume exponentially fast, whereas MWU contracts.
arXiv Detail & Related papers (2020-05-28T13:47:09Z) - No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium [76.78447814623665]
We give the first uncoupled no-regret dynamics that converge to correlated equilibria in normal-form games.
We introduce a notion of trigger regret in extensive-form games, which extends that of internal regret in normal-form games.
Our algorithm decomposes trigger regret into local subproblems at each decision point for the player, and constructs a global strategy of the player from the local solutions.
arXiv Detail & Related papers (2020-04-01T17:39:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.