Games played by Exponential Weights Algorithms
- URL: http://arxiv.org/abs/2407.06676v1
- Date: Tue, 9 Jul 2024 08:49:51 GMT
- Title: Games played by Exponential Weights Algorithms
- Authors: Maurizio d'Andrea, Fabien Gensbittel, Jérôme Renault,
- Abstract summary: We consider a repeated interaction in discrete time, where each player uses an exponential weights algorithm characterized by an initial mixed action and a fixed learning rate.
We show that whenever a strict Nash equilibrium exists, the probability to play a strict Nash equilibrium at the next stage converges almost surely to 0 or 1.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies the last-iterate convergence properties of the exponential weights algorithm with constant learning rates. We consider a repeated interaction in discrete time, where each player uses an exponential weights algorithm characterized by an initial mixed action and a fixed learning rate, so that the mixed action profile $p^t$ played at stage $t$ follows an homogeneous Markov chain. At first, we show that whenever a strict Nash equilibrium exists, the probability to play a strict Nash equilibrium at the next stage converges almost surely to 0 or 1. Secondly, we show that the limit of $p^t$, whenever it exists, belongs to the set of ``Nash Equilibria with Equalizing Payoffs''. Thirdly, we show that in strong coordination games, where the payoff of a player is positive on the diagonal and 0 elsewhere, $p^t$ converges almost surely to one of the strict Nash equilibria. We conclude with open questions.
Related papers
- Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms [71.73971094342349]
Self-play via online learning is one of the premier ways to solve large-scale two-player zero-sum games.
We show that OMWU offers several advantages including logarithmic dependence on the size of the payoff matrix.
We prove that a broad class of algorithms that do not forget the past quickly all suffer the same issue.
arXiv Detail & Related papers (2024-06-15T13:26:17Z) - Optimistic Policy Gradient in Multi-Player Markov Games with a Single
Controller: Convergence Beyond the Minty Property [89.96815099996132]
We develop a new framework to characterize optimistic policy gradient methods in multi-player games with a single controller.
Our approach relies on a natural generalization of the classical Minty property that we introduce, which we anticipate to have further applications beyond Markov games.
arXiv Detail & Related papers (2023-12-19T11:34:10Z) - Learning Nash Equilibria in Zero-Sum Markov Games: A Single Time-scale Algorithm Under Weak Reachability [11.793922711718645]
We consider decentralized learning for zero-sum games, where players only see their information and are to actions and payoffs of the opponent.
Previous works demonstrated convergence to a Nash equilibrium in this setting using double time-scale algorithms under strong reachability assumptions.
Our contribution is a rational and convergent algorithm, utilizing Tsallis-entropy regularization in a value-iteration-based algorithm.
arXiv Detail & Related papers (2023-12-13T09:31:30Z) - Smooth Nash Equilibria: Algorithms and Complexity [38.08108978808664]
A fundamental shortcoming of the concept of Nash equilibrium is its computational intractability.
In a $sigma$-smooth Nash equilibrium, players only need to achieve utility at least as high as their best deviation to a $sigma$-smooth strategy.
We show that both weak and strong $sigma$-smooth Nash equilibria have superior computational properties to Nash equilibria.
arXiv Detail & Related papers (2023-09-21T16:22:07Z) - Global Nash Equilibrium in Non-convex Multi-player Game: Theory and
Algorithms [66.8634598612777]
We show that Nash equilibrium (NE) is acceptable to all players in a multi-player game.
We also show that no one can benefit unilaterally from the general theory step by step.
arXiv Detail & Related papers (2023-01-19T11:36:50Z) - $O(T^{-1})$ Convergence of Optimistic-Follow-the-Regularized-Leader in
Two-Player Zero-Sum Markov Games [10.915751393257011]
We find an $O(T-1)$-approximate Nash equilibrium in $T$ for two-player zero-sum Markov games with full information.
This improves the $tildeO(T-5/6)$ convergence rate recently shown in the paper Zhang et al (2022).
arXiv Detail & Related papers (2022-09-26T05:35:44Z) - Learning Correlated Equilibria in Mean-Field Games [62.14589406821103]
We develop the concepts of Mean-Field correlated and coarse-correlated equilibria.
We show that they can be efficiently learnt in emphall games, without requiring any additional assumption on the structure of the game.
arXiv Detail & Related papers (2022-08-22T08:31:46Z) - Nash, Conley, and Computation: Impossibility and Incompleteness in Game
Dynamics [28.815822236291392]
We show that no game dynamics can converge to $epsilon$-Nash equilibria.
We also prove a stronger result for $epsilon$-approximate Nash equilibria.
arXiv Detail & Related papers (2022-03-26T18:27:40Z) - Towards convergence to Nash equilibria in two-team zero-sum games [17.4461045395989]
Two-team zero-sum games are defined as multi-player games where players are split into two competing sets of agents.
We focus on the solution concept of Nash equilibria (NE)
We show that computing NE for this class of games is $textithard$ for the complexity class $mathrm$.
arXiv Detail & Related papers (2021-11-07T21:15:35Z) - Almost Optimal Algorithms for Two-player Markov Games with Linear
Function Approximation [92.99933928528797]
We study reinforcement learning for two-player zero-sum Markov games with simultaneous moves.
We propose an algorithm Nash-UCRL-VTR based on the principle "Optimism-in-Face-of-Uncertainty"
We show that Nash-UCRL-VTR can provably achieve an $tildeO(dHsqrtT)$ regret, where $d$ is the linear function dimension.
arXiv Detail & Related papers (2021-02-15T09:09:16Z) - Learning Zero-Sum Simultaneous-Move Markov Games Using Function
Approximation and Correlated Equilibrium [116.56359444619441]
We develop provably efficient reinforcement learning algorithms for two-player zero-sum finite-horizon Markov games.
In the offline setting, we control both players and aim to find the Nash Equilibrium by minimizing the duality gap.
In the online setting, we control a single player playing against an arbitrary opponent and aim to minimize the regret.
arXiv Detail & Related papers (2020-02-17T17:04:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.