Related papers: Finding mixed-strategy equilibria of continuous-action games without gradients using randomized policy networks

Finding mixed-strategy equilibria of continuous-action games without gradients using randomized policy networks

URL: http://arxiv.org/abs/2211.15936v1
Date: Tue, 29 Nov 2022 05:16:41 GMT
Title: Finding mixed-strategy equilibria of continuous-action games without gradients using randomized policy networks
Authors: Carlos Martin, Tuomas Sandholm
Abstract summary: We study the problem of computing an approximate Nash equilibrium of continuous-action game without access to gradients. We model players' strategies using artificial neural networks. This paper is the first to solve general continuous-action games with unrestricted mixed strategies and without any gradient information.
Score: 83.28949556413717
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We study the problem of computing an approximate Nash equilibrium of continuous-action game without access to gradients. Such game access is common in reinforcement learning settings, where the environment is typically treated as a black box. To tackle this problem, we apply zeroth-order optimization techniques that combine smoothed gradient estimators with equilibrium-finding dynamics. We model players' strategies using artificial neural networks. In particular, we use randomized policy networks to model mixed strategies. These take noise in addition to an observation as input and can flexibly represent arbitrary observation-dependent, continuous-action distributions. Being able to model such mixed strategies is crucial for tackling continuous-action games that lack pure-strategy equilibria. We evaluate the performance of our method using an approximation of the Nash convergence metric from game theory, which measures how much players can benefit from unilaterally changing their strategy. We apply our method to continuous Colonel Blotto games, single-item and multi-item auctions, and a visibility game. The experiments show that our method can quickly find high-quality approximate equilibria. Furthermore, they show that the dimensionality of the input noise is crucial for performance. To our knowledge, this paper is the first to solve general continuous-action games with unrestricted mixed strategies and without any gradient information.

Related papers

Meta-Learning in Self-Play Regret Minimization [10.843705580746397]
We present a general approach to online optimization which plays a crucial role in many algorithms for approximating Nash equilibria in two-player zero-sum games. We build upon this, extending the framework to the more challenging self-play setting, which is the basis for most state-of-the-art equilibrium approximation algorithms. Our meta-learned algorithms considerably outperform other state-of-the-art regret minimization algorithms.
arXiv Detail & Related papers (2025-04-26T13:27:24Z)
Ranking Joint Policies in Dynamic Games using Evolutionary Dynamics [0.0]
It has been shown that the dynamics of agents' interactions, even in simple two-player games, are incapable of reaching Nash equilibria. Our goal is to identify agents' joint strategies that result in stable behavior, being resistant to changes, while also accounting for agents' payoffs.
arXiv Detail & Related papers (2025-02-20T16:50:38Z)
Neural Population Learning beyond Symmetric Zero-sum Games [52.20454809055356]
We introduce NeuPL-JPSRO, a neural population learning algorithm that benefits from transfer learning of skills and converges to a Coarse Correlated (CCE) of the game. Our work shows that equilibrium convergent population learning can be implemented at scale and in generality.
arXiv Detail & Related papers (2024-01-10T12:56:24Z)
The equivalence of dynamic and strategic stability under regularized learning in games [33.74394172275373]
We examine the long-run behavior of regularized, no-regret learning in finite games. We obtain an equivalence between strategic and dynamic stability. We show that methods based on entropic regularization converge at a geometric rate.
arXiv Detail & Related papers (2023-11-04T14:07:33Z)
On the Convergence of No-Regret Learning Dynamics in Time-Varying Games [89.96815099996132]
We characterize the convergence of optimistic gradient descent (OGD) in time-varying games. Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games. We also provide new insights on dynamic regret guarantees in static games.
arXiv Detail & Related papers (2023-01-26T17:25:45Z)
ApproxED: Approximate exploitability descent via learned best responses [61.17702187957206]
We study the problem of finding an approximate Nash equilibrium of games with continuous action sets. We propose two new methods that minimize an approximation of exploitability with respect to the strategy profile.
arXiv Detail & Related papers (2023-01-20T23:55:30Z)
Opponent Modeling in Multiplayer Imperfect-Information Games [1.024113475677323]
We present an approach for opponent modeling in multiplayer imperfect-information games. We run experiments against a variety of real opponents and exact Nash equilibrium strategies in three-player Kuhn poker. Our algorithm significantly outperforms all of the agents, including the exact Nash equilibrium strategies.
arXiv Detail & Related papers (2022-12-12T16:48:53Z)
An Exponentially Converging Particle Method for the Mixed Nash Equilibrium of Continuous Games [0.0]
We consider the problem of computing mixed Nash equilibria of two-player zero-sum games with continuous sets of pure strategies and with first-order access to the payoff function. This problem arises for example in game-inspired machine learning applications, such as distributionally-robust learning. We introduce and analyze a particle-based method that enjoys guaranteed local convergence for this problem.
arXiv Detail & Related papers (2022-11-02T17:03:40Z)
A unified stochastic approximation framework for learning in games [82.74514886461257]
We develop a flexible approximation framework for analyzing the long-run behavior of learning in games (both continuous and finite) The proposed analysis template incorporates a wide array of popular learning algorithms, including gradient-based methods, exponential/multiplicative weights for learning in finite games, optimistic and bandit variants of the above, etc.
arXiv Detail & Related papers (2022-06-08T14:30:38Z)
Multiplayer Performative Prediction: Learning in Decision-Dependent Games [18.386569111954213]
This paper formulates a new game theoretic framework for multi-player performative prediction. We focus on two distinct solution concepts, namely (i) performatively stable equilibria and (ii) Nash equilibria of the game. We show that under mild assumptions, the performatively stable equilibria can be found efficiently by a variety of algorithms.
arXiv Detail & Related papers (2022-01-10T15:31:10Z)
Efficient Competitive Self-Play Policy Optimization [20.023522000925094]
We propose a new algorithmic framework for competitive self-play reinforcement learning in two-player zero-sum games. Our method trains several agents simultaneously, and intelligently takes each other as opponent based on simple adversarial rules. We prove theoretically that our algorithm converges to an approximate equilibrium with high probability in convex-concave games.
arXiv Detail & Related papers (2020-09-13T21:01:38Z)
Efficient exploration of zero-sum stochastic games [83.28949556413717]
We investigate the increasingly important and common game-solving setting where we do not have an explicit description of the game but only oracle access to it through gameplay. During a limited-duration learning phase, the algorithm can control the actions of both players in order to try to learn the game and how to play it well. Our motivation is to quickly learn strategies that have low exploitability in situations where evaluating the payoffs of a queried strategy profile is costly.
arXiv Detail & Related papers (2020-02-24T20:30:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.