Ranking Joint Policies in Dynamic Games using Evolutionary Dynamics
- URL: http://arxiv.org/abs/2502.14724v1
- Date: Thu, 20 Feb 2025 16:50:38 GMT
- Title: Ranking Joint Policies in Dynamic Games using Evolutionary Dynamics
- Authors: Natalia Koliou, George Vouros,
- Abstract summary: It has been shown that the dynamics of agents' interactions, even in simple two-player games, are incapable of reaching Nash equilibria.
Our goal is to identify agents' joint strategies that result in stable behavior, being resistant to changes, while also accounting for agents' payoffs.
- Score: 0.0
- License:
- Abstract: Game-theoretic solution concepts, such as the Nash equilibrium, have been key to finding stable joint actions in multi-player games. However, it has been shown that the dynamics of agents' interactions, even in simple two-player games with few strategies, are incapable of reaching Nash equilibria, exhibiting complex and unpredictable behavior. Instead, evolutionary approaches can describe the long-term persistence of strategies and filter out transient ones, accounting for the long-term dynamics of agents' interactions. Our goal is to identify agents' joint strategies that result in stable behavior, being resistant to changes, while also accounting for agents' payoffs, in dynamic games. Towards this goal, and building on previous results, this paper proposes transforming dynamic games into their empirical forms by considering agents' strategies instead of agents' actions, and applying the evolutionary methodology $\alpha$-Rank to evaluate and rank strategy profiles according to their long-term dynamics. This methodology not only allows us to identify joint strategies that are strong through agents' long-term interactions, but also provides a descriptive, transparent framework regarding the high ranking of these strategies. Experiments report on agents that aim to collaboratively solve a stochastic version of the graph coloring problem. We consider different styles of play as strategies to define the empirical game, and train policies realizing these strategies, using the DQN algorithm. Then we run simulations to generate the payoff matrix required by $\alpha$-Rank to rank joint strategies.
Related papers
- Preference-based opponent shaping in differentiable games [3.373994463906893]
We propose a novel Preference-based Opponent Shaping (PBOS) method to enhance the strategy learning process by shaping agents' preferences towards cooperation.
We verify the performance of PBOS algorithm in a variety of differentiable games.
arXiv Detail & Related papers (2024-12-04T06:49:21Z) - Paths to Equilibrium in Games [6.812247730094933]
We study sequences of strategies satisfying a pairwise constraint inspired by policy updating in reinforcement learning.
Our analysis reveals a counterintuitive insight that reward deteriorating strategic updates are key to driving play to equilibrium along a satisficing path.
arXiv Detail & Related papers (2024-03-26T19:58:39Z) - Fast Peer Adaptation with Context-aware Exploration [63.08444527039578]
We propose a peer identification reward for learning agents in multi-agent games.
This reward motivates the agent to learn a context-aware policy for effective exploration and fast adaptation.
We evaluate our method on diverse testbeds that involve competitive (Kuhn Poker), cooperative (PO-Overcooked), or mixed (Predator-Prey-W) games with peer agents.
arXiv Detail & Related papers (2024-02-04T13:02:27Z) - Opponent Modeling in Multiplayer Imperfect-Information Games [1.024113475677323]
We present an approach for opponent modeling in multiplayer imperfect-information games.
We run experiments against a variety of real opponents and exact Nash equilibrium strategies in three-player Kuhn poker.
Our algorithm significantly outperforms all of the agents, including the exact Nash equilibrium strategies.
arXiv Detail & Related papers (2022-12-12T16:48:53Z) - Finding mixed-strategy equilibria of continuous-action games without
gradients using randomized policy networks [83.28949556413717]
We study the problem of computing an approximate Nash equilibrium of continuous-action game without access to gradients.
We model players' strategies using artificial neural networks.
This paper is the first to solve general continuous-action games with unrestricted mixed strategies and without any gradient information.
arXiv Detail & Related papers (2022-11-29T05:16:41Z) - Game Theoretic Rating in N-player general-sum games with Equilibria [26.166859475522106]
We propose novel algorithms suitable for N-player, general-sum rating of strategies in normal-form games according to the payoff rating system.
This enables well-established solution concepts, such as equilibria, to be leveraged to efficiently rate strategies in games with complex strategic interactions.
arXiv Detail & Related papers (2022-10-05T12:33:03Z) - Conditional Imitation Learning for Multi-Agent Games [89.897635970366]
We study the problem of conditional multi-agent imitation learning, where we have access to joint trajectory demonstrations at training time.
We propose a novel approach to address the difficulties of scalability and data scarcity.
Our model learns a low-rank subspace over ego and partner agent strategies, then infers and adapts to a new partner strategy by interpolating in the subspace.
arXiv Detail & Related papers (2022-01-05T04:40:13Z) - Pick Your Battles: Interaction Graphs as Population-Level Objectives for
Strategic Diversity [49.68758494467258]
We study how to construct diverse populations of agents by carefully structuring how individuals within a population interact.
Our approach is based on interaction graphs, which control the flow of information between agents during training.
We provide evidence for the importance of diversity in multi-agent training and analyse the effect of applying different interaction graphs on the training trajectories, diversity and performance of populations in a range of games.
arXiv Detail & Related papers (2021-10-08T11:29:52Z) - Policy Fusion for Adaptive and Customizable Reinforcement Learning
Agents [137.86426963572214]
We show how to combine distinct behavioral policies to obtain a meaningful "fusion" policy.
We propose four different policy fusion methods for combining pre-trained policies.
We provide several practical examples and use-cases for how these methods are indeed useful for video game production and designers.
arXiv Detail & Related papers (2021-04-21T16:08:44Z) - Moody Learners -- Explaining Competitive Behaviour of Reinforcement
Learning Agents [65.2200847818153]
In a competitive scenario, the agent does not only have a dynamic environment but also is directly affected by the opponents' actions.
Observing the Q-values of the agent is usually a way of explaining its behavior, however, do not show the temporal-relation between the selected actions.
arXiv Detail & Related papers (2020-07-30T11:30:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.