Related papers: Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers

Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers

URL: http://arxiv.org/abs/2106.09435v3
Date: Thu, 18 Apr 2024 10:41:49 GMT
Title: Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers
Authors: Luke Marris, Paul Muller, Marc Lanctot, Karl Tuyls, Thore Graepel,
Abstract summary: We propose an algorithm for training agents in n-player, general-sum extensive form games, which provably converges to an equilibrium. We also suggest correlated equilibria (CE) as promising meta-solvers, and propose a novel solution concept Gini Correlated Equilibrium (MGCE) We conduct several experiments using CE meta-solvers for JPSRO and demonstrate convergence on n-player, general-sum games.
Score: 21.462231105582347
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Two-player, constant-sum games are well studied in the literature, but there has been limited progress outside of this setting. We propose Joint Policy-Space Response Oracles (JPSRO), an algorithm for training agents in n-player, general-sum extensive form games, which provably converges to an equilibrium. We further suggest correlated equilibria (CE) as promising meta-solvers, and propose a novel solution concept Maximum Gini Correlated Equilibrium (MGCE), a principled and computationally efficient family of solutions for solving the correlated equilibrium selection problem. We conduct several experiments using CE meta-solvers for JPSRO and demonstrate convergence on n-player, general-sum games.

Related papers

Explore Reinforced: Equilibrium Approximation with Reinforcement Learning [3.214961078500366]
We introduce Exp3-IXrl - a blend of RL and game-theoretic approach, separating the RL agent's action selection from the equilibrium. We demonstrate that our algorithm expands the application of equilibrium approximation algorithms to new environments.
arXiv Detail & Related papers (2024-12-02T22:37:59Z)
Neural Population Learning beyond Symmetric Zero-sum Games [52.20454809055356]
We introduce NeuPL-JPSRO, a neural population learning algorithm that benefits from transfer learning of skills and converges to a Coarse Correlated (CCE) of the game. Our work shows that equilibrium convergent population learning can be implemented at scale and in generality.
arXiv Detail & Related papers (2024-01-10T12:56:24Z)
Optimistic Policy Gradient in Multi-Player Markov Games with a Single Controller: Convergence Beyond the Minty Property [89.96815099996132]
We develop a new framework to characterize optimistic policy gradient methods in multi-player games with a single controller. Our approach relies on a natural generalization of the classical Minty property that we introduce, which we anticipate to have further applications beyond Markov games.
arXiv Detail & Related papers (2023-12-19T11:34:10Z)
Differentiable Arbitrating in Zero-sum Markov Games [59.62061049680365]
We study how to perturb the reward in a zero-sum Markov game with two players to induce a desirable Nash equilibrium, namely arbitrating. The lower level requires solving the Nash equilibrium under a given reward function, which makes the overall problem challenging to optimize in an end-to-end way. We propose a backpropagation scheme that differentiates through the Nash equilibrium, which provides the gradient feedback for the upper level.
arXiv Detail & Related papers (2023-02-20T16:05:04Z)
Safe Subgame Resolving for Extensive Form Correlated Equilibrium [47.155175336085364]
Correlated Equilibrium is a solution concept that is more general than Nash Equilibrium (NE) and can lead to better social welfare. We apply textitsubgame resolving, a technique extremely successful in finding NE in zero-sum games to solving general-sum EFCEs. Subgame resolving refines a correlation plan in an textitonline manner: instead of solving for the full game upfront, it only solves for strategies in subgames that are reached in actual play.
arXiv Detail & Related papers (2022-12-29T14:20:48Z)
Turbocharging Solution Concepts: Solving NEs, CEs and CCEs with Neural Equilibrium Solvers [22.85979978964773]
Solution concepts such as Nash Equilibria, Correlated Equilibria, and Coarse Correlated Equilibria are useful components for many multiagent machine learning algorithms. We introduce the Neural Equilibrium solver which utilizes a special equivariant neural network architecture to approximately solve the space of all games of fixed shape, buying speed and determinism.
arXiv Detail & Related papers (2022-10-17T17:00:31Z)
Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games [63.60117916422867]
This paper focuses on the most basic setting of competitive multi-agent RL, namely two-player zero-sum Markov games. We propose a single-loop policy optimization method with symmetric updates from both agents, where the policy is updated via the entropy-regularized optimistic multiplicative weights update (OMWU) method. Our convergence results improve upon the best known complexities, and lead to a better understanding of policy optimization in competitive Markov games.
arXiv Detail & Related papers (2022-10-03T16:05:43Z)
Pareto Actor-Critic for Equilibrium Selection in Multi-Agent Reinforcement Learning [18.20664209675016]
This work focuses on equilibrium selection in no-conflict multi-agent games. Pareto Actor-Critic (Pareto-AC) is an actor-critic algorithm that maximises the returns of all agents.
arXiv Detail & Related papers (2022-09-28T18:14:34Z)
Optimal Correlated Equilibria in General-Sum Extensive-Form Games: Fixed-Parameter Algorithms, Hardness, and Two-Sided Column-Generation [78.48747645545944]
We study the problem of finding optimal correlated equilibria of various sorts in extensive-form games. We introduce a new algorithm for computing optimal equilibria in all three notions.
arXiv Detail & Related papers (2022-03-14T15:21:18Z)
Exploration-Exploitation in Multi-Agent Competition: Convergence with Bounded Rationality [21.94743452608215]
We study smooth Q-learning, a prototypical learning model that captures the balance between game rewards and exploration costs. We show that Q-learning always converges to the unique quantal-response equilibrium (QRE), the standard solution concept for games under bounded rationality.
arXiv Detail & Related papers (2021-06-24T11:43:38Z)
Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games [78.65798135008419]
It remains vastly open how to learn the Stackelberg equilibrium in general-sum games efficiently from samples. This paper initiates the theoretical study of sample-efficient learning of the Stackelberg equilibrium in two-player turn-based general-sum games.
arXiv Detail & Related papers (2021-02-23T05:11:07Z)
Hindsight and Sequential Rationality of Correlated Play [18.176128899338433]
We look at algorithms that ensure strong performance in hindsight relative to what could have been achieved with modified behavior. We develop and advocate for this hindsight framing of learning in general sequential decision-making settings. We present examples illustrating the distinct strengths and weaknesses of each type of equilibrium in the literature.
arXiv Detail & Related papers (2020-12-10T18:30:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.