Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games
- URL: http://arxiv.org/abs/2102.11494v1
- Date: Tue, 23 Feb 2021 05:11:07 GMT
- Title: Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games
- Authors: Yu Bai, Chi Jin, Huan Wang, Caiming Xiong
- Abstract summary: It remains largely open how to learn the Stackelberg equilibrium in general-sum games efficiently from samples.
This paper initiates the theoretical study of sample-efficient learning of the Stackelberg equilibrium in two-player turn-based general-sum games.
- Score: 78.65798135008419
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-world applications such as economics and policy making often involve
solving multi-agent games with two unique features: (1) the agents are
inherently asymmetric and partitioned into leaders and followers; (2) the
agents have different reward functions, so the game is general-sum. The
majority of existing results in this field focus on either symmetric solution
concepts (e.g., Nash equilibrium) or zero-sum games. It remains largely open how
to learn the Stackelberg equilibrium, an asymmetric analog of the Nash
equilibrium, efficiently from samples in general-sum games.
This paper initiates the theoretical study of sample-efficient learning of
the Stackelberg equilibrium in two-player turn-based general-sum games. We
identify a fundamental gap between the exact value of the Stackelberg
equilibrium and its estimated version using finite samples, a gap that cannot
be closed information-theoretically regardless of the algorithm. We then
establish a positive result: the Stackelberg equilibrium can be learned
sample-efficiently with value optimal up to this gap, and we show that the
sample complexity is tight, with matching upper and lower bounds. Finally, we extend
our learning results to the setting where the follower plays in a Markov
Decision Process (MDP), and the setting where the leader and the follower act
simultaneously.
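
To make the solution concept concrete, here is a minimal numpy sketch, with purely hypothetical payoff matrices, of the exact Stackelberg value in a two-player turn-based game: the leader commits to an action, the follower observes it and best-responds according to its own payoffs, and the leader maximizes over commitments (optimistic tie-breaking).

```python
import numpy as np

# Hypothetical 3x3 turn-based game (illustrative only): the leader picks a
# row, the follower observes it and picks a column.
R_leader = np.array([[1.0, 0.0, 0.2],
                     [0.5, 0.9, 0.1],
                     [0.3, 0.4, 0.8]])
R_follower = np.array([[0.2, 0.7, 0.1],
                       [0.6, 0.3, 0.5],
                       [0.4, 0.4, 0.9]])

def stackelberg_value(R_l, R_f):
    """Exact Stackelberg value: for each leader commitment, the follower
    best-responds to its own payoffs; ties are broken in the leader's favor
    (the optimistic convention)."""
    values = []
    for a in range(R_l.shape[0]):
        best_responses = np.flatnonzero(np.isclose(R_f[a], R_f[a].max()))
        values.append(R_l[a, best_responses].max())
    return max(values)

print(stackelberg_value(R_leader, R_follower))  # leader's optimal commitment value
```

With only finite samples of the follower's rewards, rows where the follower is nearly indifferent can have their argmax flipped by estimation noise; this is the intuition behind the unclosable gap the paper identifies.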
Related papers
- Bayes correlated equilibria and no-regret dynamics [9.89901717499058]
This paper explores equilibrium concepts for Bayesian games, which are fundamental models of games with incomplete information.
We focus on communication equilibria, which can be realized by a mediator who gathers each player's private information and then sends correlated recommendations to the players.
We present an efficient algorithm for minimizing untruthful swap regret with a sublinear upper bound, which we prove to be tight up to a multiplicative constant.
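For reference, swap regret measures how much a player could have gained by applying the best fixed remapping of its own actions in hindsight; the paper's untruthful variant additionally accounts for misreported private information, which is not modeled in this minimal sketch (the payoffs and play sequence below are hypothetical).

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_actions = 1000, 3
payoffs = rng.random((T, n_actions))        # u_t(a): payoff of action a at round t
plays = rng.integers(0, n_actions, size=T)  # a_t: the action actually played

def swap_regret(payoffs, plays, n_actions):
    """Max over swap functions phi: A -> A of the gain from replacing every
    play of action a with phi(a); the max decomposes per source action."""
    regret = 0.0
    for a in range(n_actions):
        mask = plays == a
        if mask.any():
            gain = payoffs[mask].sum(axis=0) - payoffs[mask, a].sum()
            regret += gain.max()  # best replacement for a (a itself gives 0)
    return regret

print(swap_regret(payoffs, plays, n_actions))
```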
arXiv Detail & Related papers (2023-04-11T06:22:51Z)
- Differentiable Arbitrating in Zero-sum Markov Games [59.62061049680365]
We study how to perturb the rewards in a two-player zero-sum Markov game to induce a desirable Nash equilibrium, a task we call arbitrating.
This yields a bilevel problem: the lower level requires solving for the Nash equilibrium under a given reward function, which makes the overall problem challenging to optimize end-to-end.
We propose a backpropagation scheme that differentiates through the Nash equilibrium, which provides the gradient feedback for the upper level.
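The paper's scheme differentiates through the Nash equilibrium analytically; as a rough illustrative stand-in (not the paper's method), the sketch below computes the value of a zero-sum matrix game by linear programming and estimates its sensitivity to a reward perturbation by finite differences. The game and perturbation direction are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def game_value(A):
    """Value of the zero-sum matrix game max_x min_y x^T A y via an LP.
    Variables (x_1..x_m, v): minimize -v s.t. (A^T x)_j >= v, x in simplex."""
    m, n = A.shape
    c = np.zeros(m + 1); c[-1] = -1.0                 # minimize -v
    A_ub = np.hstack([-A.T, np.ones((n, 1))])         # v - (A^T x)_j <= 0
    b_ub = np.zeros(n)
    A_eq = np.concatenate([np.ones(m), [0.0]])[None]  # sum_i x_i = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * m + [(None, None)])
    return res.x[-1]

A = np.array([[0.0, 1.0, -1.0],
              [-1.0, 0.0, 1.0],
              [1.0, -1.0, 0.0]])     # rock-paper-scissors (illustrative)
D = np.zeros_like(A); D[0, 1] = 1.0  # direction of a reward perturbation

eps = 1e-4
grad = (game_value(A + eps * D) - game_value(A - eps * D)) / (2 * eps)
print(grad)  # sensitivity of the equilibrium value to this perturbation
```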
arXiv Detail & Related papers (2023-02-20T16:05:04Z)
- Offline Learning in Markov Games with General Function Approximation [22.2472618685325]
We study offline multi-agent reinforcement learning (RL) in Markov games.
We provide the first framework for sample-efficient offline learning in Markov games.
arXiv Detail & Related papers (2023-02-06T05:22:27Z)
- Game-Theoretical Perspectives on Active Equilibria: A Preferred Solution Concept over Nash Equilibria [61.093297204685264]
An effective approach in multi-agent reinforcement learning is to account for the learning processes of other agents and influence their future policies; this motivates the active equilibrium, a new solution concept.
This concept is general in that standard solution concepts, such as the Nash equilibrium, are special cases of active equilibria.
We analyze active equilibria from a game-theoretic perspective by closely studying examples where Nash equilibria are known.
arXiv Detail & Related papers (2022-10-28T14:45:39Z)
- Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z)
- Multiplayer Performative Prediction: Learning in Decision-Dependent Games [18.386569111954213]
This paper formulates a new game-theoretic framework for multi-player performative prediction.
We focus on two distinct solution concepts, namely (i) performatively stable equilibria and (ii) Nash equilibria of the game.
We show that under mild assumptions, the performatively stable equilibria can be found efficiently by a variety of algorithms.
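A performatively stable equilibrium is a profile at which each player's decision is optimal for the data distribution that the joint profile itself induces. As a toy illustration (the quadratic losses and constants below are hypothetical, not from the paper), repeated simultaneous retraining converges to such a fixed point when the decision-dependence is weak.

```python
import numpy as np

# Toy two-player performative setting (illustrative only): player i's loss is
# (x_i - theta_i - eps * mean(x))^2, i.e., the target each player tracks is
# shifted by the joint decisions themselves (performative distribution shift).
theta = np.array([1.0, -0.5])
eps = 0.3  # strength of the decision-dependence; eps < 1 gives a contraction

x = np.zeros(2)
for t in range(100):
    # Repeated retraining: each player best-responds to the distribution
    # induced by the current joint decision profile.
    x = theta + eps * x.mean()

print(x)  # performatively stable profile: x_i = theta_i + eps * mean(x)
```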
arXiv Detail & Related papers (2022-01-10T15:31:10Z)
- Finite-Sample Analysis of Decentralized Q-Learning for Stochastic Games [3.441021278275805]
Learning in games is arguably the most standard and fundamental setting in multi-agent reinforcement learning (MARL).
We establish the finite-sample complexity of fully decentralized Q-learning algorithms in a significant class of general-sum stochastic games (SGs).
We focus on the practical while challenging setting of fully decentralized MARL, where neither the rewards nor the actions of other agents can be observed by each agent.
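A minimal sketch of what "fully decentralized" means here, using a single-state (repeated matrix game) simplification with hypothetical payoffs: each agent runs an independent Q-update from its own reward and never observes the other agent's action or reward.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 2x2 general-sum payoff tables (illustrative only).
R1 = np.array([[1.0, 0.0], [0.0, 0.5]])
R2 = np.array([[0.5, 0.0], [0.0, 1.0]])

Q1, Q2 = np.zeros(2), np.zeros(2)  # each agent's Q over its OWN actions only
alpha, eps = 0.1, 0.1

for t in range(5000):
    # Epsilon-greedy over each agent's own Q-values.
    a1 = rng.integers(2) if rng.random() < eps else int(Q1.argmax())
    a2 = rng.integers(2) if rng.random() < eps else int(Q2.argmax())
    # Fully decentralized: agent i sees only its own reward, not (a_j, r_j).
    # Single state, so there is no next-state bootstrap term.
    Q1[a1] += alpha * (R1[a1, a2] - Q1[a1])
    Q2[a2] += alpha * (R2[a1, a2] - Q2[a2])

print(Q1, Q2)
```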
arXiv Detail & Related papers (2021-12-15T03:33:39Z)
- Learning Equilibria in Matching Markets from Bandit Feedback [139.29934476625488]
We develop a framework and algorithms for learning stable market outcomes under uncertainty.
Our work takes a first step toward elucidating when and how stable matchings arise in large, data-driven marketplaces.
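One illustrative way such a framework can look (this sketch is an assumption, not the paper's algorithm): agents keep UCB estimates of their unknown preferences, and a stable matching is recomputed each round by agent-proposing deferred acceptance; arms' preferences are taken as fixed and known.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, T = 3, 3, 2000
true_means = rng.random((N, K))  # agent i's unknown value for arm k
arm_prefs = np.argsort(-rng.random((K, N)), axis=1)  # arms' fixed rankings

counts = np.ones((N, K))  # one phantom zero-sample each, to simplify UCB
means = np.zeros((N, K))

def deferred_acceptance(agent_rankings):
    """Agent-proposing Gale-Shapley given each agent's ranking over arms."""
    nxt = np.zeros(N, dtype=int)  # next arm each agent will propose to
    match = {}                    # arm -> agent currently held
    free = list(range(N))
    while free:
        i = free.pop()
        k = agent_rankings[i][nxt[i]]; nxt[i] += 1
        if k not in match:
            match[k] = i
        else:
            j = match[k]
            rank = list(arm_prefs[k])
            if rank.index(i) < rank.index(j):  # arm k prefers i over j
                match[k] = i; free.append(j)
            else:
                free.append(i)
    return {i: k for k, i in match.items()}

for t in range(1, T + 1):
    ucb = means + np.sqrt(2 * np.log(t) / counts)   # optimistic estimates
    rankings = np.argsort(-ucb, axis=1)
    for i, k in deferred_acceptance(rankings).items():
        r = true_means[i, k] + 0.1 * rng.standard_normal()  # noisy feedback
        counts[i, k] += 1
        means[i, k] += (r - means[i, k]) / counts[i, k]

print(deferred_acceptance(np.argsort(-means, axis=1)))  # learned stable match
```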
arXiv Detail & Related papers (2021-08-19T17:59:28Z)
- Learning to Compute Approximate Nash Equilibrium for Normal-form Games [15.321036952379488]
We propose a general meta-learning approach to computing an approximate Nash equilibrium for finite $n$-player normal-form games.
Unlike existing solutions that approximate or learn a Nash equilibrium from scratch for each game, our meta-solver directly constructs a mapping from a game utility matrix to a joint strategy profile.
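The meta-solver itself is a learned mapping, but the target it is evaluated against is standard: a joint profile is an approximate Nash equilibrium when no player gains much by a unilateral deviation. A minimal sketch of that check (NashConv/exploitability) on a hypothetical two-player game:

```python
import numpy as np

# Hypothetical payoff tables (illustrative): U[i][a1, a2] is player i's payoff.
U = [np.array([[3.0, 0.0], [5.0, 1.0]]),   # player 1
     np.array([[3.0, 5.0], [0.0, 1.0]])]   # player 2 (a prisoner's dilemma)

def nash_conv(x, y):
    """NashConv: sum over players of the gain from their best unilateral
    deviation; zero exactly at a Nash equilibrium."""
    v1, v2 = x @ U[0] @ y, x @ U[1] @ y
    gain1 = (U[0] @ y).max() - v1   # player 1 deviates to its best pure row
    gain2 = (x @ U[1]).max() - v2   # player 2 deviates to its best pure column
    return gain1 + gain2

print(nash_conv(np.array([0.5, 0.5]), np.array([0.5, 0.5])))  # > 0: not an equilibrium
print(nash_conv(np.array([0.0, 1.0]), np.array([0.0, 1.0])))  # 0 at (defect, defect)
```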
arXiv Detail & Related papers (2021-08-17T07:06:46Z)
- On Information Asymmetry in Competitive Multi-Agent Reinforcement Learning: Convergence and Optimality [78.76529463321374]
We study a system of two interacting, non-cooperative Q-learning agents.
We show that this information asymmetry can lead to a stable outcome of population learning.
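A minimal sketch of one such information asymmetry (the payoffs and exact protocol are hypothetical): agent 2 observes agent 1's action before choosing, so it conditions its Q-values on that action, while agent 1 learns over its own actions only.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical general-sum payoffs (illustrative): rows = agent 1, cols = agent 2.
R1 = np.array([[2.0, 0.0], [0.0, 1.0]])
R2 = np.array([[1.0, 0.0], [0.0, 2.0]])

Q1 = np.zeros(2)        # uninformed agent: Q over its own action only
Q2 = np.zeros((2, 2))   # informed agent: conditions on agent 1's action
alpha, eps = 0.05, 0.1

for t in range(20000):
    a1 = rng.integers(2) if rng.random() < eps else int(Q1.argmax())
    # Information asymmetry: agent 2 observes a1 before choosing.
    a2 = rng.integers(2) if rng.random() < eps else int(Q2[a1].argmax())
    Q1[a1] += alpha * (R1[a1, a2] - Q1[a1])
    Q2[a1, a2] += alpha * (R2[a1, a2] - Q2[a1, a2])

print(Q1, Q2)  # the informed agent learns a best response per observed action
```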
arXiv Detail & Related papers (2020-10-21T11:19:53Z)