Safe Search for Stackelberg Equilibria in Extensive-Form Games
- URL: http://arxiv.org/abs/2102.01775v1
- Date: Tue, 2 Feb 2021 22:01:19 GMT
- Title: Safe Search for Stackelberg Equilibria in Extensive-Form Games
- Authors: Chun Kai Ling, Noam Brown
- Abstract summary: Stackelberg equilibrium is a solution concept in two-player games where the leader has commitment rights over the follower.
We present a theoretically sound and empirically effective way to apply search to the computation of Stackelberg equilibria in general-sum games.
- Score: 24.557177222572786
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Stackelberg equilibrium is a solution concept in two-player games where the
leader has commitment rights over the follower. In recent years, it has become
a cornerstone of many security applications, including airport patrolling and
wildlife poaching prevention. Even though many of these settings are sequential
in nature, existing techniques pre-compute the entire solution ahead of time.
In this paper, we present a theoretically sound and empirically effective way
to apply search, which leverages extra online computation to improve a
solution, to the computation of Stackelberg equilibria in general-sum games.
Instead of the leader attempting to solve the full game upfront, an approximate
"blueprint" solution is first computed offline and is then improved online for
the particular subgames encountered in actual play. We prove that our search
technique is guaranteed to perform no worse than the pre-computed blueprint
strategy, and empirically demonstrate that it enables approximately solving
significantly larger games compared to purely offline methods. We also show
that our search operation may be cast as a smaller Stackelberg problem, making
our method complementary to existing algorithms based on strategy generation.
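The blueprint-then-search loop above lends itself to a short illustration. The following is a minimal sketch, not the paper's implementation: `resolve_subgame` and `leader_value` are hypothetical stand-ins for the online refinement step and a value oracle. The key point is the safety test, which mirrors the paper's no-worse-than-blueprint guarantee by falling back to the blueprint whenever the refinement cannot match its guaranteed value.

```python
# Safe search for a Stackelberg leader (illustrative sketch, hypothetical API).
# blueprint: approximate leader strategy computed offline for the full game.
# resolve_subgame: re-solves the current subgame as a smaller Stackelberg
#   problem (as the abstract notes, the search step is itself a Stackelberg
#   problem); leader_value: guaranteed leader value of a strategy in a subgame.

def safe_resolve(subgame, blueprint, resolve_subgame, leader_value):
    refined = resolve_subgame(subgame, blueprint)  # extra online computation
    # Safety check: keep the refinement only if its guaranteed value in this
    # subgame is at least the blueprint's, so play never gets worse.
    if leader_value(subgame, refined) >= leader_value(subgame, blueprint):
        return refined
    return blueprint
```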
Related papers
- Who Plays First? Optimizing the Order of Play in Stackelberg Games with Many Robots [4.146913555716228]
Branch and Play (B&P) is an efficient and exact algorithm that converges to a socially optimal order of play and its Stackelberg equilibrium.
We demonstrate the practical utility of B&P to coordinate air traffic control, swarm formation, and delivery vehicle fleets.
arXiv Detail & Related papers (2024-02-14T15:34:38Z)
- Game Solving with Online Fine-Tuning [17.614045403579244]
This paper investigates applying online fine-tuning while searching and proposes two methods to learn tailor-designed computations for game solving.
Our experiments show that using online fine-tuning can solve a series of challenging 7x7 Killall-Go problems in only 23.54% of the time required by the baseline.
arXiv Detail & Related papers (2023-11-13T09:09:52Z)
- Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games [70.19141208203227]
We consider the problem of decentralized multi-agent reinforcement learning in Markov games.
We show that no algorithm attains no-regret in general-sum games when executed independently by all players.
We show that our lower bounds hold even for the seemingly easier setting in which all agents are controlled by a centralized algorithm.
arXiv Detail & Related papers (2023-03-22T03:28:12Z)
- ApproxED: Approximate exploitability descent via learned best responses [61.17702187957206]
We study the problem of finding an approximate Nash equilibrium of games with continuous action sets.
We propose two new methods that minimize an approximation of exploitability with respect to the strategy profile (a sketch of the quantity follows this entry).
arXiv Detail & Related papers (2023-01-20T23:55:30Z)
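Exploitability is the total gain available to players from unilateral deviations to best responses, and it vanishes at a Nash equilibrium; the entry above descends an approximation of it. A minimal sketch of that quantity, where `utility` and `approx_best_response` are hypothetical helpers standing in for the paper's learned best-response models:

```python
# Approximate exploitability of a strategy profile (illustrative sketch).
# utility(i, profile): player i's expected payoff under `profile`.
# approx_best_response(i, profile): learned stand-in for i's exact best
#   response against the other strategies in `profile`.

def approx_exploitability(profile, utility, approx_best_response):
    total = 0.0
    for i in range(len(profile)):
        deviation = list(profile)
        deviation[i] = approx_best_response(i, profile)  # unilateral deviation
        total += utility(i, deviation) - utility(i, profile)
    return total  # near zero when `profile` approximates a Nash equilibrium
```

The descent idea is then to treat this value as a loss in the strategy parameters and update the profile by gradient steps.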
- Finding mixed-strategy equilibria of continuous-action games without gradients using randomized policy networks [83.28949556413717]
We study the problem of computing an approximate Nash equilibrium of continuous-action games without access to gradients.
We model players' strategies using artificial neural networks.
This paper is the first to solve general continuous-action games with unrestricted mixed strategies and without any gradient information.
arXiv Detail & Related papers (2022-11-29T05:16:41Z)
- Robust No-Regret Learning in Min-Max Stackelberg Games [1.6500749121196987]
We investigate the behavior of no-regret learning in min-max games with dependent strategy sets (a formulation sketch follows this entry).
We show that no-regret dynamics converge to a Stackelberg equilibrium.
We prove that online mirror descent (OMD) dynamics are robust for a large class of online min-max Stackelberg games.
arXiv Detail & Related papers (2022-03-26T18:12:40Z)
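For orientation, here is a generic sketch of a min-max Stackelberg game with dependent strategy sets, in notation assumed for this summary rather than taken from the papers: the inner player's feasible set is coupled to the outer player's choice.

```latex
% Min-max Stackelberg game with dependent strategy sets (generic sketch):
% the outer player commits to x first; the constraint g couples the inner
% player's feasible set to that choice.
\[
  \min_{x \in X} \; \max_{y \in Y \,:\, g(x, y) \ge 0} \; f(x, y)
\]
% Convex-concave case: f(x, y) is convex in x and concave in y.
```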
- No-Regret Learning in Dynamic Stackelberg Games [31.001205916012307]
In a Stackelberg game, a leader commits to a randomized strategy, and a follower chooses their best strategy in response (the standard commitment condition is sketched after this entry).
We consider an extension of the standard Stackelberg game, called a discrete-time dynamic Stackelberg game, whose underlying state space affects the leader's rewards and available strategies and evolves in a Markovian manner depending on both the leader's and the follower's selected strategies.
arXiv Detail & Related papers (2022-02-10T01:07:57Z)
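As a reference point for the commitment structure in the entry above, the standard (strong) Stackelberg equilibrium condition can be written in generic notation (assumed here, not drawn from the paper): the leader commits to $x$, the follower best-responds, and in the strong variant the follower breaks ties in the leader's favor.

```latex
% (Strong) Stackelberg equilibrium, generic notation (sketch):
% u_L, u_F are the leader's and follower's utilities; BR is the
% follower's best-response correspondence.
\[
  x^{*} \in \arg\max_{x \in X} \, u_{L}\bigl(x, \mathrm{BR}(x)\bigr),
  \qquad
  \mathrm{BR}(x) \in \arg\max_{y \in Y} \, u_{F}(x, y)
\]
```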
- Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopic Followers? [156.5760265539888]
We study multi-player general-sum Markov games with one of the players designated as the leader and the other players regarded as followers.
For such a game, our goal is to find a Stackelberg-Nash equilibrium (SNE), which is a policy pair $(\pi^*, \nu^*)$ (the defining condition is sketched after this entry).
We develop sample-efficient reinforcement learning (RL) algorithms for solving for an SNE in both online and offline settings.
arXiv Detail & Related papers (2021-12-27T05:41:14Z)
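The policy pair $(\pi^*, \nu^*)$ in the entry above can be unpacked as follows. This is a generic sketch of the Stackelberg-Nash condition, not the paper's exact statement: $\nu^*(\pi)$ denotes the followers' (Nash) best response to a committed leader policy $\pi$, which under myopic followers is a per-step rather than long-horizon response.

```latex
% Stackelberg-Nash equilibrium (generic sketch):
% V_L is the leader's value; nu*(pi) is the followers' best response to pi.
\[
  \pi^{*} \in \arg\max_{\pi} \, V_{L}\bigl(\pi, \nu^{*}(\pi)\bigr),
  \qquad
  \nu^{*} = \nu^{*}(\pi^{*})
\]
```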
- Convex-Concave Min-Max Stackelberg Games [0.0]
We introduce two first-order methods that solve a class of convex-concave min-max Stackelberg games.
We show that our methods converge in polynomial time.
We observe that the computation of competitive equilibria in Fisher markets also comprises a min-max Stackelberg game.
arXiv Detail & Related papers (2021-10-05T06:09:45Z)
- Provable Fictitious Play for General Mean-Field Games [111.44976345867005]
We propose a reinforcement learning algorithm for stationary mean-field games.
The goal is to learn a pair consisting of a mean-field state and a stationary policy that together constitute the Nash equilibrium (a fictitious-play sketch follows this entry).
arXiv Detail & Related papers (2020-10-08T18:46:48Z)
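Fictitious play alternates a best response against a time-averaged model of the opposition; in the mean-field setting, the averaged object is the population distribution. A hypothetical sketch under assumed helpers (`best_response`, `induced_mean_field`) and an array-like mean field; this is not the paper's algorithm:

```python
# Fictitious play for a stationary mean-field game (illustrative sketch).
# best_response(mu): policy best-responding to population distribution mu
#   (e.g., computed by an RL subroutine).
# induced_mean_field(policy): stationary distribution the policy induces.

def mean_field_fictitious_play(mu0, best_response, induced_mean_field, n_iters):
    mu_avg = mu0  # e.g., a NumPy array over states
    policy = None
    for t in range(1, n_iters + 1):
        policy = best_response(mu_avg)
        mu_t = induced_mean_field(policy)
        mu_avg = mu_avg + (mu_t - mu_avg) / t  # running average over iterates
    return policy, mu_avg
```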
- Optimally Deceiving a Learning Leader in Stackelberg Games [123.14187606686006]
Recent results in the ML community have revealed that learning algorithms used to compute the optimal strategy for the leader to commit to in a Stackelberg game are susceptible to manipulation by the follower.
This paper shows that it is always possible for the follower to compute (near-)optimal payoffs for various scenarios of the learning interaction between leader and follower.
arXiv Detail & Related papers (2020-06-11T16:18:21Z)