Decentralized Online Learning in General-Sum Stackelberg Games
- URL: http://arxiv.org/abs/2405.03158v1
- Date: Mon, 6 May 2024 04:35:01 GMT
- Title: Decentralized Online Learning in General-Sum Stackelberg Games
- Authors: Yaolong Yu, Haipeng Chen
- Abstract summary: We study an online learning problem in general-sum Stackelberg games, where players act in a decentralized and strategic manner.
We show that for the follower, myopically best responding to the leader's action is the best strategy for the limited information setting.
We design a new manipulation strategy for the follower in the latter setting, and show that it has an intrinsic advantage against the best response strategy.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study an online learning problem in general-sum Stackelberg games, where players act in a decentralized and strategic manner. We study two settings depending on the type of information for the follower: (1) the limited information setting where the follower only observes its own reward, and (2) the side information setting where the follower has extra side information about the leader's reward. We show that for the follower, myopically best responding to the leader's action is the best strategy for the limited information setting, but not necessarily so for the side information setting -- the follower can manipulate the leader's reward signals with strategic actions, and hence induce the leader's strategy to converge to an equilibrium that is better off for itself. Based on these insights, we study decentralized online learning for both players in the two settings. Our main contribution is to derive last-iterate convergence and sample complexity results in both settings. Notably, we design a new manipulation strategy for the follower in the latter setting, and show that it has an intrinsic advantage against the best response strategy. Our theories are also supported by empirical results.
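To make the two follower behaviors in the abstract concrete, below is a minimal, self-contained sketch. It is not the paper's algorithm: the 2x2 payoff matrices, the epsilon-greedy leader, and the fixed "always play b1" manipulation rule are illustrative assumptions. The sketch only shows the qualitative effect the abstract describes: a myopically best-responding follower lets the leader's learning settle on the leader-preferred outcome, while a manipulating follower gives up short-term reward to distort the leader's reward signals and steer it toward a profile the follower prefers.

```python
# Toy illustration (not the paper's algorithm): a repeated 2x2 general-sum
# Stackelberg game. The leader learns from observed rewards; the follower
# either (a) myopically best responds or (b) manipulates by sacrificing
# short-term reward to steer the leader. All payoffs are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Rows: leader actions a0, a1. Columns: follower actions b0, b1.
U_LEADER   = np.array([[0.8, 0.1],
                       [0.2, 0.9]])
U_FOLLOWER = np.array([[0.4, 0.1],
                       [0.6, 0.5]])

def best_response(a):
    # Follower's myopic best response to the observed leader action a.
    return int(np.argmax(U_FOLLOWER[a]))

def manipulation(a):
    # Hypothetical manipulation rule: always answer with b1, which makes the
    # leader's action a1 look best (0.9 vs 0.1) even though b1 is not the
    # follower's myopic best response; the induced profile (a1, b1) pays the
    # follower 0.5 instead of the 0.4 it earns under best response.
    return 1

def run(follower_policy, rounds=5000, eps=0.1):
    counts = np.zeros(2)
    sums = np.zeros(2)
    follower_total = 0.0
    for _ in range(rounds):
        # Epsilon-greedy leader over its two actions, using empirical means
        # of the rewards it has observed so far.
        if rng.random() < eps or counts.min() == 0:
            a = int(rng.integers(2))
        else:
            a = int(np.argmax(sums / counts))
        b = follower_policy(a)
        counts[a] += 1
        sums[a] += U_LEADER[a, b]
        follower_total += U_FOLLOWER[a, b]
    return int(np.argmax(sums / counts)), follower_total / rounds

for name, policy in [("best response", best_response),
                     ("manipulation", manipulation)]:
    leader_action, follower_avg = run(policy)
    print(f"{name:14s}: leader converges to a{leader_action}, "
          f"follower average reward {follower_avg:.2f}")
```

Under these toy assumptions, the best-responding follower leaves the leader converging to a0 with follower reward near 0.4, while the manipulating follower induces convergence to a1 and earns roughly 0.5, loosely mirroring the intrinsic advantage the paper proves for its manipulation strategy in the side information setting.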
Related papers
- Learnability Gaps of Strategic Classification
We address a fundamental question: the learnability gaps between strategic classification and standard learning.
We provide nearly tight sample complexity and regret bounds, offering significant improvements over prior results.
Notably, our algorithm in this setting is of independent interest and can be applied to other problems such as multi-label learning.
arXiv Detail & Related papers (2024-02-29T16:09:19Z) - Regret Minimization in Stackelberg Games with Side Information
We formalize settings for Stackelberg games in which both players observe an external context before playing.
In sharp contrast to the non-contextual version, we show that it is impossible for the leader to achieve no-regret in the full adversarial setting.
Motivated by this result, we show that no-regret learning is possible in two natural relaxations.
arXiv Detail & Related papers (2024-02-13T16:24:57Z) - When Should a Leader Act Suboptimally? The Role of Inferability in Repeated Stackelberg Games
We model the inferability problem using Stackelberg games with observations where a leader and a follower repeatedly interact.
For a variety of game settings, we show that the inferability gap is upper-bounded by a function of the number of interactions and the stochasticity level of the leader's strategy.
We identify a set of games where the leader's near-optimal strategy may suffer from a large inferability gap.
arXiv Detail & Related papers (2023-09-30T19:08:05Z) - Learning to Manipulate a Commitment Optimizer
Recent studies show that in a Stackelberg game the follower can manipulate the leader by deviating from their true best-response behavior.
The risk indicated by these findings appears to be alleviated to some extent by the strict information advantage that such manipulations rely on.
We consider the scenario where the follower is not given any information about the leader's payoffs to begin with but has to learn to manipulate by interacting with the leader.
arXiv Detail & Related papers (2023-02-23T07:39:37Z) - Online Learning in Stackelberg Games with an Omniscient Follower
We study the problem of online learning in a two-player decentralized cooperative Stackelberg game.
In each round, the leader first takes an action, followed by the follower who takes their action after observing the leader's move.
We show that depending on the reward structure, the existence of the omniscient follower may change the sample complexity drastically.
arXiv Detail & Related papers (2023-01-27T03:35:10Z) - Who Leads and Who Follows in Strategic Classification?
We argue that the order of play in strategic classification is fundamentally determined by the relative frequencies at which the decision-maker and the agents adapt to each other's actions.
We show that a decision-maker with the freedom to choose their update frequency can induce learning dynamics that converge to Stackelberg equilibria with either order of play.
arXiv Detail & Related papers (2021-06-23T16:48:46Z) - Adversarial Training as Stackelberg Game: An Unrolled Optimization Approach
Adversarial training has been shown to improve the generalization performance of deep learning models.
We propose Stackelberg Adversarial Training (SALT), which formulates adversarial training as a Stackelberg game.
arXiv Detail & Related papers (2021-04-11T00:44:57Z) - Optimally Deceiving a Learning Leader in Stackelberg Games
Recent results in the ML community have revealed that learning algorithms used to compute the optimal strategy for the leader to commit to in a Stackelberg game are susceptible to manipulation by the follower.
This paper shows that it is always possible for the follower to compute (near-)optimal payoffs under various scenarios of learning interaction between the leader and the follower.
arXiv Detail & Related papers (2020-06-11T16:18:21Z)