Learning in Structured Stackelberg Games
- URL: http://arxiv.org/abs/2504.09006v2
- Date: Mon, 04 Aug 2025 14:51:59 GMT
- Title: Learning in Structured Stackelberg Games
- Authors: Maria-Florina Balcan, Kiriaki Fragkia, Keegan Harris,
- Abstract summary: We study structured Stackelberg games in which both players observe contextual information about the state of the world at time of play. We assume a fixed relationship between the contextual information and the follower's type. We show that standard learning theoretic measures of complexity do not characterize the difficulty of the leader's learning task.
- Score: 20.392732735387238
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study structured Stackelberg games, in which both players (the leader and the follower) observe contextual information about the state of the world at time of play. The leader plays against one of a finite number of followers, but the follower's type is not known until after the game has ended. Importantly, we assume a fixed relationship between the contextual information and the follower's type, thereby allowing the leader to leverage this additional structure when deciding her strategy. Under this setting, we find that standard learning theoretic measures of complexity do not characterize the difficulty of the leader's learning task. Instead, we introduce a new notion of dimension, the Stackelberg-Littlestone dimension, which we show characterizes the instance-optimal regret of the leader in the online setting. Based on this, we also provide a provably optimal learning algorithm. We extend our results to the distributional setting, where we use two new notions of dimension, the $\gamma$-Stackelberg-Natarajan dimension and $\gamma$-Stackelberg-Graph dimension. We prove that these control the sample complexity lower and upper bounds respectively, and we design a simple, improper algorithm that achieves the upper bound.
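Illustrative sketch (not from the paper): the interaction protocol the abstract describes can be simulated in a few lines, with a context observed by both players, a leader who commits to a mixed strategy, and a follower whose type is a fixed function of the context that the leader does not know at play time. All names, payoff tensors, and the context-to-type mapping below are hypothetical placeholders, not the paper's construction or algorithm.

```python
# Minimal sketch of one round of a structured Stackelberg game.
# Assumptions: payoffs, dimensions, and the context-to-type map are invented
# for illustration; the leader policy is a naive placeholder, not a learner.
import numpy as np

rng = np.random.default_rng(0)
n_leader_actions, n_follower_actions, n_types, context_dim = 3, 3, 2, 4

# One utility matrix per follower type (hypothetical).
leader_utility = rng.random((n_types, n_leader_actions, n_follower_actions))
follower_utility = rng.random((n_types, n_leader_actions, n_follower_actions))

def true_type(context):
    """Fixed mapping from context to follower type; unknown to the leader."""
    return int(context.sum() > 0)

def follower_best_response(mixed_strategy, f_type):
    """Follower maximizes expected utility against the leader's commitment."""
    expected = mixed_strategy @ follower_utility[f_type]  # one value per follower action
    return int(np.argmax(expected))

def play_round(leader_policy):
    context = rng.normal(size=context_dim)   # observed by both players
    x = leader_policy(context)                # leader commits to a mixed strategy first
    f_type = true_type(context)               # type is revealed only after the game ends
    b = follower_best_response(x, f_type)
    reward = x @ leader_utility[f_type][:, b]
    return context, b, f_type, reward

# Context-independent uniform policy as a stand-in for the leader's learner.
uniform_policy = lambda context: np.ones(n_leader_actions) / n_leader_actions
print(play_round(uniform_policy))
```

The learning question studied in the paper is how the leader should choose `leader_policy` from repeated rounds of this protocol; the sketch only fixes the environment side of the interaction.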
Related papers
- Nearly-Optimal Bandit Learning in Stackelberg Games with Side Information [57.287431079644705]
We study the problem of online learning in Stackelberg games with side information between a leader and a sequence of followers. We provide learning algorithms for the leader which achieve $O(T^{1/2})$ regret under bandit feedback.
arXiv Detail & Related papers (2025-01-31T22:40:57Z) - Neural Operators Can Play Dynamic Stackelberg Games [9.058593115274336]
Dynamic Stackelberg games are a broad class of two-player games in which the leader acts first, and the follower chooses a response strategy to the leader's strategy.
This paper addresses the issue by showing that the follower's best-response operator can be approximately implemented by an attention-based neural operator.
We show that the value of the Stackelberg game where the follower uses the approximate best-response operator approximates the value of the original Stackelberg game.
arXiv Detail & Related papers (2024-11-14T18:12:06Z) - Decentralized Online Learning in General-Sum Stackelberg Games [2.8659922790025463]
We study an online learning problem in general-sum Stackelberg games, where players act in a decentralized and strategic manner.
We show that for the follower, myopically best responding to the leader's action is the best strategy for the limited information setting.
We design a new manipulation strategy for the follower in the latter setting, and show that it has an intrinsic advantage against the best response strategy.
arXiv Detail & Related papers (2024-05-06T04:35:01Z) - Online Learning in Stackelberg Games with an Omniscient Follower [83.42564921330896]
We study the problem of online learning in a two-player decentralized cooperative Stackelberg game.
In each round, the leader first takes an action, followed by the follower who takes their action after observing the leader's move.
We show that depending on the reward structure, the existence of the omniscient follower may change the sample complexity drastically.
arXiv Detail & Related papers (2023-01-27T03:35:10Z) - Adversarially Robust Learning: A Generic Minimax Optimal Learner and
Characterization [39.51923275855131]
We present a minimax optimal learner for the problem of learning predictors robust to adversarial examples at test-time.
In particular, we show, in a strong negative sense, the suboptimality of the robust learner proposed by Montasser, Hanneke, and Srebro.
arXiv Detail & Related papers (2022-09-15T15:32:42Z) - Learning in Stackelberg Games with Non-myopic Agents [60.927889817803745]
We study Stackelberg games where a principal repeatedly interacts with a non-myopic, long-lived agent, without knowing the agent's payoff function. We provide a general framework that reduces learning in the presence of non-myopic agents to robust bandit optimization in the presence of myopic agents.
arXiv Detail & Related papers (2022-08-19T15:49:30Z) - A Characterization of Semi-Supervised Adversarially-Robust PAC Learnability [57.502573663108535]
We study the problem of learning an adversarially robust predictor to test time attacks in the semi-supervised PAC model.
We show that there is a significant benefit in semi-supervised robust learning even in the worst-case distribution-free model.
arXiv Detail & Related papers (2022-02-11T03:01:45Z) - No-Regret Learning in Dynamic Stackelberg Games [31.001205916012307]
In a Stackelberg game, a leader commits to a randomized strategy, and a follower chooses their best strategy in response.
We consider an extension of a standard Stackelberg game, called a discrete-time dynamic Stackelberg game, that has an underlying state space which affects the leader's rewards and available strategies and evolves in a Markovian manner depending on both the leader's and the follower's selected strategies.
arXiv Detail & Related papers (2022-02-10T01:07:57Z) - Provable Reinforcement Learning with a Short-Term Memory [68.00677878812908]
We study a new subclass of POMDPs, whose latent states can be decoded by the most recent history of a short length $m$.
In particular, in the rich-observation setting, we develop new algorithms using a novel "moment matching" approach with a sample complexity that scales exponentially with the short length $m$ rather than the problem horizon.
Our results show that a short-term memory suffices for reinforcement learning in these environments.
arXiv Detail & Related papers (2022-02-08T16:39:57Z) - Can Reinforcement Learning Find Stackelberg-Nash Equilibria in
General-Sum Markov Games with Myopic Followers? [156.5760265539888]
We study multi-player general-sum Markov games with one of the players designated as the leader and the other players regarded as followers.
For such a game, our goal is to find a Stackelberg-Nash equilibrium (SNE), which is a policy pair $(\pi^*, \nu^*)$.
We develop sample-efficient reinforcement learning (RL) algorithms for solving for an SNE in both online and offline settings.
arXiv Detail & Related papers (2021-12-27T05:41:14Z) - Adversarial Training as Stackelberg Game: An Unrolled Optimization
Approach [91.74682538906691]
Adversarial training has been shown to improve the generalization performance of deep learning models.
We propose Stackelberg Adversarial Training (SALT), which formulates adversarial training as a Stackelberg game.
arXiv Detail & Related papers (2021-04-11T00:44:57Z) - When is Memorization of Irrelevant Training Data Necessary for
High-Accuracy Learning? [53.523017945443115]
We describe natural prediction problems in which every sufficiently accurate training algorithm must encode, in the prediction model, essentially all the information about a large subset of its training examples.
Our results do not depend on the training algorithm or the class of models used for learning.
arXiv Detail & Related papers (2020-12-11T15:25:14Z) - Sample-efficient proper PAC learning with approximate differential
privacy [51.09425023771246]
We prove that the sample complexity of properly learning a class of Littlestone dimension $d$ with approximate differential privacy is $\tilde{O}(d^6)$, ignoring privacy and accuracy parameters.
This result answers a question of Bun et al. (FOCS 2020) by improving upon their upper bound of $2^{O(d)}$ on the sample complexity.
arXiv Detail & Related papers (2020-12-07T18:17:46Z) - Optimally Deceiving a Learning Leader in Stackelberg Games [123.14187606686006]
Recent results in the ML community have revealed that learning algorithms used to compute the optimal strategy for the leader to commit to in a Stackelberg game, are susceptible to manipulation by the follower.
This paper shows that it is always possible for the follower to compute (near-)optimal payoffs in various scenarios of learning interaction between the leader and the follower.
arXiv Detail & Related papers (2020-06-11T16:18:21Z) - Is Long Horizon Reinforcement Learning More Difficult Than Short Horizon
Reinforcement Learning? [108.94173231481355]
Learning to plan for long horizons is a central challenge in episodic reinforcement learning problems.
We show that long horizon RL is no more difficult than short horizon RL, at least in a minimax sense.
arXiv Detail & Related papers (2020-05-01T17:56:38Z)