Calibrated Stackelberg Games: Learning Optimal Commitments Against
Calibrated Agents
- URL: http://arxiv.org/abs/2306.02704v1
- Date: Mon, 5 Jun 2023 08:55:50 GMT
- Title: Calibrated Stackelberg Games: Learning Optimal Commitments Against
Calibrated Agents
- Authors: Nika Haghtalab, Chara Podimata, Kunhe Yang
- Abstract summary: Calibrated Stackelberg Games (CSGs) are a new type of Stackelberg Game (SG).
In CSGs, a principal repeatedly interacts with an agent who (contrary to standard SGs) does not have direct access to the principal's action but instead best-responds to calibrated forecasts about it.
We show that in CSGs, the principal can achieve utility that converges to the optimum Stackelberg value of the game both in finite and continuous settings.
- Score: 15.145023509806977
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce a generalization of the standard Stackelberg
Games (SGs) framework: Calibrated Stackelberg Games (CSGs). In CSGs, a
principal repeatedly interacts with an agent who (contrary to standard SGs)
does not have direct access to the principal's action but instead best-responds
to calibrated forecasts about it. CSG is a powerful modeling tool that goes
beyond assuming that agents use ad hoc and highly specified algorithms for
interacting in strategic settings and thus more robustly addresses real-life
applications that SGs were originally intended to capture. Along with CSGs, we
also introduce a stronger notion of calibration, termed adaptive calibration,
that provides fine-grained any-time calibration guarantees against adversarial
sequences. We give a general approach for obtaining adaptive calibration
algorithms and specialize them for finite CSGs. In our main technical result,
we show that in CSGs, the principal can achieve utility that converges to the
optimum Stackelberg value of the game both in finite and continuous settings,
and that no higher utility is achievable. Two prominent and immediate
applications of our results are the settings of learning in Stackelberg
Security Games and strategic classification, both against calibrated agents.
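As a toy illustration of the repeated interaction the abstract describes (not the paper's algorithm), the sketch below has an agent best-respond to an empirical-frequency forecast of the principal's play, a simple stand-in for a calibrated forecaster that is only calibrated against i.i.d. principal play, not the paper's adaptive calibration. All payoff matrices and function names are hypothetical.

```python
# Toy repeated principal-agent game: the principal commits to a mixed
# strategy over two actions; the agent best-responds to a forecast of
# the principal's action built from empirical frequencies.
import random

random.seed(0)

# Hypothetical payoffs: U_p[a_p][a_f] for the principal, U_f for the agent.
U_p = [[2.0, 4.0], [1.0, 3.0]]
U_f = [[1.0, 0.0], [0.0, 2.0]]

def follower_best_response(forecast):
    """Agent's best response given a forecast (prob. principal plays 0)."""
    eu = [forecast * U_f[0][a] + (1 - forecast) * U_f[1][a] for a in (0, 1)]
    return 0 if eu[0] >= eu[1] else 1

def run(commitment_prob, rounds=10_000):
    """Average principal utility when committing to play 0 w.p. commitment_prob."""
    counts = [0, 0]
    total = 0.0
    for t in range(rounds):
        # Forecast from empirical frequencies so far (uniform prior at t=0).
        forecast = (counts[0] + 1) / (t + 2)
        a_f = follower_best_response(forecast)
        a_p = 0 if random.random() < commitment_prob else 1
        counts[a_p] += 1
        total += U_p[a_p][a_f]
    return total / rounds

print(run(0.9), run(0.3))
```

In this toy game the agent switches its best response once the forecast crosses 2/3, so different commitments steer the agent to different responses, which is exactly the lever a Stackelberg principal optimizes over.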
Related papers
- Convex Markov Games and Beyond: New Proof of Existence, Characterization and Learning Algorithms for Nash Equilibria [20.875347023588652]
General Utility Markov Games (GUMGs) capture new applications requiring coupling between agents' occupancy measures. We prove that Nash equilibria coincide with the fixed points of projected pseudo-gradient dynamics (i.e., first-order stationary points), enabled by a novel agent-wise gradient domination property. Building on this characterization, we establish a policy gradient theorem for GUMGs and design a model-free policy gradient algorithm.
arXiv Detail & Related papers (2026-02-12T17:11:20Z) - Robust Verification of Concurrent Stochastic Games [3.2964666213105587]
We introduce *robust CSGs* and their subclass *interval CSGs* (ICSGs). We propose a novel framework for *robust* verification of these models under worst-case assumptions about transition uncertainty. We build an implementation in the PRISM-games model checker and demonstrate the feasibility of robust verification of ICSGs across a selection of large benchmarks.
arXiv Detail & Related papers (2026-01-17T10:42:44Z) - C$^2$GSPG: Confidence-calibrated Group Sequence Policy Gradient towards Self-aware Reasoning [54.705168477975384]
We build on the Group Sequence Policy Gradient (GSPG) framework for learning reasoning models. C$^2$GSPG simultaneously enhances reasoning performance while suppressing overconfidence.
arXiv Detail & Related papers (2025-09-27T05:24:51Z) - Pose-Robust Calibration Strategy for Point-of-Gaze Estimation on Mobile Phones [52.87468614536999]
We analyze how the diversity of calibration points and head poses influences estimation accuracy. Experiments show that introducing a wider range of head poses during calibration improves the estimator's ability to handle pose variation. We propose a dynamic calibration strategy in which users fixate on calibration points while moving their phones.
arXiv Detail & Related papers (2025-08-14T01:28:30Z) - Adaptive Set-Mass Calibration with Conformal Prediction [60.47079469141295]
We develop a new calibration procedure that starts with conformal prediction to obtain a set of labels that gives the desired coverage. We then instantiate two simple post-hoc calibrators: a mass normalization and a temperature-scaling-based rule, tuned to the conformal constraint.
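For context, the conformal step this summary refers to can be sketched as follows; this is a generic split-conformal construction under our own assumptions, not the paper's specific calibrators, and the synthetic "model" probabilities are hypothetical stand-ins.

```python
# Split conformal prediction: calibrate a score threshold on held-out data
# so that prediction sets contain the true label with ~(1 - alpha) coverage.
import numpy as np

rng = np.random.default_rng(1)

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Threshold on the score 1 - p(true label) giving ~(1-alpha) coverage."""
    scores = 1.0 - cal_probs[np.arange(len(cal_labels)), cal_labels]
    n = len(scores)
    level = min(int(np.ceil((n + 1) * (1 - alpha))) / n, 1.0)  # finite-sample adjustment
    return float(np.quantile(scores, level, method="higher"))

def prediction_set(probs, qhat):
    # Keep every label whose score clears the calibrated threshold.
    return np.where(1.0 - probs <= qhat)[0]

def fake_probs(y, n):
    """Synthetic 3-class model: noisy probabilities peaked on the true label."""
    p = rng.dirichlet([1, 1, 1], size=n) * 0.3
    p[np.arange(n), y] += 0.7
    return p / p.sum(axis=1, keepdims=True)

y_cal = rng.integers(0, 3, size=2000)
qhat = conformal_threshold(fake_probs(y_cal, 2000), y_cal, alpha=0.1)

y_test = rng.integers(0, 3, size=2000)
p_test = fake_probs(y_test, 2000)
covered = np.mean([y in prediction_set(p, qhat) for y, p in zip(y_test, p_test)])
print(covered)  # empirical coverage, should land near the 0.90 target
```

A post-hoc calibrator in the paper's spirit would then be tuned subject to the constraint implied by these conformal sets.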
arXiv Detail & Related papers (2025-05-21T12:18:15Z) - On the Convergence of DP-SGD with Adaptive Clipping [56.24689348875711]
Stochastic gradient descent (SGD) with gradient clipping is a powerful technique for enabling differentially private optimization.
This paper provides the first comprehensive convergence analysis of SGD with quantile clipping (QC-SGD).
We show that QC-SGD suffers from a bias problem similar to constant-threshold clipped SGD, but that this bias can be mitigated through a carefully designed quantile and step size schedule.
arXiv Detail & Related papers (2024-12-27T20:29:47Z) - Meta SAC-Lag: Towards Deployable Safe Reinforcement Learning via MetaGradient-based Hyperparameter Tuning [2.7898966850590625]
Safe Reinforcement Learning (Safe RL) is one of the prevalently studied subcategories of trial-and-error-based methods.
We propose a unified Lagrangian-based model-free architecture called Meta Soft Actor-Critic Lagrangian (Meta SAC-Lag).
Our results show that the agent can reliably adjust the safety performance due to the relatively fast convergence rate of the safety threshold.
arXiv Detail & Related papers (2024-08-15T06:18:50Z) - Aligning GPTRec with Beyond-Accuracy Goals with Reinforcement Learning [67.71952251641545]
GPTRec is an alternative to the Top-K model for item-by-item recommendations.
Our experiments on two datasets show that GPTRec's Next-K generation approach offers a better tradeoff between accuracy and secondary metrics than classic greedy re-ranking techniques.
arXiv Detail & Related papers (2024-03-07T19:47:48Z) - Curvature-Informed SGD via General Purpose Lie-Group Preconditioners [6.760212042305871]
We present a novel approach to accelerating stochastic gradient descent (SGD) by utilizing curvature information.
Our approach involves two preconditioners: a matrix-free preconditioner and a low-rank approximation preconditioner.
We demonstrate that Preconditioned SGD (PSGD) outperforms state-of-the-art methods on Vision, NLP, and RL tasks.
arXiv Detail & Related papers (2024-02-07T03:18:00Z) - Sharpness-Aware Gradient Matching for Domain Generalization [84.14789746460197]
The goal of domain generalization (DG) is to enhance the generalization capability of the model learned from a source domain to other unseen domains.
The recently developed Sharpness-Aware Minimization (SAM) method aims to achieve this goal by minimizing the sharpness measure of the loss landscape.
We present two conditions to ensure that the model converges to a flat minimum with a small loss, and present an algorithm named Sharpness-Aware Gradient Matching (SAGM).
Our proposed SAGM method consistently outperforms the state-of-the-art methods on five DG benchmarks.
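For background, the SAM update that SAGM builds on can be sketched as below; this is a minimal, hypothetical illustration (one ascent step to the sharpness-probing point, then descent using the gradient taken there), with a toy objective and step sizes of our own choosing, not the SAGM method itself.

```python
# Minimal SAM-style update: probe the loss landscape by perturbing the
# weights in the gradient direction, then descend with the gradient
# evaluated at the perturbed point.
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # ascent to the probe point
    g_adv = grad_fn(w + eps)                     # gradient at the probe point
    return w - lr * g_adv

# Toy quadratic f(w) = 0.5 * ||w||^2, whose gradient is simply w.
w = np.array([3.0, -4.0])
for _ in range(100):
    w = sam_step(w, lambda v: v)
print(np.linalg.norm(w))  # the iterate contracts toward the flat minimum
```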
arXiv Detail & Related papers (2023-03-18T07:25:12Z) - Pretraining Without Attention [114.99187017618408]
This work explores pretraining without attention by using recent advances in sequence routing based on state-space models (SSMs).
BiGS is able to match BERT pretraining accuracy on GLUE and can be extended to long-form pretraining of 4096 tokens without approximation.
arXiv Detail & Related papers (2022-12-20T18:50:08Z) - Learning in Stackelberg Games with Non-myopic Agents [60.927889817803745]
We study Stackelberg games where a principal repeatedly interacts with a non-myopic long-lived agent, without knowing the agent's payoff function.
We provide a general framework that reduces learning in presence of non-myopic agents to robust bandit optimization in the presence of myopic agents.
arXiv Detail & Related papers (2022-08-19T15:49:30Z) - Evolutionary Approach to Security Games with Signaling [40.79980131949599]
Green Security Games have become a popular way to model scenarios involving the protection of natural resources, such as wildlife.
Sensors equipped with cameras have also begun to play a role in these scenarios by providing real-time information.
We propose a novel approach to Security Games with Signaling (SGS), which employs an Evolutionary Computation paradigm: EASGS.
EASGS effectively searches the huge SGS solution space via suitable solution encoding in a chromosome and a specially-designed set of operators.
arXiv Detail & Related papers (2022-04-29T15:56:47Z) - Stabilizing Spiking Neuron Training [3.335932527835653]
Spiking neuromorphic computing uses binary activity to improve the energy efficiency of Artificial Intelligence.
It remains unclear how to determine the best surrogate gradient (SG) for a given task and network.
We show how it can be used to reduce the need for an extensive grid search over the dampening, sharpness, and tail-fatness of the SG.
arXiv Detail & Related papers (2022-02-01T09:10:57Z) - Bayesian decision-making under misspecified priors with applications to
meta-learning [64.38020203019013]
Thompson sampling and other sequential decision-making algorithms are popular approaches to tackle explore/exploit trade-offs in contextual bandits.
We show that performance degrades gracefully with misspecified priors.
arXiv Detail & Related papers (2021-07-03T23:17:26Z) - Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth
Games: Convergence Analysis under Expected Co-coercivity [49.66890309455787]
We introduce the expected co-coercivity condition, explain its benefits, and provide the first last-iterate convergence guarantees of SGDA and SCO.
We prove linear convergence of both methods to a neighborhood of the solution when they use constant step-size.
Our convergence guarantees hold under the arbitrary sampling paradigm, and we give insights into the complexity of minibatching.
arXiv Detail & Related papers (2021-06-30T18:32:46Z) - LASG: Lazily Aggregated Stochastic Gradients for Communication-Efficient
Distributed Learning [47.93365664380274]
This paper targets solving distributed machine learning problems such as federated learning in a communication-efficient fashion.
A class of new stochastic gradient descent (SGD) approaches has been developed, which can be viewed as a generalization of the recently developed lazily aggregated gradient (LAG) method.
The key components of LASG are a set of new rules tailored for gradients that can be implemented either to save download, upload, or both.
arXiv Detail & Related papers (2020-02-26T08:58:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.