Optimal Private Payoff Manipulation against Commitment in Extensive-form Games
- URL: http://arxiv.org/abs/2206.13119v2
- Date: Tue, 13 Jun 2023 11:02:24 GMT
- Title: Optimal Private Payoff Manipulation against Commitment in Extensive-form Games
- Authors: Yurong Chen, Xiaotie Deng, Yuhao Li
- Abstract summary: We study the follower's optimal manipulation via such strategic behaviors in extensive-form games.
We show that it is tractable for the follower to find the optimal way of misreporting his private payoff.
- Score: 7.739432465414604
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To take advantage of strategy commitment, a useful tactic in game playing, a leader must learn enough information about the follower's payoff function. However, this gives the follower a chance to provide fake information and thereby influence the final game outcome. By misreporting a carefully contrived payoff function to the learning leader, the follower may induce an outcome that benefits him more than the one obtained when he behaves truthfully.
We study the follower's optimal manipulation via such strategic behaviors in extensive-form games. Different attitudes of the follower are taken into account. An optimistic follower maximizes his true utility among all game outcomes that can be induced by some misreported payoff function. A pessimistic follower only considers misreported payoff functions that induce a unique game outcome. For all the
settings considered in this paper, we characterize all the possible game
outcomes that can be induced successfully. We show that it is polynomial-time
tractable for the follower to find the optimal way of misreporting his private
payoff information. Our work thus completely resolves the follower's optimal manipulation problem on an extensive-form game tree.
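As a toy illustration of the manipulation being optimized here (a minimal sketch in a 2x2 normal-form game with pure-strategy commitment, not the paper's extensive-form algorithm; all payoff numbers are invented), an optimistic follower can enumerate the reply functions he could report, check which commitment each one induces, and keep the outcome that is best under his true payoffs:

```python
from itertools import product

# Toy 2x2 game: leader actions {U, D}, follower actions {L, R}.
# payoffs[(a_L, a_F)] = (leader utility, follower's TRUE utility).
# All numbers below are invented purely for illustration.
payoffs = {
    ("U", "L"): (2, 5), ("U", "R"): (4, 0),
    ("D", "L"): (1, 0), ("D", "R"): (3, 1),
}
leader_actions, follower_actions = ("U", "D"), ("L", "R")

def leader_commitment(reported_br):
    """Pure action the leader commits to if it believes the follower replies
    according to reported_br (dict: leader action -> follower reply).
    Ties are broken arbitrarily by max()."""
    return max(leader_actions, key=lambda a: payoffs[(a, reported_br[a])][0])

# Truthful behavior: the follower's true best replies.
truthful_br = {a: max(follower_actions, key=lambda b: payoffs[(a, b)][1])
               for a in leader_actions}
a = leader_commitment(truthful_br)
print("truthful outcome:", (a, truthful_br[a]),
      "-> follower's true utility:", payoffs[(a, truthful_br[a])][1])

# Optimistic follower: every reply function can be made a best response by
# SOME reported payoff function, so enumerate them all and keep the induced
# outcome that is best under the TRUE payoffs.
best = None
for replies in product(follower_actions, repeat=len(leader_actions)):
    fake_br = dict(zip(leader_actions, replies))
    a = leader_commitment(fake_br)
    outcome = (a, fake_br[a])
    if best is None or payoffs[outcome][1] > payoffs[best][1]:
        best = outcome
print("best inducible outcome:", best,
      "-> follower's true utility:", payoffs[best][1])
```

In this instance truthful reporting leads the leader to commit to D and gives the follower true utility 1, while reporting payoffs under which L is the best reply to both commitments steers the leader to U and yields (U, L) with true utility 5. The paper characterizes exactly which outcomes are inducible in this way on extensive-form game trees and shows that the optimal misreport can be found in polynomial time.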
Related papers
- Decentralized Online Learning in General-Sum Stackelberg Games [2.8659922790025463]
We study an online learning problem in general-sum Stackelberg games, where players act in a decentralized and strategic manner.
We show that, for the follower, myopically best responding to the leader's action is the best strategy in the limited information setting.
We design a new manipulation strategy for the follower in the latter setting, and show that it has an intrinsic advantage over the best-response strategy.
arXiv Detail & Related papers (2024-05-06T04:35:01Z)
- Learning to Manipulate a Commitment Optimizer [14.806314018261416]
Recent studies show that in a Stackelberg game the follower can manipulate the leader by deviating from their true best-response behavior.
The risk indicated by these findings appears to be alleviated to some extent by the strict information advantage that such manipulations rely on.
We consider the scenario where the follower is not given any information about the leader's payoffs to begin with but has to learn to manipulate by interacting with the leader.
arXiv Detail & Related papers (2023-02-23T07:39:37Z)
- ApproxED: Approximate exploitability descent via learned best responses [61.17702187957206]
We study the problem of finding an approximate Nash equilibrium of games with continuous action sets.
We propose two new methods that minimize an approximation of exploitability with respect to the strategy profile.
arXiv Detail & Related papers (2023-01-20T23:55:30Z)
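For context on the quantity being minimized (standard notation, assumed rather than taken from the ApproxED paper), the exploitability of a strategy profile $\sigma$ sums each player's gain from deviating to a best response, and it vanishes exactly at a Nash equilibrium:

$$\operatorname{expl}(\sigma) \;=\; \sum_{i}\Big(\max_{\sigma_i'} u_i(\sigma_i', \sigma_{-i}) - u_i(\sigma_i, \sigma_{-i})\Big) \;\ge\; 0.$$

Presumably the learned best responses stand in for the inner maximization, which is what makes descent on this objective feasible over continuous action sets.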
- Commitment with Signaling under Double-sided Information Asymmetry [19.349072233281852]
This work considers a double-sided information asymmetry in a Bayesian Stackelberg game.
We show that by adequately designing a signaling device that reveals partial information about the leader's realized action to the follower, the leader can achieve a higher expected utility than without signaling.
arXiv Detail & Related papers (2022-12-22T01:30:54Z)
- Collusion Detection in Team-Based Multiplayer Games [57.153233321515984]
We propose a system that detects colluding behaviors in team-based multiplayer games.
The proposed method analyzes the players' social relationships paired with their in-game behavioral patterns.
We then automate the detection using Isolation Forest, an unsupervised learning technique specialized in highlighting outliers.
arXiv Detail & Related papers (2022-03-10T02:37:39Z)
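A rough sketch of that last automation step (the feature columns and the contamination level are invented for illustration; only scikit-learn's IsolationForest API is real):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical feature matrix: one row per pair of opposing players,
# columns might encode match-up frequency, social-link strength, and
# win-rate skew (all invented for illustration).
rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 3))

forest = IsolationForest(n_estimators=200, contamination=0.01, random_state=0)
forest.fit(features)

labels = forest.predict(features)             # -1 = outlier, +1 = inlier
scores = forest.decision_function(features)   # lower = more anomalous
suspect_pairs = np.where(labels == -1)[0]
print(f"{len(suspect_pairs)} pairs flagged for manual review")
```

Because Isolation Forest is unsupervised, no labeled collusion examples are needed; the contamination parameter roughly controls what fraction of pairs gets flagged.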
- Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopic Followers? [156.5760265539888]
We study multi-player general-sum Markov games with one of the players designated as the leader and the other players regarded as followers.
For such a game, our goal is to find a Stackelberg-Nash equilibrium (SNE), which is a policy pair $(\pi^*, \nu^*)$.
We develop sample-efficient reinforcement learning (RL) algorithms for solving for an SNE in both online and offline settings.
arXiv Detail & Related papers (2021-12-27T05:41:14Z)
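Informally, and simplifying to a single myopic follower (my paraphrase of the bilevel structure, not the paper's exact definition), an SNE policy pair requires the follower's policy to maximize the follower's value against the leader's policy, and the leader's policy to be optimal anticipating that induced response:

$$\nu^*(\pi) \in \arg\max_{\nu} V^{\mathrm{F}}(\pi, \nu), \qquad \pi^* \in \arg\max_{\pi} V^{\mathrm{L}}\big(\pi, \nu^*(\pi)\big).$$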
- Adversarial Training as Stackelberg Game: An Unrolled Optimization Approach [91.74682538906691]
Adversarial training has been shown to improve the generalization performance of deep learning models.
We propose Stackelberg Adversarial Training (SALT), which formulates adversarial training as a Stackelberg game.
arXiv Detail & Related papers (2021-04-11T00:44:57Z)
- Optimally Deceiving a Learning Leader in Stackelberg Games [123.14187606686006]
Recent results in the ML community have revealed that learning algorithms used to compute the optimal strategy for the leader to commit to in a Stackelberg game are susceptible to manipulation by the follower.
This paper shows that it is always possible for the follower to compute (near-)optimal payoffs to report, under various scenarios for the learning interaction between leader and follower.
arXiv Detail & Related papers (2020-06-11T16:18:21Z)
- Efficient exploration of zero-sum stochastic games [83.28949556413717]
We investigate the increasingly important and common game-solving setting where we do not have an explicit description of the game but only oracle access to it through gameplay.
During a limited-duration learning phase, the algorithm can control the actions of both players in order to try to learn the game and how to play it well.
Our motivation is to quickly learn strategies that have low exploitability in situations where evaluating the payoffs of a queried strategy profile is costly.
arXiv Detail & Related papers (2020-02-24T20:30:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.