VISER: A Tractable Solution Concept for Games with Information Asymmetry
- URL: http://arxiv.org/abs/2307.09652v1
- Date: Tue, 18 Jul 2023 21:51:47 GMT
- Title: VISER: A Tractable Solution Concept for Games with Information Asymmetry
- Authors: Jeremy McMahan, Young Wu, Yudong Chen, Xiaojin Zhu, Qiaomin Xie
- Abstract summary: We propose a novel solution concept called VISER (Victim Is Secure, Exploiter best-Responds)
VISER enables an external observer to predict the outcome of such games.
We show that each player's VISER strategy can be computed independently in time using linear programming.
- Score: 22.29425773648108
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many real-world games suffer from information asymmetry: one player is only
aware of their own payoffs while the other player has the full game
information. Examples include the critical domain of security games and
adversarial multi-agent reinforcement learning. Information asymmetry renders
traditional solution concepts such as Strong Stackelberg Equilibrium (SSE) and
Robust-Optimization Equilibrium (ROE) inoperative. We propose a novel solution
concept called VISER (Victim Is Secure, Exploiter best-Responds). VISER enables
an external observer to predict the outcome of such games. In particular, for
security applications, VISER allows the victim to better defend itself while
characterizing the most damaging attacks available to the attacker. We show
that each player's VISER strategy can be computed independently in polynomial
time using linear programming (LP). We also extend VISER to its Markov-perfect
counterpart for Markov games, which can be solved efficiently using a series of
LPs.
Related papers
- Chasing Moving Targets with Online Self-Play Reinforcement Learning for Safer Language Models [55.28518567702213]
Conventional language model (LM) safety alignment relies on a reactive, disjoint procedure: attackers exploit a static model, followed by defensive fine-tuning to patch exposed vulnerabilities.<n>This sequential approach creates a mismatch -- attackers overfit to obsolete defenses, while defenders perpetually lag behind emerging threats.<n>We propose Self-RedTeam, an online self-play reinforcement learning algorithm where an attacker and defender agent co-evolve through continuous interaction.
arXiv Detail & Related papers (2025-06-09T06:35:12Z) - DoomArena: A framework for Testing AI Agents Against Evolving Security Threats [84.94654617852322]
We present DoomArena, a security evaluation framework for AI agents.
It is a plug-in framework and integrates easily into realistic agentic frameworks.
It is modular and decouples the development of attacks from details of the environment in which the agent is deployed.
arXiv Detail & Related papers (2025-04-18T20:36:10Z) - Convex Markov Games: A Framework for Creativity, Imitation, Fairness, and Safety in Multiagent Learning [31.958202912400925]
We introduce the class of convex Markov games that allow general convex preferences over occupancy measures.
Despite infinite time horizon and strictly higher generality than Markov games, pure strategy Nash equilibria exist.
Our experiments reveal novel solutions to classic repeated normal-form games, find fair solutions in a repeated asymmetric coordination game, and prioritize safe long-term behavior in a robot warehouse environment.
arXiv Detail & Related papers (2024-10-22T00:55:04Z) - Imperfect-Recall Games: Equilibrium Concepts and Their Complexity [74.01381499760288]
We investigate optimal decision making under imperfect recall, that is, when an agent forgets information it once held before.
In the framework of extensive-form games with imperfect recall, we analyze the computational complexities of finding equilibria in multiplayer settings.
arXiv Detail & Related papers (2024-06-23T00:27:28Z) - White-box Multimodal Jailbreaks Against Large Vision-Language Models [61.97578116584653]
We propose a more comprehensive strategy that jointly attacks both text and image modalities to exploit a broader spectrum of vulnerability within Large Vision-Language Models.
Our attack method begins by optimizing an adversarial image prefix from random noise to generate diverse harmful responses in the absence of text input.
An adversarial text suffix is integrated and co-optimized with the adversarial image prefix to maximize the probability of eliciting affirmative responses to various harmful instructions.
arXiv Detail & Related papers (2024-05-28T07:13:30Z) - Optimistic Policy Gradient in Multi-Player Markov Games with a Single
Controller: Convergence Beyond the Minty Property [89.96815099996132]
We develop a new framework to characterize optimistic policy gradient methods in multi-player games with a single controller.
Our approach relies on a natural generalization of the classical Minty property that we introduce, which we anticipate to have further applications beyond Markov games.
arXiv Detail & Related papers (2023-12-19T11:34:10Z) - Optimal Attack and Defense for Reinforcement Learning [11.36770403327493]
In adversarial RL, an external attacker has the power to manipulate the victim agent's interaction with the environment.
We show the attacker's problem of designing a stealthy attack that maximizes its own expected reward.
We argue that the optimal defense policy for the victim can be computed as the solution to a Stackelberg game.
arXiv Detail & Related papers (2023-11-30T21:21:47Z) - Multi-defender Security Games with Schedules [42.32444288821052]
Security games are often used to model strategic interactions in high-stakes security settings.
Many realistic scenarios feature multiple heterogeneous defenders with their own interests and priorities embedded in a more complex system.
We show that unlike prior work on multi-defender security games, the introduction of schedules can cause non-existence of equilibrium.
arXiv Detail & Related papers (2023-11-28T00:39:02Z) - Baseline Defenses for Adversarial Attacks Against Aligned Language
Models [109.75753454188705]
Recent work shows that text moderations can produce jailbreaking prompts that bypass defenses.
We look at three types of defenses: detection (perplexity based), input preprocessing (paraphrase and retokenization), and adversarial training.
We find that the weakness of existing discretes for text, combined with the relatively high costs of optimization, makes standard adaptive attacks more challenging for LLMs.
arXiv Detail & Related papers (2023-09-01T17:59:44Z) - Data-Scarce Identification of Game Dynamics via Sum-of-Squares Optimization [29.568222003322344]
We introduce the Side-Information Assisted Regression (SIAR) framework, designed to identify game dynamics in multiplayer normal-form games.
SIAR is solved using sum-of-squares (SOS) optimization, resulting in a hierarchy of approximations that provably converge to the true dynamics of the system.
We showcase that the SIAR framework accurately predicts player behavior across a spectrum of normal-form games, widely-known families of game dynamics, and strong benchmarks, even if the unknown system is chaotic.
arXiv Detail & Related papers (2023-07-13T09:14:48Z) - A Game-theoretic Framework for Privacy-preserving Federated Learning [46.479165992905166]
We propose the first game-theoretic framework that considers both defenders and attackers in terms of their respective payoffs.
We name this game the federated learning privacy game (FLPG), in which neither defenders nor attackers are aware of all participants' payoffs.
arXiv Detail & Related papers (2023-04-11T14:20:31Z) - Learning Correlated Stackelberg Equilibrium in General-Sum
Multi-Leader-Single-Follower Games [16.810700878778007]
We study a hierarchical multi-player game structure, where players with asymmetric roles can be separated into leaders and followers.
In particular, we focus on a Stackelberg game scenario where there are multiple leaders and a single follower.
We propose a novel asymmetric equilibrium concept for the MLSF game called Correlated Stackelberg Equilibrium (CSE)
arXiv Detail & Related papers (2022-10-22T15:05:44Z) - Fixed Points in Cyber Space: Rethinking Optimal Evasion Attacks in the
Age of AI-NIDS [70.60975663021952]
We study blackbox adversarial attacks on network classifiers.
We argue that attacker-defender fixed points are themselves general-sum games with complex phase transitions.
We show that a continual learning approach is required to study attacker-defender dynamics.
arXiv Detail & Related papers (2021-11-23T23:42:16Z) - Adversarial Online Learning with Variable Plays in the Pursuit-Evasion
Game: Theoretical Foundations and Application in Connected and Automated
Vehicle Cybersecurity [5.9774834479750805]
We extend the adversarial/non-stochastic multi-play multi-armed bandit (MPMAB) to the case where the number of arms to play is variable.
The work is motivated by the fact that the resources allocated to scan different critical locations in an interconnected transportation system change dynamically over time and depending on the environment.
arXiv Detail & Related papers (2021-10-26T23:09:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.