Complexity and Algorithms for Exploiting Quantal Opponents in Large
Two-Player Games
- URL: http://arxiv.org/abs/2009.14521v2
- Date: Wed, 16 Dec 2020 12:11:43 GMT
- Title: Complexity and Algorithms for Exploiting Quantal Opponents in Large
Two-Player Games
- Authors: David Milec, Jakub Černý, Viliam Lisý, Bo An
- Abstract summary: Solution concepts of traditional game theory assume entirely rational players; therefore, their ability to exploit subrational opponents is limited.
This paper aims to analyze and propose scalable algorithms for computing effective and robust strategies against a quantal opponent in normal-form and extensive-form games.
- Score: 16.43565579998679
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Solution concepts of traditional game theory assume entirely rational
players; therefore, their ability to exploit subrational opponents is limited.
One type of subrationality that describes human behavior well is the quantal
response. While there exist algorithms for computing solutions against quantal
opponents, they either do not scale or may provide strategies that are even
worse than the entirely-rational Nash strategies. This paper aims to analyze
and propose scalable algorithms for computing effective and robust strategies
against a quantal opponent in normal-form and extensive-form games. Our
contributions are: (1) we define two different solution concepts related to
exploiting quantal opponents and analyze their properties; (2) we prove that
computing these solutions is computationally hard; (3) therefore, we evaluate
several heuristic approximations based on scalable counterfactual regret
minimization (CFR); and (4) we identify a CFR variant that exploits the bounded
opponents better than the previously used variants while being less exploitable
by the worst-case perfectly-rational opponent.
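The logit quantal response the abstract refers to assigns action probabilities proportional to the exponential of a rationality parameter times expected payoff. A minimal sketch of exploiting such an opponent in a zero-sum normal-form game is below; the payoff matrix, the rationality parameter `lam`, and the grid-search exploiter are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def quantal_response(utils, lam):
    """Logit quantal response: probabilities proportional to
    exp(lam * utility). lam -> infinity recovers a best response;
    lam = 0 gives uniform random play."""
    z = lam * np.asarray(utils, dtype=float)
    z -= z.max()                      # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

def value_vs_quantal(A, x, lam):
    """Row player's expected payoff in the zero-sum game A when
    committing to mixed strategy x against a quantal column player."""
    col_payoffs = -(x @ A)            # column player receives -A
    q = quantal_response(col_payoffs, lam)
    return float((x @ A) @ q)

# A small illustrative zero-sum game (not from the paper).
A = np.array([[2.0, -1.0],
              [-1.0, 1.0]])
lam = 1.0

# The row player's Nash strategy here is x = (0.4, 0.6), game value 0.2;
# against it the quantal opponent is indifferent, so Nash earns exactly 0.2.
nash_value = value_vs_quantal(A, np.array([0.4, 0.6]), lam)

# A simple grid search over row mixed strategies finds a commitment
# that exploits the quantal opponent's mistakes beyond the Nash value.
grid = [np.array([p, 1.0 - p]) for p in np.linspace(0.0, 1.0, 1001)]
best_value = max(value_vs_quantal(A, x, lam) for x in grid)
```

The gap between `best_value` and `nash_value` is the exploitation gain the paper studies; its hardness results concern computing such exploiting (and robust) strategies in games far too large for this kind of enumeration.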
Related papers
- Imperfect-Recall Games: Equilibrium Concepts and Their Complexity [74.01381499760288]
We investigate optimal decision making under imperfect recall, that is, when an agent forgets information it once held before.
In the framework of extensive-form games with imperfect recall, we analyze the computational complexities of finding equilibria in multiplayer settings.
arXiv Detail & Related papers (2024-06-23T00:27:28Z)
- Numerical solution of a PDE arising from prediction with expert advice [2.048226951354646]
This work investigates the online machine learning problem of prediction with expert advice in an adversarial setting.
The continuum limit of this game over a large number of steps is a degenerate elliptic equation whose solution encodes the optimal strategies for both players.
arXiv Detail & Related papers (2024-06-09T12:17:05Z)
- Variational Methods for Computing Non-Local Quantum Strategies [1.95414377613382]
In a nonlocal game, two noncommunicating players cooperate to convince a referee that they possess a strategy that does not violate the rules of the game.
We show that our algorithm is capable of generating a short-depth circuit that implements an optimal quantum strategy for a graph coloring game.
arXiv Detail & Related papers (2023-11-02T16:17:18Z)
- Robust and Space-Efficient Dual Adversary Quantum Query Algorithms [0.0]
The general adversary dual is a powerful tool in quantum computing.
It gives a query-optimal bounded-error quantum algorithm deciding any Boolean function.
We develop a robust dual adversary algorithm that can handle approximately satisfied constraints.
arXiv Detail & Related papers (2023-06-26T19:59:30Z)
- Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games [70.19141208203227]
We consider the problem of decentralized multi-agent reinforcement learning in Markov games.
We show that no algorithm attains no-regret in general-sum games when executed independently by all players.
We show that our lower bounds hold even in the seemingly easier setting in which all agents are controlled by a centralized algorithm.
arXiv Detail & Related papers (2023-03-22T03:28:12Z)
- Bayesian Opponent Modeling in Multiplayer Imperfect-Information Games [1.4502611532302037]
We present an approach for opponent modeling in multiplayer imperfect-information games.
We run experiments against a variety of real opponents and exact Nash equilibrium strategies in three-player Kuhn poker.
Our algorithm significantly outperforms all of the agents, including the exact Nash equilibrium strategies.
arXiv Detail & Related papers (2022-12-12T16:48:53Z)
- Model-Free Online Learning in Unknown Sequential Decision Making Problems and Games [114.90723492840499]
In large two-player zero-sum imperfect-information games, modern extensions of counterfactual regret minimization (CFR) are currently the practical state of the art for computing a Nash equilibrium.
We formalize an online learning setting in which the strategy space is not known to the agent.
We give an efficient algorithm that achieves $O(T^{3/4})$ regret with high probability in that setting, even when the agent faces an adversarial environment.
arXiv Detail & Related papers (2021-03-08T04:03:24Z)
- Provably Efficient Policy Gradient Methods for Two-Player Zero-Sum Markov Games [95.70078702838654]
This paper studies natural extensions of the Natural Policy Gradient algorithm for solving two-player zero-sum games.
We thoroughly characterize the algorithms' performance in terms of the number of samples, number of iterations, concentrability coefficients, and approximation error.
arXiv Detail & Related papers (2021-02-17T17:49:57Z)
- Hindsight and Sequential Rationality of Correlated Play [18.176128899338433]
We look at algorithms that ensure strong performance in hindsight relative to what could have been achieved with modified behavior.
We develop and advocate for this hindsight framing of learning in general sequential decision-making settings.
We present examples illustrating the distinct strengths and weaknesses of each type of equilibrium in the literature.
arXiv Detail & Related papers (2020-12-10T18:30:21Z)
- Learning to Play Sequential Games versus Unknown Opponents [93.8672371143881]
We consider a repeated sequential game between a learner, who plays first, and an opponent who responds to the chosen action.
We propose a novel algorithm for the learner when playing against an adversarial sequence of opponents.
Our results include regret guarantees for the algorithm that depend on the regularity of the opponent's response.
arXiv Detail & Related papers (2020-07-10T09:33:05Z)
- Efficient exploration of zero-sum stochastic games [83.28949556413717]
We investigate the increasingly important and common game-solving setting where we do not have an explicit description of the game but only oracle access to it through gameplay.
During a limited-duration learning phase, the algorithm can control the actions of both players in order to try to learn the game and how to play it well.
Our motivation is to quickly learn strategies that have low exploitability in situations where evaluating the payoffs of a queried strategy profile is costly.
arXiv Detail & Related papers (2020-02-24T20:30:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.