Complexity and Algorithms for Exploiting Quantal Opponents in Large
Two-Player Games
- URL: http://arxiv.org/abs/2009.14521v2
- Date: Wed, 16 Dec 2020 12:11:43 GMT
- Title: Complexity and Algorithms for Exploiting Quantal Opponents in Large
Two-Player Games
- Authors: David Milec, Jakub Černý, Viliam Lisý, Bo An
- Abstract summary: Solution concepts of traditional game theory assume entirely rational players; therefore, their ability to exploit subrational opponents is limited.
This paper aims to analyze and propose scalable algorithms for computing effective and robust strategies against a quantal opponent in normal-form and extensive-form games.
- Score: 16.43565579998679
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Solution concepts of traditional game theory assume entirely rational
players; therefore, their ability to exploit subrational opponents is limited.
One type of subrationality that describes human behavior well is the quantal
response. While there exist algorithms for computing solutions against quantal
opponents, they either do not scale or may provide strategies that are even
worse than the entirely-rational Nash strategies. This paper aims to analyze
and propose scalable algorithms for computing effective and robust strategies
against a quantal opponent in normal-form and extensive-form games. Our
contributions are: (1) we define two different solution concepts related to
exploiting quantal opponents and analyze their properties; (2) we prove that
computing these solutions is computationally hard; (3) therefore, we evaluate
several heuristic approximations based on scalable counterfactual regret
minimization (CFR); and (4) we identify a CFR variant that exploits the bounded
opponents better than the previously used variants while being less exploitable
by the worst-case perfectly-rational opponent.
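The logit quantal response at the center of the paper has a simple closed form: the subrational opponent plays each action with probability proportional to exp(λ·u(a)), where λ = 0 gives uniform play and λ → ∞ recovers a rational best response. A minimal sketch, assuming a logit model (the matching-pennies matrix and function names below are illustrative, not taken from the paper):

```python
import numpy as np

def quantal_response(payoffs, lam):
    """Logit quantal response: play each action with probability
    proportional to exp(lam * payoff). lam = 0 gives uniform play;
    lam -> infinity approaches a rational best response."""
    z = lam * np.asarray(payoffs, dtype=float)
    z -= z.max()                      # stabilize the exponentials
    p = np.exp(z)
    return p / p.sum()

# Illustrative 2x2 zero-sum game (matching pennies), row player's payoffs.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

def value_against_quantal(row_strategy, A, lam):
    """Row player's expected payoff when the column player quantal-responds
    to its own expected payoffs (-A^T x in a zero-sum game)."""
    q = quantal_response(-A.T @ row_strategy, lam)
    return row_strategy @ A @ q
```

A strategy exploiting such an opponent maximizes `value_against_quantal` over row strategies; the paper's hardness results concern exactly this optimization in large games.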
Related papers
- Anytime-Constrained Multi-Agent Reinforcement Learning [6.981971551979697]
We introduce anytime constraints to the multi-agent setting, with the corresponding solution concept being the anytime-constrained equilibrium (ACE).
We present a comprehensive theory of anytime-constrained Markov games, which includes a computational characterization of feasible policies.
We also give the first theory of efficient computation for action-constrained Markov games, which may be of independent interest.
arXiv Detail & Related papers (2024-10-31T05:07:01Z)
- Imperfect-Recall Games: Equilibrium Concepts and Their Complexity [74.01381499760288]
We investigate optimal decision making under imperfect recall, that is, when an agent forgets information it once held.
In the framework of extensive-form games with imperfect recall, we analyze the computational complexities of finding equilibria in multiplayer settings.
arXiv Detail & Related papers (2024-06-23T00:27:28Z)
- Numerical solution of a PDE arising from prediction with expert advice [2.048226951354646]
This work investigates the online machine learning problem of prediction with expert advice in an adversarial setting.
The continuum limit of this game over a large number of steps is a degenerate elliptic equation whose solution encodes the optimal strategies for both players.
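The discrete game underlying this continuum limit is the classic prediction-with-expert-advice setting, commonly played with the exponential-weights forecaster. A minimal sketch under that standard formulation (the function name and η value are illustrative, not from the paper):

```python
import math

def exponential_weights(loss_rounds, eta=0.5):
    """Exponential-weights forecaster for prediction with expert advice.
    loss_rounds: per-round lists of expert losses in [0, 1].
    Returns the final weight distribution over experts."""
    log_w = [0.0] * len(loss_rounds[0])
    for losses in loss_rounds:
        # Downweight each expert in proportion to its incurred loss.
        log_w = [lw - eta * l for lw, l in zip(log_w, losses)]
    m = max(log_w)                     # normalize in log space for stability
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return [x / total for x in w]
```

After enough rounds the weight concentrates on the best-performing expert, which is the behavior whose limiting dynamics the paper's PDE describes.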
arXiv Detail & Related papers (2024-06-09T12:17:05Z)
- Securing Equal Share: A Principled Approach for Learning Multiplayer Symmetric Games [21.168085154982712]
Equilibria in multiplayer games are neither unique nor non-exploitable.
This paper takes an initial step towards addressing these challenges by focusing on the natural objective of equal share.
We design a series of efficient algorithms, inspired by no-regret learning, that provably attain approximate equal share across various settings.
arXiv Detail & Related papers (2024-06-06T15:59:17Z)
- Variational Methods for Computing Non-Local Quantum Strategies [1.95414377613382]
In a nonlocal game, two noncommunicating players cooperate to convince a referee that they possess a strategy that does not violate the rules of the game.
We show that our algorithm is capable of generating a short-depth circuit that implements an optimal quantum strategy for a graph coloring game.
arXiv Detail & Related papers (2023-11-02T16:17:18Z)
- Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games [70.19141208203227]
We consider the problem of decentralized multi-agent reinforcement learning in Markov games.
We show that no algorithm attains no-regret in general-sum games when executed independently by all players.
We show that our lower bounds hold even for the seemingly easier setting in which all agents are controlled by a centralized algorithm.
arXiv Detail & Related papers (2023-03-22T03:28:12Z)
- Model-Free Online Learning in Unknown Sequential Decision Making Problems and Games [114.90723492840499]
In large two-player zero-sum imperfect-information games, modern extensions of counterfactual regret minimization (CFR) are currently the practical state of the art for computing a Nash equilibrium.
We formalize an online learning setting in which the strategy space is not known to the agent.
We give an efficient algorithm that achieves $O(T^{3/4})$ regret with high probability for that setting, even when the agent faces an adversarial environment.
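The regret-matching update at the heart of CFR-style methods is simple to state: accumulate per-action regrets and play in proportion to their positive parts. A self-contained sketch for a zero-sum matrix game (the example payoff matrix and function names are illustrative, not from the paper):

```python
import numpy as np

def regret_matching(cum_regret):
    """Play actions in proportion to positive cumulative regret;
    fall back to uniform when no action has positive regret."""
    pos = np.maximum(cum_regret, 0.0)
    total = pos.sum()
    if total <= 0.0:
        return np.full(len(cum_regret), 1.0 / len(cum_regret))
    return pos / total

def self_play(A, iters=5000):
    """Two regret-matching learners in a zero-sum matrix game A (row
    player's payoffs); their average strategies approach a Nash equilibrium."""
    n, m = A.shape
    r_row, r_col = np.zeros(n), np.zeros(m)
    avg_row, avg_col = np.zeros(n), np.zeros(m)
    for _ in range(iters):
        x, y = regret_matching(r_row), regret_matching(r_col)
        avg_row += x
        avg_col += y
        u_row = A @ y                  # row payoffs vs current column play
        u_col = -A.T @ x               # zero-sum: column receives -A
        r_row += u_row - x @ u_row     # instantaneous regret per action
        r_col += u_col - y @ u_col
    return avg_row / iters, avg_col / iters
```

CFR runs this same local update at every information set of an extensive-form game; the main paper's heuristics modify which regrets are accumulated so the resulting strategy exploits a quantal opponent.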
arXiv Detail & Related papers (2021-03-08T04:03:24Z)
- Provably Efficient Policy Gradient Methods for Two-Player Zero-Sum Markov Games [95.70078702838654]
This paper studies natural extensions of Natural Policy Gradient algorithm for solving two-player zero-sum games.
We thoroughly characterize the algorithms' performance in terms of the number of samples, number of iterations, concentrability coefficients, and approximation error.
arXiv Detail & Related papers (2021-02-17T17:49:57Z)
- Hindsight and Sequential Rationality of Correlated Play [18.176128899338433]
We look at algorithms that ensure strong performance in hindsight relative to what could have been achieved with modified behavior.
We develop and advocate for this hindsight framing of learning in general sequential decision-making settings.
We present examples illustrating the distinct strengths and weaknesses of each type of equilibrium in the literature.
arXiv Detail & Related papers (2020-12-10T18:30:21Z)
- Learning to Play Sequential Games versus Unknown Opponents [93.8672371143881]
We consider a repeated sequential game between a learner, who plays first, and an opponent who responds to the chosen action.
We propose a novel algorithm for the learner when playing against an adversarial sequence of opponents.
Our results include algorithm's regret guarantees that depend on the regularity of the opponent's response.
arXiv Detail & Related papers (2020-07-10T09:33:05Z)
- Efficient exploration of zero-sum stochastic games [83.28949556413717]
We investigate the increasingly important and common game-solving setting where we do not have an explicit description of the game but only oracle access to it through gameplay.
During a limited-duration learning phase, the algorithm can control the actions of both players in order to try to learn the game and how to play it well.
Our motivation is to quickly learn strategies that have low exploitability in situations where evaluating the payoffs of a queried strategy profile is costly.
arXiv Detail & Related papers (2020-02-24T20:30:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.