Playing against no-regret players
- URL: http://arxiv.org/abs/2202.09364v1
- Date: Wed, 16 Feb 2022 10:12:27 GMT
- Title: Playing against no-regret players
- Authors: Maurizio D'Andrea (ANITI, TSE)
- Abstract summary: We consider the concept of Stackelberg equilibrium in n-player games.
In our first result, we show, with counterexamples, that this result is no longer true if the optimizer has to face more than one player.
We prove that the optimizer can guarantee at least the correlated Stackelberg value per round.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In a growing variety of contexts, a human player has to
interact with artificial players whose decisions are produced by
decision-making algorithms. How should the human player play against these algorithms to
maximize his utility? Does anything change if he faces one or more artificial
players? The main goal of the paper is to answer these two questions. Consider
n-player games in normal form repeated over time, where we call the human
player the optimizer and the (n - 1) artificial players the learners. We assume that
learners play no-regret algorithms, a class of algorithms widely used in online
learning and decision-making. In these games, we consider the concept of
Stackelberg equilibrium. In a recent paper, Deng, Schneider, and Sivan have
shown that in a 2-player game the optimizer can always guarantee an expected
cumulative utility of at least the Stackelberg value per round. In our first
result, we show, with counterexamples, that this result is no longer true if
the optimizer has to face more than one player. Therefore, we generalize the
definition of Stackelberg equilibrium introducing the concept of correlated
Stackelberg equilibrium. Finally, in the main result, we prove that the
optimizer can guarantee at least the correlated Stackelberg value per round.
Moreover, using a version of the strong law of large numbers, we show that our
result is also true almost surely for the optimizer utility instead of the
optimizer's expected utility.
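The setting in the abstract can be illustrated with a minimal simulation. The sketch below is not from the paper: the 2x2 payoff matrices, the commitment strategy, and the Hedge parameters are illustrative assumptions. The optimizer commits to a mixed strategy close to its Stackelberg strategy, the learner runs Hedge (a standard no-regret algorithm), and the optimizer's average utility approaches the Stackelberg value of the assumed game (about 2.5 here).

```python
import numpy as np

# Hypothetical 2x2 game, chosen for illustration only.
A = np.array([[1.0, 3.0],   # optimizer's payoffs; rows = optimizer actions
              [0.0, 2.0]])
B = np.array([[1.0, 0.0],   # learner's payoffs; columns = learner actions
              [0.0, 1.0]])

# Optimizer commitment: slightly below the indifference point (0.5, 0.5),
# nudging the learner's best response toward the second column.
x = np.array([0.49, 0.51])

T, eta = 5000, 0.1
w = np.ones(2)              # Hedge weights over the learner's two actions
opt_total = 0.0
for t in range(T):
    q = w / w.sum()                 # learner's mixed action this round
    opt_total += x @ A @ q          # optimizer's expected payoff this round
    w *= np.exp(eta * (x @ B))      # learner's full-information Hedge update

# As the learner's play concentrates on its best response, the optimizer's
# average utility climbs toward the Stackelberg value of this game.
print(f"optimizer average utility: {opt_total / T:.3f}")
```

The paper's counterexamples concern the case of two or more learners, where such a single commitment no longer suffices and the correlated Stackelberg value becomes the right benchmark.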
Related papers
- Is Learning in Games Good for the Learners? [14.781100349601587]
We consider tradeoffs between reward and regret in repeated gameplay between two agents.
We show that any such equilibrium is reachable by a pair of algorithms which maintain their regret guarantees against arbitrary opponents.
We also consider the question of learning reward-optimal strategies via repeated play with a no-regret agent when the game is initially unknown.
arXiv Detail & Related papers (2023-05-31T02:10:27Z)
- Learning Correlated Equilibria in Mean-Field Games [62.14589406821103]
We develop the concepts of Mean-Field correlated and coarse-correlated equilibria.
We show that they can be efficiently learnt in all games, without requiring any additional assumption on the structure of the game.
arXiv Detail & Related papers (2022-08-22T08:31:46Z)
- Strategizing against Learners in Bayesian Games [74.46970859427907]
We study repeated two-player games where one of the players, the learner, employs a no-regret learning strategy.
We consider general Bayesian games, where the payoffs of both the optimizer and the learner could depend on the type.
arXiv Detail & Related papers (2022-05-17T18:10:25Z)
- Robust No-Regret Learning in Min-Max Stackelberg Games [1.6500749121196987]
We investigate the behavior of no-regret learning in min-max games with dependent strategy sets.
We show that no-regret dynamics converge to a Stackelberg equilibrium.
We prove OMD dynamics are robust for a large class of online min-max Stackelberg games.
arXiv Detail & Related papers (2022-03-26T18:12:40Z)
- No-Regret Learning in Time-Varying Zero-Sum Games [99.86860277006318]
Learning from repeated play in a fixed zero-sum game is a classic problem in game theory and online learning.
We develop a single parameter-free algorithm that simultaneously enjoys favorable guarantees under three performance measures.
Our algorithm is based on a two-layer structure with a meta-algorithm learning over a group of black-box base-learners satisfying a certain property.
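The two-layer idea above can be sketched in miniature. This is not the cited paper's algorithm: here the "base-learners" are reduced to abstract payoff streams, and the meta-layer is plain Hedge over them, so the meta-layer's payoff tracks the best base-learner in hindsight; all names and parameters are illustrative assumptions.

```python
import numpy as np

def meta_hedge(base_payoffs, eta=0.05):
    """Hedge run at the meta level over M black-box base-learners.

    base_payoffs: T x M array; entry [t, m] is the payoff base-learner m
    would have earned at round t (payoffs assumed to lie in [0, 1]).
    """
    T, M = base_payoffs.shape
    w = np.ones(M)                        # one weight per base-learner
    earned = 0.0
    for t in range(T):
        p = w / w.sum()                   # trust base-learners in proportion
        earned += p @ base_payoffs[t]     # meta-layer's payoff this round
        w *= np.exp(eta * base_payoffs[t])
    best_base = base_payoffs.sum(axis=0).max()  # best base-learner in hindsight
    return earned, best_base - earned           # payoff and meta-regret

rng = np.random.default_rng(1)
earned, meta_regret = meta_hedge(rng.random((1000, 4)))
print(f"meta-regret per round: {meta_regret / 1000:.4f}")
```

The meta-regret per round vanishes as the horizon grows, which is why a meta-layer can inherit the guarantee of whichever base-learner suits the environment, without knowing it in advance.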
arXiv Detail & Related papers (2022-01-30T06:10:04Z)
- Public Information Representation for Adversarial Team Games [31.29335755664997]
The main difficulty of adversarial team games resides in the asymmetric information available to the team members during the play.
Our algorithms convert a sequential team game with adversaries to a classical two-player zero-sum game.
Due to the NP-hard nature of the problem, the resulting Public Team game may be exponentially larger than the original one.
arXiv Detail & Related papers (2022-01-25T15:07:12Z)
- Can Reinforcement Learning Find Stackelberg-Nash Equilibria in General-Sum Markov Games with Myopic Followers? [156.5760265539888]
We study multi-player general-sum Markov games with one of the players designated as the leader and the other players regarded as followers.
For such a game, our goal is to find a Stackelberg-Nash equilibrium (SNE), which is a policy pair $(\pi^*, \nu^*)$.
We develop sample-efficient reinforcement learning (RL) algorithms for solving for an SNE in both online and offline settings.
arXiv Detail & Related papers (2021-12-27T05:41:14Z)
- Towards convergence to Nash equilibria in two-team zero-sum games [17.4461045395989]
Two-team zero-sum games are defined as multi-player games where players are split into two competing sets of agents.
We focus on the solution concept of Nash equilibria (NE).
We show that computing NE for this class of games is $\textit{hard}$ for the complexity class $\mathrm{CLS}$.
arXiv Detail & Related papers (2021-11-07T21:15:35Z)
- Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games [78.65798135008419]
It remains vastly open how to learn the Stackelberg equilibrium in general-sum games efficiently from samples.
This paper initiates the theoretical study of sample-efficient learning of the Stackelberg equilibrium in two-player turn-based general-sum games.
arXiv Detail & Related papers (2021-02-23T05:11:07Z)
- Faster Algorithms for Optimal Ex-Ante Coordinated Collusive Strategies in Extensive-Form Zero-Sum Games [123.76716667704625]
We focus on the problem of finding an optimal strategy for a team of two players that faces an opponent in an imperfect-information zero-sum extensive-form game.
In that setting, it is known that the best the team can do is sample a profile of potentially randomized strategies (one per player) from a joint (a.k.a. correlated) probability distribution at the beginning of the game.
We provide an algorithm that computes such an optimal distribution by only using profiles where only one of the team members gets to randomize in each profile.
arXiv Detail & Related papers (2020-09-21T17:51:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.