Learning Rationalizable Equilibria in Multiplayer Games
- URL: http://arxiv.org/abs/2210.11402v1
- Date: Thu, 20 Oct 2022 16:49:00 GMT
- Title: Learning Rationalizable Equilibria in Multiplayer Games
- Authors: Yuanhao Wang, Dingwen Kong, Yu Bai, Chi Jin
- Abstract summary: Existing algorithms require a number of samples exponential in the number of players to learn rationalizable equilibria under bandit feedback.
This paper develops the first line of efficient algorithms for learning rationalizable Coarse Correlated Equilibria (CCE) and Correlated Equilibria (CE).
Our algorithms incorporate several novel techniques to guarantee rationalizability and no (swap-)regret simultaneously, including a correlated exploration scheme and adaptive learning rates.
- Score: 38.922957434291554
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A natural goal in multiagent learning besides finding equilibria is to learn
rationalizable behavior, where players learn to avoid iteratively dominated
actions. However, even in the basic setting of multiplayer general-sum games,
existing algorithms require a number of samples exponential in the number of
players to learn rationalizable equilibria under bandit feedback. This paper
develops the first line of efficient algorithms for learning rationalizable
Coarse Correlated Equilibria (CCE) and Correlated Equilibria (CE) whose sample
complexities are polynomial in all problem parameters including the number of
players. To achieve this result, we also develop a new efficient algorithm for
the simpler task of finding one rationalizable action profile (not necessarily
an equilibrium), whose sample complexity substantially improves over the best
existing results of Wu et al. (2021). Our algorithms incorporate several novel
techniques to guarantee rationalizability and no (swap-)regret simultaneously,
including a correlated exploration scheme and adaptive learning rates, which
may be of independent interest. We complement our results with a sample
complexity lower bound showing the sharpness of our guarantees.
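To make these notions concrete, below is a minimal Python sketch under strong simplifying assumptions: it takes a small normal-form game as full payoff tensors with full-information feedback, whereas the paper's algorithms work under bandit feedback and rely on a correlated exploration scheme. The sketch first removes iteratively strictly dominated actions (so the survivors are rationalizable in the pure-domination sense), then runs Hedge (exponential weights) with a time-decaying learning rate over the surviving actions; the empirical distribution of joint play approximates a CCE of the reduced game. All names and parameters are illustrative and not taken from the paper.
```python
# Illustrative sketch only -- not the paper's algorithm (which handles bandit feedback).
import itertools
import numpy as np


def iterated_elimination(payoffs, n_players, n_actions, max_rounds=50):
    """Iteratively remove strictly dominated actions (pure-strategy domination only)."""
    surviving = [list(range(n_actions)) for _ in range(n_players)]
    for _ in range(max_rounds):
        changed = False
        for i in range(n_players):
            opp_profiles = list(itertools.product(
                *[surviving[j] for j in range(n_players) if j != i]))

            def u(a_i, opp):
                joint = list(opp)
                joint.insert(i, a_i)
                return payoffs[i][tuple(joint)]

            dominated = [a for a in surviving[i]
                         if any(all(u(b, opp) > u(a, opp) for opp in opp_profiles)
                                for b in surviving[i] if b != a)]
            if dominated:
                surviving[i] = [a for a in surviving[i] if a not in dominated]
                changed = True
        if not changed:
            break
    return surviving


def hedge_dynamics(payoffs, surviving, n_players, T=2000, seed=0):
    """Run Hedge for each player over its surviving (rationalizable) actions.

    The empirical distribution of the joint plays in `history` approximates a
    CCE of the reduced game.
    """
    rng = np.random.default_rng(seed)
    weights = [np.ones(len(s)) for s in surviving]
    history = []
    for t in range(1, T + 1):
        # Adaptive (time-decaying) learning rate, as in standard Hedge analyses.
        eta = np.sqrt(np.log(1 + max(len(s) for s in surviving)) / t)
        policies = [w / w.sum() for w in weights]
        joint = [int(rng.choice(surviving[i], p=policies[i])) for i in range(n_players)]
        history.append(tuple(joint))
        for i in range(n_players):
            # Full-information counterfactual payoffs; the paper instead estimates
            # these quantities from bandit feedback.
            gains = np.array([payoffs[i][tuple(joint[:i] + [a] + joint[i + 1:])]
                              for a in surviving[i]])
            weights[i] *= np.exp(eta * gains)
            weights[i] /= weights[i].max()  # numerical stability
    return history


# Example: two players, three actions each, random payoffs in [0, 1].
rng = np.random.default_rng(1)
payoffs = [rng.random((3, 3)) for _ in range(2)]
surv = iterated_elimination(payoffs, n_players=2, n_actions=3)
plays = hedge_dynamics(payoffs, surv, n_players=2)
```
In contrast to this full-information toy, the paper's contribution is to make both steps sample-efficient under bandit feedback, with sample complexity polynomial in all problem parameters, including the number of players.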
Related papers
- Tractable Equilibrium Computation in Markov Games through Risk Aversion [12.980882140751895]
We show that risk-averse quantal response equilibria (RQE) become tractable to compute in all $n$-player matrix and finite-horizon Markov games.
RQE is independent of the underlying game structure and only depends on agents' degree of risk-aversion and bounded rationality.
arXiv Detail & Related papers (2024-06-20T09:53:56Z)
- Provably Efficient Information-Directed Sampling Algorithms for Multi-Agent Reinforcement Learning [50.92957910121088]
This work designs and analyzes a novel set of algorithms for multi-agent reinforcement learning (MARL) based on the principle of information-directed sampling (IDS).
For episodic two-player zero-sum MGs, we present three sample-efficient algorithms for learning Nash equilibrium.
We extend Reg-MAIDS to multi-player general-sum MGs and prove that it can learn either the Nash equilibrium or coarse correlated equilibrium in a sample efficient manner.
arXiv Detail & Related papers (2024-04-30T06:48:56Z) - On the Complexity of Multi-Agent Decision Making: From Learning in Games
to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z)
- Representation Learning with Multi-Step Inverse Kinematics: An Efficient and Optimal Approach to Rich-Observation RL [106.82295532402335]
Existing reinforcement learning algorithms suffer from computational intractability, strong statistical assumptions, and suboptimal sample complexity.
We provide the first computationally efficient algorithm that attains rate-optimal sample complexity with respect to the desired accuracy level.
Our algorithm, MusIK, combines systematic exploration with representation learning based on multi-step inverse kinematics.
arXiv Detail & Related papers (2023-04-12T14:51:47Z)
- Batch Active Learning from the Perspective of Sparse Approximation [12.51958241746014]
Active learning enables efficient model training by leveraging interactions between machine learning agents and human annotators.
We study and propose a novel framework that formulates batch active learning from the perspective of sparse approximation.
Our active learning method aims to find an informative subset from the unlabeled data pool such that the corresponding training loss function approximates its full data pool counterpart.
arXiv Detail & Related papers (2022-11-01T03:20:28Z)
- On-Demand Sampling: Learning Optimally from Multiple Distributions [63.20009081099896]
Social and real-world considerations have given rise to multi-distribution learning paradigms.
We establish the optimal sample complexity of these learning paradigms and give algorithms that meet this sample complexity.
Our algorithm design and analysis are enabled by our extensions of online learning techniques for solving zero-sum games.
arXiv Detail & Related papers (2022-10-22T19:07:26Z)
- Regret Minimization and Convergence to Equilibria in General-sum Markov Games [57.568118148036376]
We present the first algorithm for learning in general-sum Markov games that provides sublinear regret guarantees when executed by all agents.
Our algorithm is decentralized, computationally efficient, and does not require any communication between agents.
arXiv Detail & Related papers (2022-07-28T16:27:59Z)
- Exploration-Exploitation in Multi-Agent Competition: Convergence with Bounded Rationality [21.94743452608215]
We study smooth Q-learning, a prototypical learning model that captures the balance between game rewards and exploration costs.
We show that smooth Q-learning always converges to the unique quantal-response equilibrium (QRE), the standard solution concept for games under bounded rationality.
arXiv Detail & Related papers (2021-06-24T11:43:38Z)
- Hindsight and Sequential Rationality of Correlated Play [18.176128899338433]
We look at algorithms that ensure strong performance in hindsight relative to what could have been achieved with modified behavior.
We develop and advocate for this hindsight framing of learning in general sequential decision-making settings.
We present examples illustrating the distinct strengths and weaknesses of each type of equilibrium in the literature.
arXiv Detail & Related papers (2020-12-10T18:30:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.