Payoff-based learning with matrix multiplicative weights in quantum
games
- URL: http://arxiv.org/abs/2311.02423v1
- Date: Sat, 4 Nov 2023 14:56:17 GMT
- Title: Payoff-based learning with matrix multiplicative weights in quantum
games
- Authors: Kyriakos Lotidis and Panayotis Mertikopoulos and Nicholas Bambos and
Jose Blanchet
- Abstract summary: We study the problem of learning in quantum games - and other classes of semidefinite games - with scalar, payoff-based feedback.
We introduce a suite of minimal-information matrix multiplicative weights (3MW) methods tailored to different information frameworks.
We show that the 3MW method with deterministic payoff feedback retains the convergence rate of the vanilla, full information MMW algorithm in quantum min-max games.
- Score: 35.111876815522116
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we study the problem of learning in quantum games - and other
classes of semidefinite games - with scalar, payoff-based feedback. For
concreteness, we focus on the widely used matrix multiplicative weights (MMW)
algorithm and, instead of requiring players to have full knowledge of the game
(and/or each other's chosen states), we introduce a suite of
minimal-information matrix multiplicative weights (3MW) methods tailored to
different information frameworks. The main difficulty to attaining convergence
in this setting is that, in contrast to classical finite games, quantum games
have an infinite continuum of pure states (the quantum equivalent of pure
strategies), so standard importance-weighting techniques for estimating payoff
vectors cannot be employed. Instead, we borrow ideas from bandit convex
optimization and we design a zeroth-order gradient sampler adapted to the
semidefinite geometry of the problem at hand. As a first result, we show that
the 3MW method with deterministic payoff feedback retains the
$\mathcal{O}(1/\sqrt{T})$ convergence rate of the vanilla, full information MMW
algorithm in quantum min-max games, even though the players only observe a
single scalar. Subsequently, we relax the algorithm's information requirements
even further and we provide a 3MW method that only requires players to observe
a random realization of their payoff observable, and converges to equilibrium
at an $\mathcal{O}(T^{-1/4})$ rate. Finally, going beyond zero-sum games, we
show that a regularized variant of the proposed 3MW method guarantees local
convergence with high probability to all equilibria that satisfy a certain
first-order stability condition.
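For intuition, the vanilla full-information MMW update discussed above can be written down concretely. The snippet below is a minimal single-player sketch with a fixed, hypothetical payoff observable `P`: the state is driven toward the top eigenvector of `P`, so `tr(X P)` approaches its largest eigenvalue. The paper's 3MW variants would replace the exact gradient `P` with a scalar-feedback estimate, which is not reproduced here.

```python
import numpy as np
from scipy.linalg import expm

def mmw_step(Y, grad, eta):
    # Accumulate the payoff gradient in the dual variable Y, then map back
    # to a density matrix via the trace-normalized matrix exponential
    # (the quantum analogue of the logit/softmax choice map).
    Y = Y + eta * grad
    X = expm(Y)
    return Y, X / np.trace(X)

# Hypothetical 2x2 Hermitian payoff observable (illustration only).
P = np.array([[1.0, 0.5],
              [0.5, -0.2]])
Y = np.zeros((2, 2))
for _ in range(500):
    Y, X = mmw_step(Y, P, eta=0.1)
# X is now concentrated on the top eigenvector of P,
# so tr(X P) is close to the largest eigenvalue of P.
```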
Related papers
- $\mathcal{PT}$-symmetric mapping of three states and its implementation on a cloud quantum processor [0.9599644507730107]
We develop a new $\mathcal{PT}$-symmetric approach for mapping three pure qubit states.
We show consistency with the Hermitian case, conservation of average projections on reference vectors, and Quantum Fisher Information.
Our work opens new avenues for applying $\mathcal{PT}$-symmetry in quantum communication, computing, and cryptography.
arXiv Detail & Related papers (2023-12-27T18:51:33Z) - Scalable and Independent Learning of Nash Equilibrium Policies in
$n$-Player Stochastic Games with Unknown Independent Chains [1.0878040851638]
We study games with independent chains and unknown transition matrices.
In this class of games, players control their own internal Markov chains whose transitions do not depend on the states/actions of other players.
We propose a fully decentralized mirror descent algorithm to learn an $\epsilon$-NE policy.
arXiv Detail & Related papers (2023-12-04T03:04:09Z) - A Quadratic Speedup in Finding Nash Equilibria of Quantum Zero-Sum Games [102.46640028830441]
We introduce the Optimistic Matrix Multiplicative Weights Update (OMMWU) algorithm and establish its average-iterate convergence complexity as $\mathcal{O}(d/\epsilon)$ to $\epsilon$-Nash equilibria.
This quadratic speed-up sets a new benchmark for computing $\epsilon$-Nash equilibria in quantum zero-sum games.
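For intuition, the optimistic update underlying OMMWU can be sketched in the classical (vector) special case, where density matrices reduce to points on the probability simplex. The 2x2 zero-sum game and step size below are hypothetical illustrations, not taken from the paper.

```python
import numpy as np

def omwu_step(x, g, g_prev, eta):
    # Optimistic multiplicative weights: extrapolate with 2*g_t - g_{t-1},
    # then renormalize back onto the probability simplex.
    w = x * np.exp(eta * (2.0 * g - g_prev))
    return w / w.sum()

# Hypothetical zero-sum game (matching pennies): the row player maximizes
# x^T A y, the column player minimizes it; unique equilibrium is (1/2, 1/2).
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])
x, y = np.array([0.7, 0.3]), np.array([0.6, 0.4])
gx_prev, gy_prev = A @ y, -(A.T @ x)
avg_x = np.zeros(2)
T = 5000
for _ in range(T):
    gx, gy = A @ y, -(A.T @ x)   # gains for the max and min player
    x = omwu_step(x, gx, gx_prev, eta=0.05)
    y = omwu_step(y, gy, gy_prev, eta=0.05)
    gx_prev, gy_prev = gx, gy
    avg_x += x
avg_x /= T
# avg_x approaches the equilibrium mixed strategy [0.5, 0.5]
```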
arXiv Detail & Related papers (2023-11-17T20:38:38Z) - Global Nash Equilibrium in Non-convex Multi-player Game: Theory and
Algorithms [66.8634598612777]
We show that a Nash equilibrium (NE) is acceptable to all players in a multi-player game, in the sense that no player can benefit from a unilateral deviation, and we develop theory and algorithms for finding global NE in non-convex games.
arXiv Detail & Related papers (2023-01-19T11:36:50Z) - Learning in Multi-Player Stochastic Games [1.0878040851638]
We consider the problem of simultaneous learning in games with many players in the finite-horizon setting.
While the typical target solution for a game is a Nash equilibrium, this is intractable with many players.
We turn to a different target: algorithms which generate an equilibrium when they are used by all players.
arXiv Detail & Related papers (2022-10-25T19:02:03Z) - Learning Correlated Equilibria in Mean-Field Games [62.14589406821103]
We develop the concepts of Mean-Field correlated and coarse-correlated equilibria.
We show that they can be efficiently learnt in all games, without requiring any additional assumption on the structure of the game.
arXiv Detail & Related papers (2022-08-22T08:31:46Z) - Chaos, Extremism and Optimism: Volume Analysis of Learning in Games [55.24050445142637]
We present volume analyses of Multiplicative Weights Updates (MWU) and Optimistic Multiplicative Weights Updates (OMWU) in zero-sum as well as coordination games.
We show that OMWU contracts volume, providing an alternative understanding for its known convergent behavior.
We also prove a no-free-lunch type of theorem, in the sense that when examining coordination games the roles are reversed: OMWU expands volume exponentially fast, whereas MWU contracts.
arXiv Detail & Related papers (2020-05-28T13:47:09Z) - Learning Zero-Sum Simultaneous-Move Markov Games Using Function
Approximation and Correlated Equilibrium [116.56359444619441]
We develop provably efficient reinforcement learning algorithms for two-player zero-sum finite-horizon Markov games.
In the offline setting, we control both players and aim to find the Nash Equilibrium by minimizing the duality gap.
In the online setting, we control a single player playing against an arbitrary opponent and aim to minimize the regret.
arXiv Detail & Related papers (2020-02-17T17:04:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.