Multi-Agent Reinforcement Learning in Cournot Games
- URL: http://arxiv.org/abs/2009.06224v1
- Date: Mon, 14 Sep 2020 06:53:21 GMT
- Title: Multi-Agent Reinforcement Learning in Cournot Games
- Authors: Yuanyuan Shi, Baosen Zhang
- Abstract summary: We study the interaction of strategic agents in continuous action Cournot games with limited information feedback.
We consider the dynamics of the policy gradient algorithm, which is a widely adopted continuous control reinforcement learning algorithm.
- Score: 6.282068591820945
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we study the interaction of strategic agents in continuous
action Cournot games with limited information feedback. The Cournot game is an
essential market model for many socio-economic systems where agents learn and
compete without full knowledge of the system or of each other. We consider the
dynamics of the policy gradient algorithm, which is a widely adopted continuous
control reinforcement learning algorithm, in concave Cournot games. We prove
the convergence of policy gradient dynamics to the Nash equilibrium when the
price function is linear or the number of agents is two. To the best of our
knowledge, this is the first convergence result for learning algorithms with
continuous action spaces that do not fall into the no-regret class.
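As a toy illustration of the dynamics studied above, the following Python sketch runs independent REINFORCE-style policy gradient in a linear-price Cournot game, where each firm observes only its own profit. The demand parameters, Gaussian policies, and step size are illustrative assumptions, not the paper's exact construction.
```python
import numpy as np

# A toy sketch of independent policy gradient in a linear-price Cournot game:
# each firm samples a quantity from its own Gaussian policy and updates from
# its own profit alone (limited information feedback).
rng = np.random.default_rng(0)

n_agents, a, b, c = 3, 10.0, 1.0, 1.0      # inverse demand p(Q) = a - b*Q, marginal cost c
mu = rng.uniform(0.5, 2.0, n_agents)       # each firm's policy mean
sigma, lr = 0.3, 0.01                      # fixed exploration noise, learning rate

for t in range(20000):
    q = np.maximum(rng.normal(mu, sigma), 0.0)   # sampled quantities (nonnegative)
    price = a - b * q.sum()
    profit = q * (price - c)                     # each firm sees only its own profit
    # REINFORCE score-function estimate: grad_mu log N(q; mu, sigma^2) = (q - mu)/sigma^2
    mu = np.maximum(mu + lr * profit * (q - mu) / sigma**2, 0.0)

q_star = (a - c) / (b * (n_agents + 1))          # symmetric Cournot-Nash quantity
print("learned means:", mu.round(2), "| Nash quantity:", q_star)
```
With these (assumed) parameters the symmetric Cournot-Nash quantity is (a - c)/(b(n + 1)) = 2.25, so the three policy means should drift toward that value, up to sampling noise and the nonnegativity clipping.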
Related papers
- Strategizing against Q-learners: A Control-theoretical Approach [1.3927943269211591]
We quantify how much strategically sophisticated agents can exploit naive Q-learners if they know the opponents' Q-learning algorithm.
We present a quantization-based approximation scheme to tackle the continuum state space.
arXiv Detail & Related papers (2024-03-13T18:54:27Z)
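The quantization idea in the entry above can be sketched in a few lines: bucket a bounded continuous state into a finite grid so that a tabular Q-learner applies. The bounds, bin count, and table shape below are hypothetical.
```python
import numpy as np

# A hypothetical sketch of quantizing a continuous state for a tabular learner.
def make_quantizer(low: float, high: float, n_bins: int):
    edges = np.linspace(low, high, n_bins + 1)[1:-1]      # interior bin edges
    return lambda s: int(np.digitize(np.clip(s, low, high), edges))

quantize = make_quantizer(low=0.0, high=1.0, n_bins=16)
Q = np.zeros((16, 4))                 # 16 discrete states x 4 actions
row = quantize(0.37)                  # continuous state -> table index
greedy_action = int(Q[row].argmax())
```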
- Stability of Multi-Agent Learning in Competitive Networks: Delaying the Onset of Chaos [9.220952628571812]
Behaviour of multi-agent learning in competitive network games is often studied within the context of zero-sum games.
We study the Q-Learning dynamics, a popular model of exploration and exploitation in multi-agent learning.
We find that the stability of Q-Learning is explicitly dependent only on the network connectivity rather than the total number of agents.
arXiv Detail & Related papers (2023-12-19T08:41:06Z)
- Safe Multi-agent Learning via Trapping Regions [89.24858306636816]
We apply the concept of trapping regions, known from qualitative theory of dynamical systems, to create safety sets in the joint strategy space for decentralized learning.
We propose a binary partitioning algorithm for verification that candidate sets form trapping regions in systems with known learning dynamics, and a sampling algorithm for scenarios where learning dynamics are not known.
arXiv Detail & Related papers (2023-02-27T14:47:52Z)
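A hedged sketch of the sampling-based verification from the entry above: a candidate box in joint strategy space is accepted as an (approximate) trapping region if the learning dynamics point inward at every sampled boundary point. The dynamics f and the box are stand-in assumptions; the paper's binary partitioning algorithm is not shown.
```python
import numpy as np

# Accept a candidate box if the dynamics point inward on sampled boundary points.
def inward_on_boundary(f, low, high, n_samples=1000, seed=0):
    rng = np.random.default_rng(seed)
    d = len(low)
    for _ in range(n_samples):
        x = rng.uniform(low, high)
        i = rng.integers(d)                        # pick a face of the box
        on_high = rng.random() < 0.5
        x[i] = high[i] if on_high else low[i]      # project onto that face
        v = f(x)                                   # learning-dynamics vector field
        if (-v[i] if on_high else v[i]) <= 0:      # this component must point inward
            return False
    return True

# Example: gradient dynamics of a simple two-player stable game (assumption).
f = lambda x: np.array([-x[0] + 0.5 * x[1], -x[1] + 0.5 * x[0]])
print(inward_on_boundary(f, np.array([-1.0, -1.0]), np.array([1.0, 1.0])))  # True
```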
- On the Convergence of No-Regret Learning Dynamics in Time-Varying Games [89.96815099996132]
We characterize the convergence of optimistic gradient descent (OGD) in time-varying games.
Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games.
We also provide new insights on dynamic regret guarantees in static games.
arXiv Detail & Related papers (2023-01-26T17:25:45Z)
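A minimal sketch of the optimistic update in the entry above, applied to a time-varying bilinear zero-sum game x^T A_t y. The drifting payoff matrix, step size, and unconstrained strategies are illustrative assumptions.
```python
import numpy as np

# Optimistic gradient descent/ascent: extrapolate with the previous gradient.
rng = np.random.default_rng(1)
x, y = rng.normal(size=2), rng.normal(size=2)
gx_prev, gy_prev = np.zeros(2), np.zeros(2)
eta = 0.05

for t in range(5000):
    A = np.array([[2.0, 1.0], [1.0, 2.0]]) + 0.2 * np.sin(1e-3 * t)  # slow drift
    gx, gy = A @ y, A.T @ x                 # gradients of x^T A y for each player
    x -= eta * (2 * gx - gx_prev)           # optimistic (extrapolated) descent
    y += eta * (2 * gy - gy_prev)           # optimistic ascent
    gx_prev, gy_prev = gx, gy

# Both gradients vanish at the unconstrained bilinear equilibrium, so this
# gap should shrink toward zero:
print("equilibrium gap:", np.linalg.norm(gx) + np.linalg.norm(gy))
```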
- Asymptotic Convergence and Performance of Multi-Agent Q-Learning Dynamics [38.5932141555258]
We study the dynamics of smooth Q-Learning, a popular reinforcement learning algorithm.
We show a sufficient condition on the rate of exploration such that the Q-Learning dynamics is guaranteed to converge to a unique equilibrium in any game.
arXiv Detail & Related papers (2023-01-23T18:39:11Z)
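A toy sketch of the smooth Q-learning dynamics from the entry above: stateless Q-values driven by observed rewards, with actions drawn from a Boltzmann (softmax) policy whose temperature T sets the exploration rate; with T large enough, play settles at a unique smoothed equilibrium, which also illustrates the quantal-response entry below. The 2x2 game and rates are assumptions.
```python
import numpy as np

# Smooth Q-learning in a 2x2 game: Boltzmann exploration over stateless Q-values.
rng = np.random.default_rng(2)
R = [np.array([[3.0, 0.0], [5.0, 1.0]]),   # row player's payoffs (prisoner's dilemma)
     np.array([[3.0, 5.0], [0.0, 1.0]])]   # column player's payoffs
Q = [np.zeros(2), np.zeros(2)]
alpha, T = 0.05, 1.0                       # learning rate, exploration temperature

def boltzmann(q):
    z = np.exp((q - q.max()) / T)
    return z / z.sum()

for t in range(20000):
    acts = [rng.choice(2, p=boltzmann(Q[i])) for i in range(2)]
    rew = [R[0][acts[0], acts[1]], R[1][acts[0], acts[1]]]
    for i in range(2):
        Q[i][acts[i]] += alpha * (rew[i] - Q[i][acts[i]])   # stateless Q update

print("smoothed policies:", [boltzmann(q).round(3) for q in Q])
```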
- Finding mixed-strategy equilibria of continuous-action games without gradients using randomized policy networks [83.28949556413717]
We study the problem of computing an approximate Nash equilibrium of a continuous-action game without access to gradients.
We model players' strategies using artificial neural networks.
This paper is the first to solve general continuous-action games with unrestricted mixed strategies and without any gradient information.
arXiv Detail & Related papers (2022-11-29T05:16:41Z)
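A hedged sketch of the gradient-free idea in the entry above: a player's mixed strategy is the pushforward of noise through a small policy network, and parameters are improved with a zeroth-order (evolution-strategies-style) payoff estimate rather than analytical gradients. The game, network, opponent stub, and hyperparameters are all illustrative assumptions, and only one player's update loop is shown.
```python
import numpy as np

# Zeroth-order training of a randomized policy: perturb, evaluate, average.
rng = np.random.default_rng(3)

def act(theta, z):
    return np.tanh(theta @ z)               # noise z -> randomized action in (-1, 1)

def utility(a_self, a_opp):
    return -(a_self - 0.5) ** 2 + 0.1 * a_self * a_opp   # stand-in payoff

theta = 0.1 * rng.normal(size=4)            # one player's policy parameters
sigma, lr, batch = 0.1, 0.02, 64            # ES perturbation scale, step, batch

for step in range(500):
    grad = np.zeros_like(theta)
    for _ in range(batch):
        eps = rng.normal(size=4)            # parameter perturbation
        z = rng.normal(size=4)              # strategy-randomizing noise
        u = utility(act(theta + sigma * eps, z), rng.uniform(-1, 1))
        grad += u * eps / (batch * sigma)   # score-function estimate of the gradient
    theta += lr * grad                      # ascend the estimated expected payoff
```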
- Exploration-Exploitation in Multi-Agent Competition: Convergence with Bounded Rationality [21.94743452608215]
We study smooth Q-learning, a prototypical learning model that captures the balance between game rewards and exploration costs.
We show that Q-learning always converges to the unique quantal-response equilibrium (QRE), the standard solution concept for games under bounded rationality.
arXiv Detail & Related papers (2021-06-24T11:43:38Z)
- Deep Latent Competition: Learning to Race Using Visual Control Policies in Latent Space [63.57289340402389]
Deep Latent Competition (DLC) is a reinforcement learning algorithm that learns competitive visual control policies through self-play in imagination.
Imagined self-play reduces costly sample generation in the real world, while the latent representation enables planning to scale gracefully with observation dimensionality.
arXiv Detail & Related papers (2021-02-19T09:00:29Z)
- Independent Policy Gradient Methods for Competitive Reinforcement Learning [62.91197073795261]
We obtain global, non-asymptotic convergence guarantees for independent learning algorithms in competitive reinforcement learning settings with two agents.
We show that if both players run policy gradient methods in tandem, their policies will converge to a min-max equilibrium of the game, as long as their learning rates follow a two-timescale rule.
arXiv Detail & Related papers (2021-01-11T23:20:42Z)
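A toy sketch of the two-timescale rule from the entry above: both players run softmax policy gradient on matching pennies, with the row player's step size much smaller than the column player's. Exact expected gradients replace sampled rollouts here for brevity; the game and rates are illustrative assumptions.
```python
import numpy as np

# Two-timescale independent policy gradient in a zero-sum matrix game.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])       # row maximizes x^T A y, column minimizes
theta_x, theta_y = np.array([1.0, -1.0]), np.array([-0.5, 0.5])
eta_x, eta_y = 0.001, 0.05                     # two-timescale rule: eta_x << eta_y

def softmax(t):
    z = np.exp(t - t.max())
    return z / z.sum()

for step in range(100000):
    px, py = softmax(theta_x), softmax(theta_y)
    gx, gy = A @ py, -A.T @ px                 # each player's payoff gradient
    theta_x += eta_x * px * (gx - px @ gx)     # chain rule through the softmax
    theta_y += eta_y * py * (gy - py @ gy)

print("row:", softmax(theta_x).round(3), "| col:", softmax(theta_y).round(3))
```
In this configuration the slower (row) player's policy should hover near the mixed equilibrium (0.5, 0.5), while the faster player tracks approximate best responses.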
- On Information Asymmetry in Competitive Multi-Agent Reinforcement Learning: Convergence and Optimality [78.76529463321374]
We study a system of two interacting, non-cooperative Q-learning agents.
We show that this information asymmetry can lead to a stable outcome of population learning.
arXiv Detail & Related papers (2020-10-21T11:19:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.