Decentralized Multi-Agent Reinforcement Learning for Continuous-Space
Stochastic Games
- URL: http://arxiv.org/abs/2303.13539v1
- Date: Thu, 16 Mar 2023 14:25:16 GMT
- Title: Decentralized Multi-Agent Reinforcement Learning for Continuous-Space
Stochastic Games
- Authors: Awni Altabaa, Bora Yongacoglu, Serdar Yüksel
- Abstract summary: We study decentralized multi-agent learning in stochastic games with general state spaces and an information structure in which agents do not observe each other's actions.
We propose a decentralized MARL algorithm and prove the near-optimality of its policy updates.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stochastic games are a popular framework for studying multi-agent
reinforcement learning (MARL). Recent advances in MARL have focused primarily
on games with finitely many states. In this work, we study multi-agent learning
in stochastic games with general state spaces and an information structure in
which agents do not observe each other's actions. In this context, we propose a
decentralized MARL algorithm and we prove the near-optimality of its policy
updates. Furthermore, we study the global policy-updating dynamics for a
general class of best-reply based algorithms and derive a closed-form
characterization of convergence probabilities over the joint policy space.
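As a rough illustration of the setting in the abstract, the sketch below shows an independent learner that observes the (continuous) state, its own action, and its own reward, but never the other agents' actions, and that quantizes the state so it can maintain a finite Q-table. The class name, quantization rule, and constants are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np
from collections import defaultdict

class IndependentAgent:
    """Hypothetical independent learner for a continuous-state stochastic game:
    it sees the state, its own action, and its own reward, never other agents' actions."""

    def __init__(self, n_actions, bins, lr=0.1, gamma=0.95, explore=0.1):
        self.n_actions = n_actions
        self.bins = bins            # quantization grid for the continuous state
        self.lr, self.gamma, self.explore = lr, gamma, explore
        self.q = defaultdict(lambda: np.zeros(n_actions))

    def _key(self, state):
        # Quantize the continuous state so a finite Q-table can be used.
        return tuple(np.digitize(state, self.bins))

    def act(self, state, rng):
        # Epsilon-soft best reply to the agent's own Q estimates
        # (rng is a numpy Generator, e.g. np.random.default_rng()).
        if rng.random() < self.explore:
            return int(rng.integers(self.n_actions))
        return int(np.argmax(self.q[self._key(state)]))

    def update(self, state, action, reward, next_state):
        # One-step Q-learning update using only locally observable information.
        s, s_next = self._key(state), self._key(next_state)
        target = reward + self.gamma * np.max(self.q[s_next])
        self.q[s][action] += self.lr * (target - self.q[s][action])
```

In a best-reply-based scheme of the kind the paper analyzes, each agent would run such an inner loop during an exploration phase and then switch its policy to an (approximate) best reply against the stationary behavior of the others; the paper's results concern the near-optimality of those policy updates and the resulting dynamics over the joint policy space.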
Related papers
- Convergence of Decentralized Actor-Critic Algorithm in General-sum Markov Games [3.8779763612314633]
We study the properties of learning algorithms in general-sum Markov games.
In particular, we focus on a decentralized algorithm where each agent adopts an actor-critic learning dynamic.
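A minimal sketch of the kind of per-agent actor-critic update used in decentralized schemes: a tabular softmax actor updated by a policy gradient weighted by the TD error of a locally maintained critic. The tabular setting and the function below are illustrative assumptions, not the algorithm analyzed in that paper.

```python
import numpy as np

def actor_critic_step(theta, v, s, a, r, s_next, alpha=0.05, beta=0.1, gamma=0.99):
    """One decentralized actor-critic update for a single agent.

    theta: (n_states, n_actions) logits of the agent's softmax policy
    v:     (n_states,) the agent's value (critic) estimates
    """
    # Critic: one-step TD error and value update.
    td_error = r + gamma * v[s_next] - v[s]
    v[s] += beta * td_error

    # Actor: policy-gradient step on the softmax logits, weighted by the TD error.
    probs = np.exp(theta[s] - theta[s].max())
    probs /= probs.sum()
    grad_log = -probs
    grad_log[a] += 1.0                       # gradient of log pi(a|s) w.r.t. logits
    theta[s] += alpha * td_error * grad_log
    return theta, v
```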
arXiv Detail & Related papers (2024-09-06T20:49:11Z) - Regularization of the policy updates for stabilizing Mean Field Games [0.2348805691644085]
This work studies non-cooperative Multi-Agent Reinforcement Learning (MARL), where multiple agents interact in the same environment and each aims to maximize its individual return.
We name our algorithm Mean Field Proximal Policy Optimization (MF-PPO), and we empirically show the effectiveness of our method in the OpenSpiel framework.
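The stabilizing idea, regularizing each policy update toward the previous policy, can be sketched as a KL-penalized update with a closed form; the penalty weight `tau` and the function below are illustrative, and the actual MF-PPO objective is a PPO-style surrogate rather than this exact formula.

```python
import numpy as np

def regularized_policy_update(pi_old, q_values, tau=1.0):
    """KL-regularized update: argmax_pi <pi, q> - tau * KL(pi || pi_old).

    The maximizer is pi_new(a) proportional to pi_old(a) * exp(q(a) / tau);
    a larger tau keeps the new policy closer to the old one, which is the
    stabilizing effect the paper studies.
    """
    logits = np.log(pi_old + 1e-12) + q_values / tau
    logits -= logits.max()                    # numerical stability
    pi_new = np.exp(logits)
    return pi_new / pi_new.sum()
```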
arXiv Detail & Related papers (2023-04-04T05:45:42Z) - RPM: Generalizable Behaviors for Multi-Agent Reinforcement Learning [90.43925357575543]
We propose ranked policy memory (RPM) to collect diverse multi-agent trajectories for training MARL policies with good generalizability.
RPM enables MARL agents to interact with unseen agents in multi-agent generalization evaluation scenarios and complete the given tasks, boosting performance by up to 402% on average.
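A toy version of a ranked policy memory: policy checkpoints are stored in buckets keyed by a coarsened training score, and training partners are sampled across buckets so that trajectories cover a range of behavior levels. The class and bucketing rule are assumptions for illustration; RPM's actual ranking and sampling details differ.

```python
import random
from collections import defaultdict

class RankedPolicyMemory:
    """Toy ranked policy memory: store policy checkpoints by score rank,
    then sample training partners across ranks for behavioral diversity."""

    def __init__(self, bucket_size=10.0):
        self.bucket_size = bucket_size
        self.buckets = defaultdict(list)       # rank -> list of policy checkpoints

    def save(self, policy, score):
        rank = int(score // self.bucket_size)   # coarsen the training score into a rank
        self.buckets[rank].append(policy)

    def sample_partner(self, rng=random):
        # Pick a rank uniformly, then a checkpoint within it, so both low- and
        # high-performing behaviors are represented during training.
        rank = rng.choice(list(self.buckets))
        return rng.choice(self.buckets[rank])
```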
arXiv Detail & Related papers (2022-10-18T07:32:43Z) - Faster Last-iterate Convergence of Policy Optimization in Zero-Sum
Markov Games [63.60117916422867]
This paper focuses on the most basic setting of competitive multi-agent RL, namely two-player zero-sum Markov games.
We propose a single-loop policy optimization method with symmetric updates from both agents, where the policy is updated via the entropy-regularized optimistic multiplicative weights update (OMWU) method.
Our convergence results improve upon the best known complexities, and lead to a better understanding of policy optimization in competitive Markov games.
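For a single matrix game, an entropy-regularized optimistic multiplicative-weights step has a simple closed form; the sketch below only conveys the flavor, while the paper's single-loop method uses a two-sequence (prediction/correction) variant composed with per-state value estimates, and the step sizes are illustrative.

```python
import numpy as np

def omwu_step(pi, payoff_now, payoff_prev, eta=0.1, tau=0.01):
    """One entropy-regularized optimistic multiplicative-weights update (sketch).

    pi:          current mixed strategy (probability vector)
    payoff_now:  most recent payoff vector, used as the optimistic prediction
    payoff_prev: payoff vector from the previous iterate
    eta:         step size; tau: entropy-regularization strength
    """
    # Entropy regularization shrinks the old logits by (1 - eta*tau);
    # optimism uses the extrapolated gradient 2*payoff_now - payoff_prev.
    logits = (1 - eta * tau) * np.log(pi + 1e-12) + eta * (2 * payoff_now - payoff_prev)
    logits -= logits.max()
    pi_new = np.exp(logits)
    return pi_new / pi_new.sum()
```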
arXiv Detail & Related papers (2022-10-03T16:05:43Z) - Decentralized Optimistic Hyperpolicy Mirror Descent: Provably No-Regret
Learning in Markov Games [95.10091348976779]
We study decentralized policy learning in Markov games where we control a single agent to play with nonstationary and possibly adversarial opponents.
We propose a new algorithm, Decentralized Optimistic Hyperpolicy Mirror Descent (DORIS).
DORIS achieves $\sqrt{K}$-regret in the context of general function approximation, where $K$ is the number of episodes.
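For reference, the regret behind a $\sqrt{K}$-type guarantee in this setting (one controlled agent facing possibly adversarial opponents) generically takes the form
$$\mathrm{Regret}(K) \;=\; \max_{\pi \in \Pi} \sum_{k=1}^{K} V^{\pi,\,\nu^k}(s_1) \;-\; \sum_{k=1}^{K} V^{\pi^k,\,\nu^k}(s_1),$$
where $\pi^k$ is the learner's policy in episode $k$ and $\nu^k$ is the opponents' joint policy; the precise benchmark class $\Pi$ and value functions are those defined in the paper, so this display is only the generic form.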
arXiv Detail & Related papers (2022-06-03T14:18:05Z) - Efficient Model-based Multi-agent Reinforcement Learning via Optimistic
Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z) - Decentralized Cooperative Multi-Agent Reinforcement Learning with
Exploration [35.75029940279768]
We study multi-agent reinforcement learning in the most basic cooperative setting -- Markov teams.
We propose an algorithm in which each agent independently runs a stage-based V-learning style algorithm.
We show that the agents can learn an $\epsilon$-approximate Nash equilibrium policy in at most $\widetilde{O}(1/\epsilon^4)$ episodes.
arXiv Detail & Related papers (2021-10-12T02:45:12Z) - Locality Matters: A Scalable Value Decomposition Approach for
Cooperative Multi-Agent Reinforcement Learning [52.7873574425376]
Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents.
We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training Decentralized Execution (CTDE) paradigm.
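The scalability idea, decomposing the joint value into per-agent local utilities trained from local rewards, can be sketched as below; the partition into neighborhoods and the plain sum are simplifying assumptions, not LOMAQ's exact mixing scheme.

```python
def decomposed_q(local_qs, local_obs, joint_action, neighborhoods):
    """Toy value decomposition: the joint Q-value is a sum of per-agent local
    utilities, each depending only on that agent's neighborhood.

    local_qs:      list of callables q_i(obs_i, neighbor_actions)
    local_obs:     list of per-agent observations
    joint_action:  list of all agents' actions
    neighborhoods: neighborhoods[i] = indices of agents visible to agent i
    """
    total = 0.0
    for i, q_i in enumerate(local_qs):
        neighbor_actions = tuple(joint_action[j] for j in neighborhoods[i])
        total += q_i(local_obs[i], neighbor_actions)
    return total
```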
arXiv Detail & Related papers (2021-09-22T10:08:15Z) - Regularize! Don't Mix: Multi-Agent Reinforcement Learning without
Explicit Centralized Structures [8.883885464358737]
We propose using regularization for Multi-Agent Reinforcement Learning rather than learning explicit cooperative structures, an approach called Multi-Agent Regularized Q-learning (MARQ).
Our algorithm is evaluated on several benchmark multi-agent environments and we show that MARQ consistently outperforms several baselines and state-of-the-art algorithms.
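A generic sketch of adding a regularization term to an independent Q-learning loss; the specific regularizers explored by MARQ are not reproduced here, so the `penalty` term below is a placeholder assumption standing in for their agent-behavior regularization.

```python
import numpy as np

def regularized_td_loss(q, q_target, batch, penalty_weight=0.1, gamma=0.99):
    """TD loss for one agent plus a generic regularization term.

    q, q_target: (n_states, n_actions) arrays for the online and target estimates
    batch:       iterable of (s, a, r, s_next) transitions for this agent
    """
    td_errors, penalties = [], []
    for s, a, r, s_next in batch:
        target = r + gamma * np.max(q_target[s_next])
        td_errors.append((q[s, a] - target) ** 2)
        # Placeholder regularizer: discourage over-confident action-value gaps.
        penalties.append(np.var(q[s]))
    return np.mean(td_errors) + penalty_weight * np.mean(penalties)
```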
arXiv Detail & Related papers (2021-09-19T00:58:38Z) - Learning Meta Representations for Agents in Multi-Agent Reinforcement
Learning [12.170248966278281]
In multi-agent reinforcement learning, behaviors that agents learn in a single Markov Game (MG) are typically confined to the given number of agents.
In this work, our focus is on creating agents that can generalize across population-varying MGs.
Instead of learning a unimodal policy, each agent learns a policy set comprising effective strategies across a variety of games.
arXiv Detail & Related papers (2021-08-30T04:30:53Z) - F2A2: Flexible Fully-decentralized Approximate Actor-critic for
Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms can be impractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent settings.
Our framework achieves scalability and stability in large-scale environments and reduces information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.