Graph Convolutional Value Decomposition in Multi-Agent Reinforcement
Learning
- URL: http://arxiv.org/abs/2010.04740v2
- Date: Wed, 10 Feb 2021 07:33:31 GMT
- Title: Graph Convolutional Value Decomposition in Multi-Agent Reinforcement
Learning
- Authors: Navid Naderializadeh, Fan H. Hung, Sean Soleyman, Deepak Khosla
- Abstract summary: We propose a novel framework for value function factorization in multi-agent deep reinforcement learning.
In particular, we consider the team of agents as the set of nodes of a complete directed graph.
We introduce a mixing GNN module, which is responsible for i) factorizing the team state-action value function into individual per-agent observation-action value functions, and ii) explicit credit assignment to each agent in terms of fractions of the global team reward.
- Score: 9.774412108791218
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel framework for value function factorization in multi-agent
deep reinforcement learning (MARL) using graph neural networks (GNNs). In
particular, we consider the team of agents as the set of nodes of a complete
directed graph, whose edge weights are governed by an attention mechanism.
Building upon this underlying graph, we introduce a mixing GNN module, which is
responsible for i) factorizing the team state-action value function into
individual per-agent observation-action value functions, and ii) explicit
credit assignment to each agent in terms of fractions of the global team
reward. Our approach, which we call GraphMIX, follows the centralized training
and decentralized execution paradigm, enabling the agents to make their
decisions independently once training is completed. We show the superiority of
GraphMIX as compared to the state-of-the-art on several scenarios in the
StarCraft II multi-agent challenge (SMAC) benchmark. We further demonstrate how
GraphMIX can be used in conjunction with a recent hierarchical MARL
architecture to both improve the agents' performance and enable fine-tuning
them on mismatched test scenarios with higher numbers of agents and/or actions.
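
As a rough illustration of the mixing idea (not the authors' implementation), the PyTorch sketch below computes attention weights over a complete directed agent graph, runs one round of message passing, and outputs both a team value and per-agent credit fractions; the module name `MixingGNN`, the single-layer update, and the softmax credit head are illustrative assumptions.

```python
# Illustrative sketch of a GraphMIX-style mixing GNN (not the authors' code).
# Each agent contributes Q_i(o_i, a_i); attention over a complete directed
# graph mixes them into a team value Q_tot plus per-agent credit fractions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixingGNN(nn.Module):
    def __init__(self, embed_dim: int = 32):
        super().__init__()
        self.query = nn.Linear(embed_dim, embed_dim)
        self.key = nn.Linear(embed_dim, embed_dim)
        self.node_update = nn.Linear(embed_dim + 1, embed_dim)
        self.credit_head = nn.Linear(embed_dim, 1)

    def forward(self, q_agents, node_feats):
        # q_agents: (batch, n_agents) chosen per-agent Q-values
        # node_feats: (batch, n_agents, embed_dim), e.g. agent hidden states
        q, k = self.query(node_feats), self.key(node_feats)
        attn = torch.softmax(q @ k.transpose(-2, -1) / k.shape[-1] ** 0.5, dim=-1)
        # One round of message passing over the complete directed graph.
        h = torch.cat([node_feats, q_agents.unsqueeze(-1)], dim=-1)
        h = F.relu(attn @ self.node_update(h))
        # Credit fractions sum to 1 across agents; Q_tot is the weighted mix.
        w = torch.softmax(self.credit_head(h).squeeze(-1), dim=-1)
        q_tot = (w * q_agents).sum(dim=-1, keepdim=True)
        return q_tot, w
```

Under centralized training, `q_tot` would be regressed toward the global TD target; at execution, each agent acts greedily on its own per-agent value, matching the decentralized-execution paradigm described above.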
Related papers
- Hi-GMAE: Hierarchical Graph Masked Autoencoders [90.30572554544385]
Hierarchical Graph Masked AutoEncoders (Hi-GMAE) is a novel multi-scale GMAE framework designed to handle the hierarchical structures within graphs.
Our experiments on 15 graph datasets consistently demonstrate that Hi-GMAE outperforms 17 state-of-the-art self-supervised competitors.
arXiv Detail & Related papers (2024-05-17T09:08:37Z)
- Mastering Complex Coordination through Attention-based Dynamic Graph [14.855793715829954]
We present DAGmix, a novel graph-based value factorization method.
Instead of a complete graph, DAGmix generates a dynamic graph at each time step during training.
Experiments show that DAGmix significantly outperforms previous SOTA methods in large-scale scenarios.
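
The paper's code is not reproduced here, but a minimal sketch of the dynamic-graph idea might build a sparse agent graph from attention scores at every time step; the dot-product scoring and top-k edge rule below are assumptions.

```python
# Hedged sketch of a dynamic agent graph: instead of a fixed complete graph,
# keep each agent's top-k most attended neighbors at the current time step.
import torch

def dynamic_graph(obs_embed: torch.Tensor, k: int = 3) -> torch.Tensor:
    """obs_embed: (n_agents, d) per-agent observation embeddings.
    Returns an (n_agents, n_agents) 0/1 adjacency matrix."""
    scores = obs_embed @ obs_embed.T / obs_embed.shape[-1] ** 0.5
    scores.fill_diagonal_(float("-inf"))            # no self-edges
    topk = scores.topk(k, dim=-1).indices
    return torch.zeros_like(scores).scatter_(-1, topk, 1.0)
```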
arXiv Detail & Related papers (2023-12-07T12:02:14Z)
- MA2CL: Masked Attentive Contrastive Learning for Multi-Agent Reinforcement Learning [128.19212716007794]
We propose an effective framework called Multi-Agent Masked Attentive Contrastive Learning (MA2CL).
MA2CL encourages learned representations to be predictive at both the temporal and agent levels by reconstructing masked agent observations in the latent space.
Our method significantly improves the performance and sample efficiency of different MARL algorithms and outperforms other methods in various vision-based and state-based scenarios.
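
A hedged sketch of masked agent reconstruction in latent space: some agents' embeddings are replaced by a learned mask token and predicted from the attended remainder. The MSE loss here is a simple stand-in for the paper's contrastive objective, and all module names are illustrative.

```python
# Sketch: hide a fraction of agents' latent observations and reconstruct
# them from the other agents via attention (MSE stands in for the paper's
# contrastive loss).
import torch
import torch.nn as nn

class MaskedAgentModel(nn.Module):
    def __init__(self, d: int = 64, n_heads: int = 4):
        super().__init__()
        self.mask_token = nn.Parameter(torch.zeros(d))
        self.attn = nn.MultiheadAttention(d, n_heads, batch_first=True)

    def forward(self, z: torch.Tensor, mask_ratio: float = 0.5):
        # z: (batch, n_agents, d) latent agent observations
        b, n, d = z.shape
        masked = torch.rand(b, n, device=z.device) < mask_ratio
        z_in = torch.where(masked.unsqueeze(-1),
                           self.mask_token.expand(b, n, d), z)
        recon, _ = self.attn(z_in, z_in, z_in)
        # Reconstruction loss only on the masked agents.
        return ((recon - z.detach()) ** 2)[masked].mean()
```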
arXiv Detail & Related papers (2023-06-03T05:32:19Z)
- Partially Observable Mean Field Multi-Agent Reinforcement Learning Based on Graph-Attention [12.588866091856309]
This paper considers partially observable multi-agent reinforcement learning (MARL), where each agent can only observe other agents within a fixed range.
We propose a novel multi-agent reinforcement learning algorithm, Partially Observable Mean Field Multi-Agent Reinforcement Learning based on Graph-Attention (GAMFQ).
Experiments show that GAMFQ outperforms baselines including the state-of-the-art partially observable mean-field reinforcement learning algorithms.
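
One plausible rendering of the underlying idea, assuming an attention-weighted mean-field action computed only over agents within a fixed observation range (the distance rule and dot-product weighting are assumptions, not the paper's exact design):

```python
# Sketch: each agent summarizes its observable neighbors by an
# attention-weighted mean of their actions.
import torch

def mean_field_action(pos, actions, feats, obs_range=2.0):
    """pos: (n, 2) positions; actions: (n, a) one-hot actions;
    feats: (n, d) agent features. Returns (n, a) weighted mean
    neighbor actions under a fixed-range visibility mask."""
    dist = torch.cdist(pos, pos)
    visible = (dist <= obs_range) & ~torch.eye(
        len(pos), dtype=torch.bool, device=pos.device)
    scores = feats @ feats.T / feats.shape[-1] ** 0.5
    scores = scores.masked_fill(~visible, float("-inf"))
    attn = torch.softmax(scores, dim=-1).nan_to_num()  # isolated agents -> 0
    return attn @ actions
```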
arXiv Detail & Related papers (2023-04-25T08:38:32Z)
- A Cooperation Graph Approach for Multiagent Sparse Reward Reinforcement Learning [7.2972297703292135]
Multiagent reinforcement learning (MARL) can solve complex cooperative tasks.
In this paper, we design a graph network called the Cooperation Graph (CG).
We propose a Cooperation Graph Multiagent Reinforcement Learning (CG-MARL) algorithm, which can efficiently deal with the sparse reward problem in multiagent tasks.
arXiv Detail & Related papers (2022-08-05T06:32:16Z)
- MGAE: Masked Autoencoders for Self-Supervised Learning on Graphs [55.66953093401889]
We propose a masked graph autoencoder (MGAE) framework to perform effective learning on graph-structured data.
Taking insights from self-supervised learning, we randomly mask a large proportion of edges and try to reconstruct these missing edges during training.
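
A minimal sketch of the edge-masking objective, assuming a dot-product decoder with uniform negative sampling (common stand-ins rather than the paper's exact decoder):

```python
# Sketch: mask a large fraction of edges, encode the visible graph, and
# train a decoder to score held-out edges against random negatives.
import torch

def mask_edges(edge_index: torch.Tensor, mask_ratio: float = 0.7):
    """edge_index: (2, n_edges) COO edges. Returns (visible, masked)."""
    n_edges = edge_index.shape[1]
    perm = torch.randperm(n_edges)
    n_masked = int(mask_ratio * n_edges)
    return edge_index[:, perm[n_masked:]], edge_index[:, perm[:n_masked]]

def edge_recon_loss(z: torch.Tensor, masked_edges: torch.Tensor):
    """z: (n_nodes, d) encoder output on the visible graph."""
    src, dst = masked_edges
    pos = (z[src] * z[dst]).sum(-1)                       # held-out edges
    neg = (z[src] * z[torch.randint(len(z), (len(src),))]).sum(-1)
    return torch.nn.functional.binary_cross_entropy_with_logits(
        torch.cat([pos, neg]),
        torch.cat([torch.ones_like(pos), torch.zeros_like(neg)]))
```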
arXiv Detail & Related papers (2022-01-07T16:48:07Z)
- Value Function Factorisation with Hypergraph Convolution for Cooperative Multi-agent Reinforcement Learning [32.768661516953344]
We propose a method that combines hypergraph convolution with value decomposition.
By treating action values as signals, HGCN-Mix aims to explore the relationship between these signals via a self-learning hypergraph.
Experimental results show that HGCN-Mix matches or surpasses state-of-the-art techniques in the StarCraft II multi-agent challenge (SMAC) benchmark.
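
For intuition, the standard hypergraph convolution propagation rule applied to per-agent action-value signals might look like the sketch below; in HGCN-Mix the incidence structure is self-learned, whereas here `H` is simply an input.

```python
# Sketch: smooth per-agent action-value signals with the standard
# hypergraph convolution D_v^{-1/2} H D_e^{-1} H^T D_v^{-1/2}.
import torch

def hypergraph_conv(q: torch.Tensor, H: torch.Tensor) -> torch.Tensor:
    """q: (n_agents, 1) action-value signals; H: (n_agents, n_hyperedges)
    non-negative incidence weights. Returns smoothed signals."""
    Dv = H.sum(1).clamp(min=1e-6).pow(-0.5).diag()      # node degrees
    De = H.sum(0).clamp(min=1e-6).reciprocal().diag()   # hyperedge degrees
    return Dv @ H @ De @ H.T @ Dv @ q
```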
arXiv Detail & Related papers (2021-12-09T08:40:38Z)
- Learning to Coordinate via Multiple Graph Neural Networks [16.226702761758595]
MGAN is a new algorithm that combines graph convolutional networks and value-decomposition methods.
We demonstrate the graph network's strong representation-learning ability by visualizing its outputs.
arXiv Detail & Related papers (2021-04-08T04:33:00Z)
- Dif-MAML: Decentralized Multi-Agent Meta-Learning [54.39661018886268]
We propose a cooperative multi-agent meta-learning algorithm, referred to as Diffusion Multi-Agent MAML, or Dif-MAML.
We show that the proposed strategy allows a collection of agents to attain agreement at a linear rate and to converge to a stationary point of the aggregate MAML objective.
Simulation results illustrate the theoretical findings and the superior performance relative to the traditional non-cooperative setting.
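
A compact sketch of the adapt-then-combine pattern typical of diffusion strategies: each agent takes a local meta-gradient step and then averages parameters with its neighbors. Stacking parameters as rows and the combination matrix `A` are simplifying assumptions.

```python
# Sketch of a diffusion (adapt-then-combine) update: local step, then
# neighborhood averaging with row-stochastic combination weights.
import torch

def diffusion_step(params, local_grads, A, lr=0.01):
    """params: (n_agents, d) stacked parameter vectors;
    local_grads: (n_agents, d) each agent's local meta-gradient;
    A: (n_agents, n_agents) combination matrix, rows summing to 1."""
    adapted = params - lr * local_grads   # adapt: local SGD step
    return A @ adapted                    # combine: average with neighbors
```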
arXiv Detail & Related papers (2020-10-06T16:51:09Z)
- Jointly Cross- and Self-Modal Graph Attention Network for Query-Based Moment Localization [77.21951145754065]
We propose a novel Cross- and Self-Modal Graph Attention Network (CSMGAN) that recasts this task as a process of iterative message passing over a joint graph.
Our CSMGAN is able to effectively capture high-order interactions between the two modalities, thus enabling more precise localization.
arXiv Detail & Related papers (2020-08-04T08:25:24Z)
- Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning [55.20040781688844]
QMIX is a novel value-based method that can train decentralised policies in a centralised end-to-end fashion.
We propose the StarCraft Multi-Agent Challenge (SMAC) as a new benchmark for deep multi-agent reinforcement learning.
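
A compact sketch of a QMIX-style monotonic mixer: hypernetworks conditioned on the global state produce mixing weights that pass through an absolute value, which keeps the team value monotonic in every per-agent value (the single-layer bias hypernetworks are a simplification of the original).

```python
# Sketch of a monotonic mixing network: |.| on hypernetwork outputs
# guarantees dQ_tot/dQ_i >= 0 for every agent i.
import torch
import torch.nn as nn
import torch.nn.functional as F

class QMixer(nn.Module):
    def __init__(self, n_agents: int, state_dim: int, hidden: int = 32):
        super().__init__()
        self.w1 = nn.Linear(state_dim, n_agents * hidden)
        self.b1 = nn.Linear(state_dim, hidden)
        self.w2 = nn.Linear(state_dim, hidden)
        self.b2 = nn.Linear(state_dim, 1)
        self.hidden = hidden

    def forward(self, q_agents, state):
        # q_agents: (batch, n_agents); state: (batch, state_dim)
        b, n = q_agents.shape
        w1 = self.w1(state).abs().view(b, n, self.hidden)   # non-negative
        h = F.elu(torch.bmm(q_agents.unsqueeze(1), w1)
                  + self.b1(state).unsqueeze(1))
        w2 = self.w2(state).abs().view(b, self.hidden, 1)   # non-negative
        return (torch.bmm(h, w2) + self.b2(state).unsqueeze(1)).squeeze(-1)
```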
arXiv Detail & Related papers (2020-03-19T16:51:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.