Efficient Communication via Self-supervised Information Aggregation for
Online and Offline Multi-agent Reinforcement Learning
- URL: http://arxiv.org/abs/2302.09605v1
- Date: Sun, 19 Feb 2023 16:02:16 GMT
- Title: Efficient Communication via Self-supervised Information Aggregation for
Online and Offline Multi-agent Reinforcement Learning
- Authors: Cong Guan, Feng Chen, Lei Yuan, Zongzhang Zhang, Yang Yu
- Abstract summary: We argue that efficient message aggregation is essential for good coordination in cooperative Multi-Agent Reinforcement Learning (MARL).
We propose Multi-Agent communication via Self-supervised Information Aggregation (MASIA), where agents can aggregate the received messages into compact representations with high relevance to augment the local policy.
We build offline benchmarks for multi-agent communication, which, to our knowledge, are the first of their kind.
- Score: 12.334522644561591
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Utilizing messages from teammates can improve coordination in cooperative
Multi-agent Reinforcement Learning (MARL). Previous works typically combine teammates' raw messages with local information as policy inputs. However, neglecting message aggregation introduces significant inefficiency into policy learning. Motivated by recent advances in representation learning, we argue
that efficient message aggregation is essential for good coordination in
cooperative MARL. In this paper, we propose Multi-Agent communication via
Self-supervised Information Aggregation (MASIA), where agents can aggregate the
received messages into compact representations with high relevance to augment
the local policy. Specifically, we design a permutation-invariant message encoder that generates a common, information-aggregated representation from the received messages, and we optimize it in a self-supervised manner by reconstructing current information and predicting (shooting) future information. Each agent then utilizes the most relevant parts of the aggregated representation for decision-making through a novel message extraction mechanism. Furthermore, considering the potential of offline learning for real-world applications, we build offline benchmarks for multi-agent communication, which, to our knowledge, are the first of their kind. Empirical results demonstrate the superiority of our method in both online and offline settings. We also release the offline benchmarks built in this paper as a testbed for validating communication ability, to facilitate future research.
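To make the described pipeline concrete, below is a minimal PyTorch sketch of the three ingredients named in the abstract: a permutation-invariant encoder that pools message embeddings, self-supervised heads that reconstruct current information and predict future information, and a per-agent extraction step that selects the relevant parts of the aggregated representation. The module names, tensor shapes, mean-pooling, and sigmoid-gating choices are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PermutationInvariantEncoder(nn.Module):
    """Aggregate messages from all teammates into one compact representation.

    Mean-pooling over per-message embeddings keeps the output unchanged under
    any reordering of the incoming messages (an illustrative stand-in for the
    paper's permutation-invariant encoder).
    """

    def __init__(self, msg_dim: int, hid_dim: int, z_dim: int):
        super().__init__()
        self.embed = nn.Sequential(nn.Linear(msg_dim, hid_dim), nn.ReLU())
        self.project = nn.Linear(hid_dim, z_dim)

    def forward(self, messages: torch.Tensor) -> torch.Tensor:
        # messages: (batch, n_agents, msg_dim)
        h = self.embed(messages)        # (batch, n_agents, hid_dim)
        pooled = h.mean(dim=1)          # order-independent aggregation
        return self.project(pooled)     # (batch, z_dim)


class SelfSupervisedHeads(nn.Module):
    """Auxiliary heads that train the aggregated representation by
    reconstructing current global information and predicting ("shooting")
    future information."""

    def __init__(self, z_dim: int, state_dim: int):
        super().__init__()
        self.reconstruct = nn.Linear(z_dim, state_dim)
        self.shoot = nn.Linear(z_dim, state_dim)

    def loss(self, z, state_now, state_future):
        recon_loss = F.mse_loss(self.reconstruct(z), state_now)
        shoot_loss = F.mse_loss(self.shoot(z), state_future)
        return recon_loss + shoot_loss


class MessageExtraction(nn.Module):
    """Per-agent extraction: gate the aggregated representation with the
    agent's local observation so only the relevant parts feed its policy."""

    def __init__(self, obs_dim: int, z_dim: int):
        super().__init__()
        self.gate = nn.Linear(obs_dim, z_dim)

    def forward(self, obs: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        relevance = torch.sigmoid(self.gate(obs))   # (batch, z_dim)
        return relevance * z                         # agent-specific view of z


# Toy usage with made-up sizes.
B, N, MSG, OBS, STATE, Z = 4, 3, 8, 10, 16, 32
encoder = PermutationInvariantEncoder(MSG, 64, Z)
heads = SelfSupervisedHeads(Z, STATE)
extractor = MessageExtraction(OBS, Z)

messages = torch.randn(B, N, MSG)     # received teammate messages
obs = torch.randn(B, OBS)             # one agent's local observation
z = encoder(messages)
aux_loss = heads.loss(z, torch.randn(B, STATE), torch.randn(B, STATE))
policy_input = torch.cat([obs, extractor(obs, z)], dim=-1)  # fed to the local policy
```

In the actual method, such auxiliary losses would be optimized jointly with the MARL objective; they are shown in isolation here only to illustrate the self-supervised targets.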
Related papers
- Communication Learning in Multi-Agent Systems from Graph Modeling Perspective [62.13508281188895]
We introduce a novel approach wherein we conceptualize the communication architecture among agents as a learnable graph.
We introduce a temporal gating mechanism for each agent, enabling dynamic decisions on whether to receive shared information at a given time.
arXiv Detail & Related papers (2024-11-01T05:56:51Z)
- Learning Multi-Agent Communication from Graph Modeling Perspective [62.13508281188895]
We introduce a novel approach wherein we conceptualize the communication architecture among agents as a learnable graph.
Our proposed approach, CommFormer, efficiently optimizes the communication graph and concurrently refines architectural parameters through gradient descent in an end-to-end manner.
arXiv Detail & Related papers (2024-05-14T12:40:25Z)
- Verco: Learning Coordinated Verbal Communication for Multi-agent Reinforcement Learning [42.27106057372819]
We propose a novel multi-agent reinforcement learning algorithm that embeds large language models into agents.
The framework has a message module and an action module.
Experiments conducted on the Overcooked game demonstrate that our method significantly enhances the learning efficiency and performance of existing methods.
arXiv Detail & Related papers (2024-04-27T05:10:33Z)
- Pragmatic Communication in Multi-Agent Collaborative Perception [80.14322755297788]
Collaborative perception results in a trade-off between perception ability and communication costs.
We propose PragComm, a multi-agent collaborative perception system with two key components.
PragComm consistently outperforms previous methods with more than 32.7K times lower communication volume.
arXiv Detail & Related papers (2024-01-23T11:58:08Z)
- Context-aware Communication for Multi-agent Reinforcement Learning [6.109127175562235]
We develop a context-aware communication scheme, CACOM, for multi-agent reinforcement learning (MARL).
In the first stage, agents exchange coarse representations in a broadcast fashion, providing context for the second stage.
Following this, agents utilize attention mechanisms in the second stage to selectively generate messages personalized for the receivers.
To evaluate the effectiveness of CACOM, we integrate it with both actor-critic and value-based MARL algorithms.
arXiv Detail & Related papers (2023-12-25T03:33:08Z)
- RGMComm: Return Gap Minimization via Discrete Communications in Multi-Agent Reinforcement Learning [33.86277578441437]
Communication is crucial for solving cooperative Multi-Agent Reinforcement Learning tasks in partially observable Markov Decision Processes.
We propose the Return-Gap-Minimization Communication (RGMComm) algorithm, which features a surprisingly simple design for the discrete message generation functions.
Evaluations show that RGMComm significantly outperforms state-of-the-art multi-agent communication baselines.
arXiv Detail & Related papers (2023-08-07T07:26:55Z)
- Scalable Communication for Multi-Agent Reinforcement Learning via Transformer-Based Email Mechanism [9.607941773452925]
Communication can substantially improve cooperation in multi-agent reinforcement learning (MARL).
We propose a novel framework Transformer-based Email Mechanism (TEM) to tackle the scalability problem of MARL communication for partially-observed tasks.
arXiv Detail & Related papers (2023-01-05T05:34:30Z)
- Learning From Good Trajectories in Offline Multi-Agent Reinforcement Learning [98.07495732562654]
Offline multi-agent reinforcement learning (MARL) aims to learn effective multi-agent policies from pre-collected datasets. Such datasets often contain trajectories of uneven quality, and an agent trained by offline MARL can inherit the random behavior present in them, jeopardizing the performance of the entire team.
We propose a novel framework called Shared Individual Trajectories (SIT) to address this problem.
arXiv Detail & Related papers (2022-11-28T18:11:26Z)
- Coordinating Policies Among Multiple Agents via an Intelligent Communication Channel [81.39444892747512]
In Multi-Agent Reinforcement Learning (MARL), specialized channels are often introduced that allow agents to communicate directly with one another.
We propose an alternative approach whereby agents communicate through an intelligent facilitator that learns to sift through and interpret signals provided by all agents to improve the agents' collective performance.
arXiv Detail & Related papers (2022-05-21T14:11:33Z)
- Multi-agent Communication with Graph Information Bottleneck under Limited Bandwidth (a position paper) [92.11330289225981]
In many real-world scenarios, communication can be expensive and the bandwidth of the multi-agent system is subject to certain constraints.
Redundant messages that occupy communication resources can block the transmission of informative messages and thus degrade performance.
We propose a novel multi-agent communication module, CommGIB, which effectively compresses the structure information and node information in the communication graph to deal with bandwidth-constrained settings.
arXiv Detail & Related papers (2021-12-20T07:53:44Z)
- Networked Multi-Agent Reinforcement Learning with Emergent Communication [18.47483427884452]
Multi-Agent Reinforcement Learning (MARL) methods find optimal policies for agents that operate in the presence of other learning agents.
One way to coordinate is by learning to communicate with each other.
Can the agents develop a language while learning to perform a common task?
arXiv Detail & Related papers (2020-04-06T16:13:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of this information and is not responsible for any consequences of its use.