Correcting Experience Replay for Multi-Agent Communication
- URL: http://arxiv.org/abs/2010.01192v2
- Date: Sun, 28 Feb 2021 22:42:12 GMT
- Title: Correcting Experience Replay for Multi-Agent Communication
- Authors: Sanjeevan Ahilan, Peter Dayan
- Abstract summary: We consider the problem of learning to communicate using multi-agent reinforcement learning (MARL).
A common approach is to learn off-policy, using data sampled from a replay buffer.
We introduce a 'communication correction' which accounts for the non-stationarity of observed communication induced by MARL.
- Score: 18.12281605882891
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of learning to communicate using multi-agent
reinforcement learning (MARL). A common approach is to learn off-policy, using
data sampled from a replay buffer. However, messages received in the past may
not accurately reflect the current communication policy of each agent, and this
complicates learning. We therefore introduce a 'communication correction' which
accounts for the non-stationarity of observed communication induced by
multi-agent learning. It works by relabelling the received message to make it
likely under the communicator's current policy, and thus be a better reflection
of the receiver's current environment. To account for cases in which agents are
both senders and receivers, we introduce an ordered relabelling scheme. Our
correction is computationally efficient and can be integrated with a range of
off-policy algorithms. We find in our experiments that it substantially
improves the ability of communicating MARL systems to learn across a variety of
cooperative and competitive tasks.
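To make the relabelling idea concrete, below is a minimal sketch for a simplified broadcast setting (every agent sends one message to all others at each step). The names and interfaces (relabel_messages, comm_policies, the stand-in linear policies) are illustrative assumptions rather than the authors' implementation; the sketch only shows how stale messages in a sampled transition could be overwritten with the senders' current communication policies, processed in a fixed order so that an agent which both receives and sends recomputes its message from already-relabelled inputs.
```python
import numpy as np

# Minimal sketch of the 'communication correction' (hypothetical interfaces,
# not the authors' code), assuming a broadcast setting in which each agent
# sends one message to every other agent at each time step.

def relabel_messages(private_obs, stored_msgs, comm_policies, order):
    """Relabel the messages of one sampled transition.

    private_obs   : list of per-agent private observations
    stored_msgs   : list of per-agent messages as recorded at collection time
    comm_policies : list of callables m_i = pi_i(obs_i, incoming_msgs), the
                    agents' *current* communication policies
    order         : agent indices; processing senders in this order is the
                    'ordered relabelling', so an agent that is both a receiver
                    and a sender recomputes its message from inputs that have
                    already been relabelled
    """
    msgs = list(stored_msgs)  # start from the stale, stored messages
    for i in order:
        incoming = [msgs[j] for j in range(len(msgs)) if j != i]
        msgs[i] = comm_policies[i](private_obs[i], incoming)
    return msgs


if __name__ == "__main__":
    # Toy usage: 3 agents, 4-dim private observations, 2-dim messages.
    rng = np.random.default_rng(0)
    n_agents, obs_dim, msg_dim = 3, 4, 2
    private_obs = [rng.normal(size=obs_dim) for _ in range(n_agents)]
    stored_msgs = [rng.normal(size=msg_dim) for _ in range(n_agents)]

    # Stand-in 'current' communication policies (in practice, the latest
    # actor networks of an off-policy learner such as MADDPG).
    def make_policy(w):
        return lambda obs, incoming: np.tanh(np.concatenate([obs, *incoming]) @ w)

    weights = [rng.normal(size=(obs_dim + (n_agents - 1) * msg_dim, msg_dim))
               for _ in range(n_agents)]
    comm_policies = [make_policy(w) for w in weights]

    corrected = relabel_messages(private_obs, stored_msgs, comm_policies,
                                 order=list(range(n_agents)))
    print([m.round(3) for m in corrected])
```
In an off-policy update, the corrected messages would be substituted into each receiver's observation before the critic and actor losses are computed, so the replayed experience better reflects what the receiver would observe under the senders' current policies.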
Related papers
- DCMAC: Demand-aware Customized Multi-Agent Communication via Upper Bound Training [9.068971933560416]
We propose a Demand-aware Customized Multi-Agent Communication protocol, which uses upper bound training to obtain the ideal policy.
Experimental results reveal that DCMAC significantly outperforms the baseline algorithms in both unconstrained and communication constrained scenarios.
arXiv Detail & Related papers (2024-09-11T09:23:27Z)
- Batch Selection and Communication for Active Learning with Edge Labeling [54.64985724916654]
This work introduces Communication-Constrained Bayesian Active Knowledge Distillation (CC-BAKD).
arXiv Detail & Related papers (2023-11-14T10:23:00Z)
- An In-Depth Analysis of Discretization Methods for Communication Learning using Backpropagation with Multi-Agent Reinforcement Learning [0.0]
This paper compares several state-of-the-art discretization methods as well as a novel approach.
We present COMA-DIAL, a communication learning approach based on DIAL and COMA extended with learning rate scaling and adapted exploration.
Our results show that the novel ST-DRU method, proposed in this paper, achieves the best results out of all discretization methods across the different environments.
arXiv Detail & Related papers (2023-08-09T13:13:19Z)
- AC2C: Adaptively Controlled Two-Hop Communication for Multi-Agent Reinforcement Learning [4.884877440051105]
We propose a novel communication protocol called Adaptively Controlled Two-Hop Communication (AC2C)
AC2C employs an adaptive two-hop communication strategy to enable long-range information exchange among agents to boost performance.
We evaluate AC2C on three cooperative multi-agent tasks, and the experimental results show that it outperforms relevant baselines with lower communication costs.
arXiv Detail & Related papers (2023-02-24T09:00:34Z)
- Scalable Communication for Multi-Agent Reinforcement Learning via Transformer-Based Email Mechanism [9.607941773452925]
Communication can impressively improve cooperation in multi-agent reinforcement learning (MARL).
We propose a novel framework Transformer-based Email Mechanism (TEM) to tackle the scalability problem of MARL communication for partially-observed tasks.
arXiv Detail & Related papers (2023-01-05T05:34:30Z)
- Over-communicate no more: Situated RL agents learn concise communication protocols [78.28898217947467]
It is unclear how to design artificial agents that can learn to effectively and efficiently communicate with each other.
Much research on communication emergence uses reinforcement learning (RL).
We explore situated communication in a multi-step task, where the acting agent has to forgo an environmental action to communicate.
We find that while all tested pressures can disincentivise over-communication, situated communication does it most effectively and, unlike the cost on effort, does not negatively impact emergence.
arXiv Detail & Related papers (2022-11-02T21:08:14Z)
- Certifiably Robust Policy Learning against Adversarial Communication in Multi-agent Systems [51.6210785955659]
Communication is important in many multi-agent reinforcement learning (MARL) problems for agents to share information and make good decisions.
However, when deploying trained communicative agents in a real-world application where noise and potential attackers exist, the safety of communication-based policies becomes a severe issue that is underexplored.
In this work, we consider an environment with $N$ agents, where the attacker may arbitrarily change the communication from any $C < \frac{N-1}{2}$ agents to a victim agent.
arXiv Detail & Related papers (2022-06-21T07:32:18Z)
- Coordinating Policies Among Multiple Agents via an Intelligent Communication Channel [81.39444892747512]
In Multi-Agent Reinforcement Learning (MARL), specialized channels are often introduced that allow agents to communicate directly with one another.
We propose an alternative approach whereby agents communicate through an intelligent facilitator that learns to sift through and interpret signals provided by all agents to improve the agents' collective performance.
arXiv Detail & Related papers (2022-05-21T14:11:33Z)
- PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
arXiv Detail & Related papers (2021-06-09T14:10:50Z)
- Learning to Communicate and Correct Pose Errors [75.03747122616605]
We study the setting proposed in V2VNet, where nearby self-driving vehicles jointly perform object detection and motion forecasting in a cooperative manner.
We propose a novel neural reasoning framework that learns to communicate, to estimate potential errors, and to reach a consensus about those errors.
arXiv Detail & Related papers (2020-11-10T18:19:40Z)
- Learning Individually Inferred Communication for Multi-Agent Cooperation [37.56115000150748]
We propose Individually Inferred Communication (I2C) to enable agents to learn a prior for agent-agent communication.
The prior knowledge is learned via causal inference and realized by a feed-forward neural network.
I2C can not only reduce communication overhead but also improve the performance in a variety of multi-agent cooperative scenarios (a minimal sketch of such a learned communication gate appears after this list).
arXiv Detail & Related papers (2020-06-11T14:07:57Z)
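As an illustration of the I2C entry above, here is a generic sketch of a feed-forward communication prior: a small network that gates whether one agent requests a message from another. The names, shapes, and the decision threshold are illustrative assumptions, and the causal-inference signal that I2C uses to train this gate is not shown.
```python
import numpy as np

# Generic sketch of a feed-forward "communication prior" in the spirit of I2C
# (hypothetical shapes and names; the causal-inference training signal is not
# shown). The gate maps an agent's local observation plus the one-hot id of a
# potential partner to the probability that communicating is worthwhile.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class CommPrior:
    def __init__(self, obs_dim, n_agents, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        in_dim = obs_dim + n_agents              # observation + one-hot partner id
        self.w1 = rng.normal(scale=0.1, size=(in_dim, hidden))
        self.w2 = rng.normal(scale=0.1, size=(hidden, 1))
        self.n_agents = n_agents

    def request_prob(self, obs, partner_id):
        one_hot = np.eye(self.n_agents)[partner_id]
        h = np.tanh(np.concatenate([obs, one_hot]) @ self.w1)
        return float(sigmoid(h @ self.w2))

# Toy usage: agent 0 decides whether to request messages from agents 1 and 2.
prior = CommPrior(obs_dim=4, n_agents=3)
obs0 = np.zeros(4)
requests = [j for j in (1, 2) if prior.request_prob(obs0, j) > 0.5]
```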
This list is automatically generated from the titles and abstracts of the papers on this site.