An Analysis of Discretization Methods for Communication Learning with
Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2204.05669v1
- Date: Tue, 12 Apr 2022 09:54:58 GMT
- Authors: Astrid Vanneste, Simon Vanneste, Kevin Mets, Tom De Schepper,
Siegfried Mercelis, Steven Latr\'e, Peter Hellinckx
- Abstract summary: We compare several state-of-the-art discretization methods as well as two methods that have not been used for communication learning before.
The best choice of discretization method depends greatly on the environment.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Communication is crucial in multi-agent reinforcement learning when agents
are not able to observe the full state of the environment. The most common
approach to allow learned communication between agents is the use of a
differentiable communication channel that allows gradients to flow between
agents as a form of feedback. However, this is challenging when we want to use
discrete messages to reduce the message size since gradients cannot flow
through a discrete communication channel. Previous work proposed methods to
deal with this problem. However, these methods are tested in different
communication learning architectures and environments, making it hard to
compare them. In this paper, we compare several state-of-the-art discretization
methods as well as two methods that have not been used for communication
learning before. We do this comparison in the context of communication learning
using gradients from other agents and perform tests on several environments.
Our results show that no single method is best in all environments; the best
choice of discretization method depends greatly on the environment. However,
the discretize regularize unit (DRU), the straight-through DRU and the
straight-through Gumbel softmax show the most consistent results across all
tested environments. These methods therefore prove to be the best choice for
general use, while the straight-through estimator and the Gumbel softmax may
provide better results in specific environments but fail completely in others.
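The discretization methods compared in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names, the noise standard deviation and the temperature below are illustrative assumptions, and the straight-through gradient trick itself requires an autodiff framework (here it is only described in comments, since NumPy has no gradients).

```python
import numpy as np

rng = np.random.default_rng(0)

def dru(logits, training, noise_std=2.0):
    """Discretize/Regularize Unit (sketch). During training, Gaussian
    channel noise followed by a sigmoid pushes messages toward near-binary
    values while keeping them differentiable; during execution the message
    is hard-thresholded at zero."""
    if training:
        noisy = logits + rng.normal(0.0, noise_std, size=logits.shape)
        return 1.0 / (1.0 + np.exp(-noisy))  # soft, differentiable message
    return (logits > 0.0).astype(np.float64)  # hard binary message

def gumbel_softmax(logits, tau=1.0, hard=True):
    """Straight-through Gumbel softmax (sketch, 1-D logits). Gumbel noise
    plus a temperature-scaled softmax gives a differentiable sample; with
    hard=True the forward output is one-hot. In an autodiff framework the
    straight-through version is y = y_soft + stop_gradient(y_hard - y_soft),
    so the forward pass is discrete but gradients follow y_soft."""
    g = rng.gumbel(size=logits.shape)
    y_soft = np.exp((logits + g) / tau)
    y_soft /= y_soft.sum()
    if hard:
        y_hard = np.zeros_like(y_soft)
        y_hard[np.argmax(y_soft)] = 1.0
        return y_hard
    return y_soft
```

The straight-through estimator referenced in the abstract follows the same pattern: threshold in the forward pass and copy the incoming gradient through unchanged in the backward pass.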
Related papers
- Accelerated Stochastic ExtraGradient: Mixing Hessian and Gradient Similarity to Reduce Communication in Distributed and Federated Learning [50.382793324572845]
Distributed computing involves communication between devices, which requires solving two key problems: efficiency and privacy.
In this paper, we analyze a new method that incorporates the ideas of using data similarity and clients sampling.
To address privacy concerns, we apply the technique of additional noise and analyze its impact on the convergence of the proposed method.
arXiv Detail & Related papers (2024-09-22T00:49:10Z) - No Regrets: Investigating and Improving Regret Approximations for Curriculum Discovery [53.08822154199948]
Unsupervised Environment Design (UED) methods have gained recent attention as their adaptive curricula promise to enable agents to be robust to in- and out-of-distribution tasks.
This work investigates how existing UED methods select training environments, focusing on task prioritisation metrics.
We develop a method that directly trains on scenarios with high learnability.
arXiv Detail & Related papers (2024-08-27T14:31:54Z) - Language Rectified Flow: Advancing Diffusion Language Generation with Probabilistic Flows [53.31856123113228]
This paper proposes Language Rectified Flow.
Our method is based on the reformulation of the standard probabilistic flow models.
Experiments and ablation studies demonstrate that our method can be general, effective, and beneficial for many NLP tasks.
arXiv Detail & Related papers (2024-03-25T17:58:22Z) - Fully Independent Communication in Multi-Agent Reinforcement Learning [4.470370168359807]
Multi-Agent Reinforcement Learning (MARL) comprises a broad area of research within the field of multi-agent systems.
We investigate how independent learners in MARL that do not share parameters can communicate.
Our results show that, despite the challenges, independent agents can still learn communication strategies following our method.
arXiv Detail & Related papers (2024-01-26T18:42:01Z) - An In-Depth Analysis of Discretization Methods for Communication
Learning using Backpropagation with Multi-Agent Reinforcement Learning [0.0]
This paper compares several state-of-the-art discretization methods as well as a novel approach.
We present COMA-DIAL, a communication learning approach based on DIAL and COMA extended with learning rate scaling and adapted exploration.
Our results show that the novel ST-DRU method, proposed in this paper, achieves the best results out of all discretization methods across the different environments.
arXiv Detail & Related papers (2023-08-09T13:13:19Z) - MA-Dreamer: Coordination and communication through shared imagination [5.253168177256072]
We present MA-Dreamer, a model-based method that uses both agent-centric and global differentiable models of the environment.
Our experiments show that in long-term speaker-listener tasks and in cooperative games with strong partial-observability, MA-Dreamer finds a solution that makes effective use of coordination.
arXiv Detail & Related papers (2022-04-10T13:54:26Z) - Learning Selective Communication for Multi-Agent Path Finding [18.703918339797283]
Decision Causal Communication (DCC) is a simple yet efficient model that enables agents to select which neighbors to communicate with.
DCC is suitable for decentralized execution to handle large scale problems.
arXiv Detail & Related papers (2021-09-12T03:07:20Z) - Fuzzy-Rough Nearest Neighbour Approaches for Emotion Detection in Tweets [1.7581155313656314]
Social media are an essential source of meaningful data that can be used in different tasks such as sentiment analysis and emotion recognition.
We develop an approach for the SemEval-2018 emotion detection task, based on the fuzzy-rough nearest neighbour (FRNN) classifier.
Our results are competitive with the best SemEval solutions based on more complicated deep learning methods.
arXiv Detail & Related papers (2021-07-08T12:52:47Z) - TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for
Unsupervised Sentence Embedding Learning [53.32740707197856]
We present a new state-of-the-art unsupervised method based on pre-trained Transformers and Sequential Denoising Auto-Encoder (TSDAE)
It can achieve up to 93.1% of the performance of in-domain supervised approaches.
arXiv Detail & Related papers (2021-04-14T17:02:18Z) - DEALIO: Data-Efficient Adversarial Learning for Imitation from
Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior, without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z) - Discriminative Nearest Neighbor Few-Shot Intent Detection by
Transferring Natural Language Inference [150.07326223077405]
Few-shot learning is attracting much attention to mitigate data scarcity.
We present a discriminative nearest neighbor classification with deep self-attention.
We propose to boost the discriminative ability by transferring a natural language inference (NLI) model.
arXiv Detail & Related papers (2020-10-25T00:39:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.