MA-Dreamer: Coordination and communication through shared imagination
- URL: http://arxiv.org/abs/2204.04687v1
- Date: Sun, 10 Apr 2022 13:54:26 GMT
- Title: MA-Dreamer: Coordination and communication through shared imagination
- Authors: Kenzo Lobos-Tsunekawa, Akshay Srinivasan, Michael Spranger
- Abstract summary: We present MA-Dreamer, a model-based method that uses both agent-centric and global differentiable models of the environment.
Our experiments show that in long-term speaker-listener tasks and in cooperative games with strong partial-observability, MA-Dreamer finds a solution that makes effective use of coordination.
- Score: 5.253168177256072
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-agent RL is difficult because of the non-stationary nature of the
environment as perceived by individual agents. Theoretically sound methods using
the REINFORCE estimator are impeded by its high variance, whereas
value-function based methods are affected by issues stemming from their ad-hoc
handling of situations like inter-agent communication. Methods like MADDPG are
further constrained by requirements such as centralized critics. In
order to address these issues, we present MA-Dreamer, a model-based method that
uses both agent-centric and global differentiable models of the environment in
order to train decentralized agents' policies and critics using model rollouts,
a.k.a. `imagination'. Since only the model training is done off-policy,
inter-agent communication/coordination and `language emergence' can be handled
in a straightforward manner. We compare the performance of MA-Dreamer with
other methods on two soccer-based games. Our experiments show that in long-term
speaker-listener tasks and in cooperative games with strong
partial-observability, MA-Dreamer finds a solution that makes effective use of
coordination, whereas competing methods obtain marginal scores and fail
outright, respectively. By effectively achieving coordination and communication
under more relaxed and general conditions, our method opens the door to the
study of more complex problems and population-based training.
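The abstract's contrast between the high-variance REINFORCE estimator and gradients taken through a differentiable model can be illustrated on a toy problem. The sketch below is not MA-Dreamer and uses no part of its code; it is a minimal illustration that assumes a one-dimensional Gaussian "policy" x ~ N(mu, 1) and a hypothetical objective f(x) = x^2, and estimates d/dmu E[f(x)] (true value 2*mu) in two ways: with the score-function (REINFORCE) estimator and with the pathwise estimator that differentiates through the sample.

```python
import random
import statistics


def gradient_samples(mu, n, seed=0):
    """Per-sample estimates of d/dmu E[x^2] for x ~ N(mu, 1).

    Returns two lists: score-function (REINFORCE) estimates and
    pathwise (differentiate-through-the-sample) estimates.
    """
    rng = random.Random(seed)
    score, path = [], []
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        x = mu + eps
        # REINFORCE: f(x) * d/dmu log N(x; mu, 1) = x^2 * (x - mu)
        score.append((x ** 2) * (x - mu))
        # Pathwise: d/dmu f(mu + eps) = 2 * (mu + eps)
        path.append(2.0 * x)
    return score, path


score, path = gradient_samples(mu=1.0, n=20000)
print("score-function mean/variance:",
      statistics.mean(score), statistics.variance(score))
print("pathwise       mean/variance:",
      statistics.mean(path), statistics.variance(path))
```

Both estimators are unbiased for the true gradient 2*mu = 2, but at mu = 1 the per-sample variance is analytically 30 for the score-function estimator versus 4 for the pathwise one, so the sample variances printed above should differ by a similar factor.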
Related papers
- PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development.
We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z)
- Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning [51.52387511006586]
We propose Hierarchical Opponent modeling and Planning (HOP), a novel multi-agent decision-making algorithm.
HOP is hierarchically composed of two modules: an opponent modeling module that infers others' goals and learns corresponding goal-conditioned policies.
HOP exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios.
arXiv Detail & Related papers (2024-06-12T08:48:06Z)
- Multi-Agent Reinforcement Learning-Based UAV Pathfinding for Obstacle Avoidance in Stochastic Environment [12.122881147337505]
We propose a novel centralized training with decentralized execution method based on multi-agent reinforcement learning.
In our approach, agents communicate only with the centralized planner to make decentralized decisions online.
We conduct multi-step value convergence in multi-agent reinforcement learning to enhance the training efficiency.
arXiv Detail & Related papers (2023-10-25T14:21:22Z)
- ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state, and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z)
- Centralized Training with Hybrid Execution in Multi-Agent Reinforcement Learning [7.163485179361718]
We introduce hybrid execution in multi-agent reinforcement learning (MARL), a new paradigm in which agents aim to successfully complete cooperative tasks with arbitrary communication levels at execution time.
We contribute MARO, an approach that makes use of an auto-regressive predictive model, trained in a centralized manner, to estimate missing agents' observations.
arXiv Detail & Related papers (2022-10-12T14:58:32Z)
- RACA: Relation-Aware Credit Assignment for Ad-Hoc Cooperation in Multi-Agent Deep Reinforcement Learning [55.55009081609396]
We propose a novel method, called Relation-Aware Credit Assignment (RACA), which achieves zero-shot generalization in ad-hoc cooperation scenarios.
RACA takes advantage of a graph-based relation encoder to encode the topological structure between agents.
Our method outperforms baseline methods on the StarCraftII micromanagement benchmark and ad-hoc cooperation scenarios.
arXiv Detail & Related papers (2022-06-02T03:39:27Z)
- Scalable Multi-Agent Model-Based Reinforcement Learning [1.95804735329484]
We propose a new method called MAMBA which utilizes Model-Based Reinforcement Learning (MBRL) to further leverage centralized training in cooperative environments.
We argue that communication between agents is enough to sustain a world model for each agent during the execution phase, while imaginary rollouts can be used for training, removing the necessity to interact with the environment.
arXiv Detail & Related papers (2022-05-25T08:35:00Z)
- Distributed Adaptive Learning Under Communication Constraints [54.22472738551687]
This work examines adaptive distributed learning strategies designed to operate under communication constraints.
We consider a network of agents that must solve an online optimization problem from continual observation of streaming data.
arXiv Detail & Related papers (2021-12-03T19:23:48Z)
- Relative Distributed Formation and Obstacle Avoidance with Multi-agent Reinforcement Learning [20.401609420707867]
We propose a distributed formation and obstacle avoidance method based on multi-agent reinforcement learning (MARL).
Our method achieves better performance regarding formation error, formation convergence rate and on-par success rate of obstacle avoidance compared with baselines.
arXiv Detail & Related papers (2021-11-14T13:02:45Z)
- Learning Selective Communication for Multi-Agent Path Finding [18.703918339797283]
Decision Causal Communication (DCC) is a simple yet efficient model to enable agents to select neighbors to conduct communication.
DCC is suitable for decentralized execution to handle large scale problems.
arXiv Detail & Related papers (2021-09-12T03:07:20Z)
- F2A2: Flexible Fully-decentralized Approximate Actor-critic for Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible, fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent settings.
Our framework can achieve scalability and stability in large-scale environments and reduce information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.