Centralized Training with Hybrid Execution in Multi-Agent Reinforcement
Learning
- URL: http://arxiv.org/abs/2210.06274v2
- Date: Mon, 5 Jun 2023 17:35:53 GMT
- Title: Centralized Training with Hybrid Execution in Multi-Agent Reinforcement
Learning
- Authors: Pedro P. Santos, Diogo S. Carvalho, Miguel Vasco, Alberto Sardinha,
Pedro A. Santos, Ana Paiva, Francisco S. Melo
- Abstract summary: We introduce hybrid execution in multi-agent reinforcement learning (MARL), a new paradigm in which agents aim to successfully complete cooperative tasks with arbitrary communication levels at execution time.
We contribute MARO, an approach that makes use of an auto-regressive predictive model, trained in a centralized manner, to estimate missing agents' observations.
- Score: 7.163485179361718
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce hybrid execution in multi-agent reinforcement learning (MARL), a
new paradigm in which agents aim to successfully complete cooperative tasks
with arbitrary communication levels at execution time by taking advantage of
information-sharing among the agents. Under hybrid execution, the communication
level can range from a setting in which no communication is allowed between
agents (fully decentralized), to a setting featuring full communication (fully
centralized), but the agents do not know beforehand which communication level
they will encounter at execution time. To formalize our setting, we define a
new class of multi-agent partially observable Markov decision processes
(POMDPs) that we name hybrid-POMDPs, which explicitly model a communication
process between the agents. We contribute MARO, an approach that makes use of
an auto-regressive predictive model, trained in a centralized manner, to
estimate missing agents' observations at execution time. We evaluate MARO on
standard scenarios and extensions of previous benchmarks tailored to emphasize
the negative impact of partial observability in MARL. Experimental results show
that our method consistently outperforms relevant baselines, allowing agents to
act with faulty communication while successfully exploiting shared information.
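The abstract describes MARO only at a high level. As a rough, assumption-laden sketch, the snippet below shows one way a centrally trained recurrent predictor could fill in observations that failed to arrive at execution time; the module layout, dimensions, masking scheme, and all names here are illustrative, not the paper's actual design.

```python
# Minimal sketch of the idea behind MARO (NOT the authors' architecture):
# a recurrent model, trained centrally on full joint observations, predicts
# the observations of agents whose messages were dropped at execution time.
import torch
import torch.nn as nn

class ObservationPredictor(nn.Module):
    def __init__(self, n_agents: int, obs_dim: int, hidden_dim: int = 64):
        super().__init__()
        joint_dim = n_agents * obs_dim
        # input: zero-filled joint observation + a mask of who was heard from
        self.rnn = nn.GRU(joint_dim + n_agents, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, joint_dim)

    def forward(self, joint_obs, mask, h=None):
        # joint_obs: (batch, n_agents * obs_dim), missing slots zeroed
        # mask:      (batch, n_agents), 1 where the observation arrived
        x = torch.cat([joint_obs, mask], dim=-1).unsqueeze(1)
        out, h = self.rnn(x, h)
        return self.head(out.squeeze(1)), h

model = ObservationPredictor(n_agents=3, obs_dim=8)
mask = torch.tensor([[1.0, 0.0, 1.0]])          # agent 1's message was lost
slot = mask.repeat_interleave(8, dim=-1)        # expand mask to obs slots
obs = torch.randn(1, 3 * 8) * slot              # zero out the missing slot
pred, h = model(obs, mask)
filled = torch.where(slot.bool(), obs, pred)    # act on predicted values
```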
Related papers
- Communication Learning in Multi-Agent Systems from Graph Modeling Perspective [62.13508281188895]
We introduce a novel approach wherein we conceptualize the communication architecture among agents as a learnable graph.
We introduce a temporal gating mechanism for each agent, enabling dynamic decisions on whether to receive shared information at a given time; a sketch of this gating idea follows the entry.
arXiv Detail & Related papers (2024-11-01T05:56:51Z)
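As an illustration of the temporal gating idea mentioned above, the sketch below has each agent learn a scalar gate from its own hidden state that decides how much of the aggregated incoming messages to admit at the current step. Names, shapes, and the mean-pool aggregation are assumptions, not the paper's implementation.

```python
# Hypothetical temporal gate: each agent decides, per timestep, whether to
# take in the information shared over the (learnable) communication graph.
import torch
import torch.nn as nn

class TemporalGate(nn.Module):
    def __init__(self, hidden_dim: int, msg_dim: int):
        super().__init__()
        self.gate = nn.Linear(hidden_dim, 1)   # scalar receive/ignore decision
        self.proj = nn.Linear(msg_dim, hidden_dim)

    def forward(self, h, incoming_msgs):
        # h: (batch, hidden_dim) agent state
        # incoming_msgs: (batch, n_neighbors, msg_dim) neighbor messages
        g = torch.sigmoid(self.gate(h))        # in (0, 1)
        pooled = incoming_msgs.mean(dim=1)     # aggregate neighbor messages
        return h + g * self.proj(pooled)       # gated information intake

gate = TemporalGate(hidden_dim=32, msg_dim=16)
h_next = gate(torch.randn(2, 32), torch.randn(2, 5, 16))
```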
- DCMAC: Demand-aware Customized Multi-Agent Communication via Upper Bound Training [9.068971933560416]
We propose a Demand-aware Customized Multi-Agent Communication protocol, which uses upper-bound training to obtain the ideal policy.
Experimental results reveal that DCMAC significantly outperforms the baseline algorithms in both unconstrained and communication-constrained scenarios.
arXiv Detail & Related papers (2024-09-11T09:23:27Z)
- Generalising Multi-Agent Cooperation through Task-Agnostic Communication [7.380444448047908]
Existing communication methods for multi-agent reinforcement learning (MARL) in cooperative multi-robot problems are almost exclusively task-specific, training new communication strategies for each unique task.
We address this inefficiency by introducing a communication strategy applicable to any task within a given environment.
Our objective is to learn a fixed-size latent Markov state from a variable number of agent observations.
Our method enables seamless adaptation to novel tasks without fine-tuning the communication strategy, gracefully supports scaling to more agents than were present during training, and detects out-of-distribution events in an environment (see the sketch after this entry).
arXiv Detail & Related papers (2024-03-11T14:20:13Z)
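A natural way to obtain a fixed-size latent state from a variable number of agent observations, as the entry above targets, is a permutation-invariant set encoder: embed each observation independently, then pool. The sketch below is illustrative only and is not the paper's architecture.

```python
# Hypothetical set encoder: mean-pooling over per-agent embeddings yields a
# fixed-size latent regardless of how many agents are present.
import torch
import torch.nn as nn

class SetEncoder(nn.Module):
    def __init__(self, obs_dim: int, latent_dim: int):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, latent_dim))

    def forward(self, observations):
        # observations: (n_agents, obs_dim); n_agents may vary per call
        return self.phi(observations).mean(dim=0)

enc = SetEncoder(obs_dim=8, latent_dim=16)
z3 = enc(torch.randn(3, 8))    # 3 agents
z7 = enc(torch.randn(7, 8))    # 7 agents, same latent size
assert z3.shape == z7.shape == (16,)
```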
- ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easy to integrate into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z)
- Multi-Agent Coordination via Multi-Level Communication [29.388570369796586]
In this paper, we propose a novel multi-level communication scheme, Sequential Communication (SeqComm).
arXiv Detail & Related papers (2022-09-26T14:08:03Z)
- RACA: Relation-Aware Credit Assignment for Ad-Hoc Cooperation in Multi-Agent Deep Reinforcement Learning [55.55009081609396]
We propose a novel method, called Relation-Aware Credit Assignment (RACA), which achieves zero-shot generalization in ad-hoc cooperation scenarios.
RACA takes advantage of a graph-based relation encoder to encode the topological structure between agents.
Our method outperforms baseline methods on the StarCraft II micromanagement benchmark and in ad-hoc cooperation scenarios.
arXiv Detail & Related papers (2022-06-02T03:39:27Z)
- Scalable Multi-Agent Model-Based Reinforcement Learning [1.95804735329484]
We propose a new method called MAMBA, which utilizes Model-Based Reinforcement Learning (MBRL) to further leverage centralized training in cooperative environments.
We argue that communication between agents is enough to sustain a world model for each agent during the execution phase, while imaginary rollouts can be used for training, removing the need to interact with the environment; a sketch of this loop follows the entry.
arXiv Detail & Related papers (2022-05-25T08:35:00Z)
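The MAMBA entry above rests on training policies from "imaginary" rollouts of a learned world model rather than fresh environment steps. The stub below shows that loop in its simplest form; the linear dynamics, the `imagine` helper, and all sizes are hypothetical stand-ins, not MAMBA's actual model.

```python
# Schematic model-based loop: roll the policy forward inside a learned world
# model to generate synthetic training data, with no environment interaction.
import torch
import torch.nn as nn

class WorldModel(nn.Module):
    def __init__(self, state_dim: int, action_dim: int):
        super().__init__()
        self.dynamics = nn.Linear(state_dim + action_dim, state_dim)
        self.reward = nn.Linear(state_dim + action_dim, 1)

    def step(self, state, action):
        x = torch.cat([state, action], dim=-1)
        return self.dynamics(x), self.reward(x)

def imagine(model, policy, state, horizon=5):
    trajectory = []
    for _ in range(horizon):
        action = policy(state)
        state, reward = model.step(state, action)
        trajectory.append((state, action, reward))
    return trajectory

wm = WorldModel(state_dim=6, action_dim=2)
policy = nn.Linear(6, 2)                       # stand-in policy network
traj = imagine(wm, policy, torch.randn(1, 6))  # 5 imagined transitions
```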
- Coordinating Policies Among Multiple Agents via an Intelligent Communication Channel [81.39444892747512]
In Multi-Agent Reinforcement Learning (MARL), specialized channels are often introduced that allow agents to communicate directly with one another.
We propose an alternative approach whereby agents communicate through an intelligent facilitator that learns to sift through and interpret signals provided by all agents to improve the agents' collective performance.
arXiv Detail & Related papers (2022-05-21T14:11:33Z)
- A Decentralized Communication Framework based on Dual-Level Recurrence for Multi-Agent Reinforcement Learning [5.220940151628735]
We present a dual-level recurrent communication framework for multi-agent systems.
The first recurrence occurs in the communication sequence and is used to transmit communication data among agents.
The second recurrence is based on the time sequence and combines the historical observations of each agent; a sketch of the two recurrences follows this entry.
arXiv Detail & Related papers (2022-02-22T01:36:59Z)
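The dual-level recurrence described above can be pictured as two recurrent passes: one over the communication sequence (agent to agent within a timestep) and one over the time sequence (per agent across timesteps). The wiring below is a schematic assumption, not the published framework.

```python
# Schematic dual-level recurrence: a GRU over the agent/communication axis,
# then a GRU cell over time for each agent's history.
import torch
import torch.nn as nn

n_agents, obs_dim, hidden = 4, 8, 32
comm_rnn = nn.GRU(obs_dim, hidden, batch_first=True)  # recurrence 1: agents
time_rnn = nn.GRUCell(hidden, hidden)                 # recurrence 2: time

obs_t = torch.randn(1, n_agents, obs_dim)   # observations at one timestep
h_time = torch.zeros(n_agents, hidden)      # per-agent temporal memory

# First recurrence: pass information along the agent sequence.
comm_out, _ = comm_rnn(obs_t)               # (1, n_agents, hidden)
# Second recurrence: fold the communicated summary into each agent's history.
h_time = time_rnn(comm_out.squeeze(0), h_time)
```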
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn).
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features; that decomposition is sketched after this entry.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
arXiv Detail & Related papers (2020-10-06T19:08:47Z)
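The "linear decomposition of universal successor features" mentioned above follows the standard successor-feature form Q(s, a; w) = psi(s, a) . w, where psi predicts expected discounted features and the task weights w vary across related tasks. The rendering below is schematic; the network and sizes are assumptions.

```python
# Schematic universal successor features: related tasks share psi and differ
# only in the task weight vector w, so Q-values decompose linearly.
import torch
import torch.nn as nn

class USF(nn.Module):
    def __init__(self, state_dim: int, n_actions: int, feat_dim: int):
        super().__init__()
        self.n_actions, self.feat_dim = n_actions, feat_dim
        self.psi = nn.Linear(state_dim, n_actions * feat_dim)

    def q_values(self, state, w):
        # state: (batch, state_dim); w: (feat_dim,) task weight vector
        psi = self.psi(state).view(-1, self.n_actions, self.feat_dim)
        return psi @ w   # (batch, n_actions)

usf = USF(state_dim=10, n_actions=4, feat_dim=6)
q = usf.q_values(torch.randn(2, 10), torch.randn(6))  # one Q per action
```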
- Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning [55.20040781688844]
QMIX is a novel value-based method that can train decentralised policies in a centralised end-to-end fashion; its monotonic mixing is sketched after this entry.
We propose the StarCraft Multi-Agent Challenge (SMAC) as a new benchmark for deep multi-agent reinforcement learning.
arXiv Detail & Related papers (2020-03-19T16:51:51Z)
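QMIX's central idea is a mixing network that combines per-agent Q-values into a joint value while staying monotone in each of them, which the original method enforces by generating non-negative mixing weights from the global state. The sketch below is a simplified single-layer mixer in that spirit, not the full published architecture.

```python
# Simplified monotonic mixer in the spirit of QMIX: hypernetworks generate
# state-conditioned weights, and abs() keeps Q_tot monotone in each agent's Q.
import torch
import torch.nn as nn

class Mixer(nn.Module):
    def __init__(self, n_agents: int, state_dim: int):
        super().__init__()
        self.hyper_w = nn.Linear(state_dim, n_agents)  # mixing weights
        self.hyper_b = nn.Linear(state_dim, 1)         # state-dependent bias

    def forward(self, agent_qs, state):
        # agent_qs: (batch, n_agents); state: (batch, state_dim)
        w = torch.abs(self.hyper_w(state))             # non-negative weights
        b = self.hyper_b(state)
        return (agent_qs * w).sum(dim=1, keepdim=True) + b  # Q_tot: (batch, 1)

mixer = Mixer(n_agents=3, state_dim=12)
q_tot = mixer(torch.randn(5, 3), torch.randn(5, 12))
```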