Centralized Training with Hybrid Execution in Multi-Agent Reinforcement
Learning
- URL: http://arxiv.org/abs/2210.06274v2
- Date: Mon, 5 Jun 2023 17:35:53 GMT
- Title: Centralized Training with Hybrid Execution in Multi-Agent Reinforcement
Learning
- Authors: Pedro P. Santos, Diogo S. Carvalho, Miguel Vasco, Alberto Sardinha,
Pedro A. Santos, Ana Paiva, Francisco S. Melo
- Abstract summary: We introduce hybrid execution in multi-agent reinforcement learning (MARL), a new paradigm in which agents aim to successfully complete cooperative tasks with arbitrary communication levels at execution time.
We contribute MARO, an approach that makes use of an auto-regressive predictive model, trained in a centralized manner, to estimate missing agents' observations.
- Score: 7.163485179361718
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce hybrid execution in multi-agent reinforcement learning (MARL), a
new paradigm in which agents aim to successfully complete cooperative tasks
with arbitrary communication levels at execution time by taking advantage of
information-sharing among the agents. Under hybrid execution, the communication
level can range from a setting in which no communication is allowed between
agents (fully decentralized), to a setting featuring full communication (fully
centralized), but the agents do not know beforehand which communication level
they will encounter at execution time. To formalize our setting, we define a
new class of multi-agent partially observable Markov decision processes
(POMDPs) that we name hybrid-POMDPs, which explicitly model a communication
process between the agents. We contribute MARO, an approach that makes use of
an auto-regressive predictive model, trained in a centralized manner, to
estimate missing agents' observations at execution time. We evaluate MARO on
standard scenarios and extensions of previous benchmarks tailored to emphasize
the negative impact of partial observability in MARL. Experimental results show
that our method consistently outperforms relevant baselines, allowing agents to
act with faulty communication while successfully exploiting shared information.
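The idea of estimating missing observations auto-regressively can be illustrated with a small sketch. This is not the authors' MARO implementation: the function `fill_missing`, its arguments, the zero initial estimate, and the toy "persistence" predictor (which simply repeats the previous estimate) are all hypothetical stand-ins, assuming that a learned predictive model would take the place of `predict_next`.

```python
import numpy as np

def fill_missing(observations, received, predict_next):
    """Replace missing per-agent observations with model predictions.

    observations: array (T, n_agents, obs_dim) of true observations.
    received:     bool array (T, n_agents); False means that agent's
                  observation did not arrive at that step.
    predict_next: callable mapping the previous joint estimate
                  (n_agents, obs_dim) to a predicted joint observation.
                  Hypothetical stand-in for a learned predictive model.
    """
    T, n_agents, obs_dim = observations.shape
    estimates = np.zeros_like(observations, dtype=float)
    prev = np.zeros((n_agents, obs_dim))  # assumed initial estimate
    for t in range(T):
        predicted = predict_next(prev)
        mask = received[t][:, None]  # broadcast over obs_dim
        # Use the real observation when it arrived, else the prediction.
        estimates[t] = np.where(mask, observations[t], predicted)
        # Auto-regressive step: the estimate is fed back as the next input.
        prev = estimates[t]
    return estimates

# Toy run: 3 steps, 2 agents, 2-dimensional observations.
obs = np.arange(12, dtype=float).reshape(3, 2, 2)
received = np.array([[True, True],     # full communication
                     [True, False],    # agent 1 drops out
                     [False, False]])  # no communication
est = fill_missing(obs, received, lambda prev: prev)
print(est[1])  # agent 0 observed, agent 1 carried over from step 0
```

In this toy run the communication level changes at every step, which mirrors the setting described above where agents do not know in advance which level they will face. Feeding estimates back into the predictor is what makes the scheme auto-regressive; a persistence predictor is only a placeholder and would degrade quickly over long communication gaps.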
Related papers
- Generalising Multi-Agent Cooperation through Task-Agnostic Communication [7.380444448047908]
Existing communication methods for multi-agent reinforcement learning (MARL) in cooperative multi-robot problems are almost exclusively task-specific, training new communication strategies for each unique task.
We address this inefficiency by introducing a communication strategy applicable to any task within a given environment.
Our objective is to learn a fixed-size latent Markov state from a variable number of agent observations.
Our method enables seamless adaptation to novel tasks without fine-tuning the communication strategy, gracefully supports scaling to more agents than present during training, and detects out-of-distribution events in an environment.
arXiv Detail & Related papers (2024-03-11T14:20:13Z)
- Learning to Use Tools via Cooperative and Interactive Agents [58.77710337157665]
Tool learning empowers large language models (LLMs) as agents to use external tools and extend their utility.
We propose ConAgents, a Cooperative and interactive Agents framework, which coordinates three specialized agents for tool selection, tool execution, and action calibration separately.
Our experiments on three datasets show that the LLMs, when equipped with ConAgents, outperform baselines with substantial improvement.
arXiv Detail & Related papers (2024-03-05T15:08:16Z)
- Enhancing Multi-Agent Coordination through Common Operating Picture Integration [14.927199437011044]
We present an approach to multi-agent coordination, where each agent is equipped with the capability to integrate its history of observations, actions and messages received into a Common Operating Picture (COP).
Our results demonstrate the efficacy of COP integration, and show that COP-based training leads to more robust policies than state-of-the-art Multi-Agent Reinforcement Learning (MARL) methods when faced with out-of-distribution initial states.
arXiv Detail & Related papers (2023-11-08T15:08:55Z)
- ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state, and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z)
- RACA: Relation-Aware Credit Assignment for Ad-Hoc Cooperation in Multi-Agent Deep Reinforcement Learning [55.55009081609396]
We propose a novel method, called Relation-Aware Credit Assignment (RACA), which achieves zero-shot generalization in ad-hoc cooperation scenarios.
RACA takes advantage of a graph-based relation encoder to encode the topological structure between agents.
Our method outperforms baseline methods on the StarCraftII micromanagement benchmark and ad-hoc cooperation scenarios.
arXiv Detail & Related papers (2022-06-02T03:39:27Z)
- Scalable Multi-Agent Model-Based Reinforcement Learning [1.95804735329484]
We propose a new method called MAMBA which utilizes Model-Based Reinforcement Learning (MBRL) to further leverage centralized training in cooperative environments.
We argue that communication between agents is enough to sustain a world model for each agent during the execution phase, while imaginary rollouts can be used for training, removing the necessity to interact with the environment.
arXiv Detail & Related papers (2022-05-25T08:35:00Z)
- Coordinating Policies Among Multiple Agents via an Intelligent Communication Channel [81.39444892747512]
In Multi-Agent Reinforcement Learning (MARL), specialized channels are often introduced that allow agents to communicate directly with one another.
We propose an alternative approach whereby agents communicate through an intelligent facilitator that learns to sift through and interpret signals provided by all agents to improve the agents' collective performance.
arXiv Detail & Related papers (2022-05-21T14:11:33Z)
- Depthwise Convolution for Multi-Agent Communication with Enhanced Mean-Field Approximation [9.854975702211165]
We propose a new method based on local communication learning to tackle the multi-agent RL (MARL) challenge.
First, we design a new communication protocol that exploits the ability of depthwise convolution to efficiently extract local relations.
Second, we introduce the mean-field approximation into our method to reduce the scale of agent interactions.
arXiv Detail & Related papers (2022-03-06T07:42:43Z)
- A Decentralized Communication Framework based on Dual-Level Recurrence for Multi-Agent Reinforcement Learning [5.220940151628735]
We present a dual-level recurrent communication framework for multi-agent systems.
The first recurrence occurs in the communication sequence and is used to transmit communication data among agents.
The second recurrence is based on the time sequence and combines the historical observations for each agent.
arXiv Detail & Related papers (2022-02-22T01:36:59Z)
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn).
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
arXiv Detail & Related papers (2020-10-06T19:08:47Z)
- Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning [55.20040781688844]
QMIX is a novel value-based method that can train decentralised policies in a centralised end-to-end fashion.
We propose the StarCraft Multi-Agent Challenge (SMAC) as a new benchmark for deep multi-agent reinforcement learning.
arXiv Detail & Related papers (2020-03-19T16:51:51Z)