Communicating Plans, Not Percepts: Scalable Multi-Agent Coordination with Embodied World Models
- URL: http://arxiv.org/abs/2508.02912v3
- Date: Tue, 04 Nov 2025 06:42:28 GMT
- Title: Communicating Plans, Not Percepts: Scalable Multi-Agent Coordination with Embodied World Models
- Authors: Brennen A. Hill, Mant Koh En Wei, Thangavel Jishnuanandh,
- Abstract summary: A central question in Multi-Agent Reinforcement Learning (MARL) is whether to engineer communication protocols or learn them end-to-end.<n>We propose and compare two communication strategies for a cooperative task-allocation problem.<n>Our experiments reveal that while emergent communication is viable in simple settings, the engineered, world model-based approach shows superior performance, sample efficiency, and scalability as complexity increases.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robust coordination is critical for effective decision-making in multi-agent systems, especially under partial observability. A central question in Multi-Agent Reinforcement Learning (MARL) is whether to engineer communication protocols or learn them end-to-end. We investigate this dichotomy using embodied world models. We propose and compare two communication strategies for a cooperative task-allocation problem. The first, Learned Direct Communication (LDC), learns a protocol end-to-end. The second, Intention Communication, uses an engineered inductive bias: a compact, learned world model, the Imagined Trajectory Generation Module (ITGM), which uses the agent's own policy to simulate future states. A Message Generation Network (MGN) then compresses this plan into a message. We evaluate these approaches on goal-directed interaction in a grid world, a canonical abstraction for embodied AI problems, while scaling environmental complexity. Our experiments reveal that while emergent communication is viable in simple settings, the engineered, world model-based approach shows superior performance, sample efficiency, and scalability as complexity increases. These findings advocate for integrating structured, predictive models into MARL agents to enable active, goal-driven coordination.
Related papers
- The Five Ws of Multi-Agent Communication: Who Talks to Whom, When, What, and Why -- A Survey from MARL to Emergent Language and LLMs [17.78378587855649]
This survey reviews multi-agent communication (MA-Comm) through the Five Ws: who communicates with whom, what is communicated, when communication occurs, and why communication is beneficial.<n>We trace how communication approaches have evolved across three major paradigms.<n>We distill practical design patterns and open challenges to support future hybrid systems that combine learning, language, and control for scalable and interpretable multi-agent collaboration.
arXiv Detail & Related papers (2026-02-12T05:07:50Z) - Learning to Interact in World Latent for Team Coordination [53.51290193631586]
This work presents a novel representation learning framework, interactive world latent (IWoL), to facilitate team coordination in multi-agent reinforcement learning (MARL)<n>Our key insight is to construct a learnable representation space that jointly captures inter-agent relations and task-specific world information by directly modeling communication protocols.<n>Our representation can be used not only as an implicit latent for each agent, but also as an explicit message for communication.
arXiv Detail & Related papers (2025-09-29T22:13:39Z) - AgentMaster: A Multi-Agent Conversational Framework Using A2A and MCP Protocols for Multimodal Information Retrieval and Analysis [0.0]
We present a pilot study of AgentMaster, a novel modular multi-protocol MAS framework with self-implemented A2A and MCP.<n>The system supports natural language interaction without prior technical expertise and responds to multimodal queries for tasks including information retrieval, question answering, and image analysis.<n>Overall, our proposed framework contributes to the potential capabilities of domain-specific, cooperative, and scalable conversational AI powered by MAS.
arXiv Detail & Related papers (2025-07-08T03:34:26Z) - AnyMAC: Cascading Flexible Multi-Agent Collaboration via Next-Agent Prediction [77.62279834617475]
We propose a new framework that rethinks multi-agent coordination through a sequential structure rather than a graph structure.<n>Our method focuses on two key directions: (1) Next-Agent Prediction, which selects the most suitable agent role at each step, and (2) Next-Context Selection, which enables each agent to selectively access relevant information from any previous step.
arXiv Detail & Related papers (2025-06-21T18:34:43Z) - A Survey of AI Agent Protocols [35.431057321412354]
There is no standard way for large language models (LLMs) agents to communicate with external tools or data sources.<n>This lack of standardized protocols makes it difficult for agents to work together or scale effectively.<n>A unified communication protocol for LLM agents could change this.
arXiv Detail & Related papers (2025-04-23T14:07:26Z) - Exponential Topology-enabled Scalable Communication in Multi-agent Reinforcement Learning [9.48183472865413]
We develop a scalable communication protocol for cooperative multi-agent reinforcement learning (MARL)<n>We propose utilizing the exponential topology to enable rapid information dissemination among agents by leveraging its small-diameter and small-size properties.<n>Experiments on large-scale cooperative benchmarks, including MAgent and Infrastructure Management Planning, demonstrate the superior performance and robust zero-shot transferability of ExpoComm.
arXiv Detail & Related papers (2025-02-27T03:15:31Z) - Token Communications: A Large Model-Driven Framework for Cross-modal Context-aware Semantic Communications [78.80966346820553]
We introduce token communications (TokCom), a large model-driven framework to leverage cross-modal context information in generative semantic communications (GenSC)<n>In this paper, we introduce the potential opportunities and challenges of leveraging context in GenSC, explore how to integrate GFM/MLLMs-based token processing into semantic communication systems, present the key principles for efficient TokCom at various layers in future wireless networks.
arXiv Detail & Related papers (2025-02-17T18:14:18Z) - Networked Agents in the Dark: Team Value Learning under Partial Observability [3.8779763612314633]
We propose a novel cooperative multi-agent reinforcement learning (MARL) approach for networked agents.<n>In contrast to previous methods that rely on complete state information or joint observations, our agents must learn how to reach shared objectives under partial observability.<n>During training, they collect individual rewards and approximate a team value function through local communication, resulting in cooperative behavior.
arXiv Detail & Related papers (2025-01-15T13:01:32Z) - Communication Learning in Multi-Agent Systems from Graph Modeling Perspective [62.13508281188895]
We introduce a novel approach wherein we conceptualize the communication architecture among agents as a learnable graph.
We introduce a temporal gating mechanism for each agent, enabling dynamic decisions on whether to receive shared information at a given time.
arXiv Detail & Related papers (2024-11-01T05:56:51Z) - Learning Multi-Agent Communication from Graph Modeling Perspective [62.13508281188895]
We introduce a novel approach wherein we conceptualize the communication architecture among agents as a learnable graph.
Our proposed approach, CommFormer, efficiently optimize the communication graph and concurrently refines architectural parameters through gradient descent in an end-to-end manner.
arXiv Detail & Related papers (2024-05-14T12:40:25Z) - Generalising Multi-Agent Cooperation through Task-Agnostic Communication [7.380444448047908]
Existing communication methods for multi-agent reinforcement learning (MARL) in cooperative multi-robot problems are almost exclusively task-specific, training new communication strategies for each unique task.
We address this inefficiency by introducing a communication strategy applicable to any task within a given environment.
Our objective is to learn a fixed-size latent Markov state from a variable number of agent observations.
Our method enables seamless adaptation to novel tasks without fine-tuning the communication strategy, gracefully supports scaling to more agents than present during training, and detects out-of-distribution events in an environment.
arXiv Detail & Related papers (2024-03-11T14:20:13Z) - Learning Multi-Agent Communication with Contrastive Learning [3.816854668079928]
We introduce an alternative perspective where communicative messages are considered as different incomplete views of the environment state.
By examining the relationship between messages sent and received, we propose to learn to communicate using contrastive learning.
In communication-essential environments, our method outperforms previous work in both performance and learning speed.
arXiv Detail & Related papers (2023-07-03T23:51:05Z) - Centralized Training with Hybrid Execution in Multi-Agent Reinforcement
Learning [7.163485179361718]
We introduce hybrid execution in multi-agent reinforcement learning (MARL)
MARL is a new paradigm in which agents aim to successfully complete cooperative tasks with arbitrary communication levels at execution time.
We contribute MARO, an approach that makes use of an auto-regressive predictive model, trained in a centralized manner, to estimate missing agents' observations.
arXiv Detail & Related papers (2022-10-12T14:58:32Z) - Cooperative Policy Learning with Pre-trained Heterogeneous Observation
Representations [51.8796674904734]
We propose a new cooperative learning framework with pre-trained heterogeneous observation representations.
We employ an encoder-decoder based graph attention to learn the intricate interactions and heterogeneous representations.
arXiv Detail & Related papers (2020-12-24T04:52:29Z) - The Emergence of Adversarial Communication in Multi-Agent Reinforcement
Learning [6.18778092044887]
Many real-world problems require the coordination of multiple autonomous agents.
Recent work has shown the promise of Graph Neural Networks (GNNs) to learn explicit communication strategies that enable complex multi-agent coordination.
We show how a single self-interested agent is capable of learning highly manipulative communication strategies that allows it to significantly outperform a cooperative team of agents.
arXiv Detail & Related papers (2020-08-06T12:48:08Z) - Communication-Efficient and Distributed Learning Over Wireless Networks:
Principles and Applications [55.65768284748698]
Machine learning (ML) is a promising enabler for the fifth generation (5G) communication systems and beyond.
This article aims to provide a holistic overview of relevant communication and ML principles, and thereby present communication-efficient and distributed learning frameworks with selected use cases.
arXiv Detail & Related papers (2020-08-06T12:37:14Z) - Model-based Reinforcement Learning for Decentralized Multiagent
Rendezvous [66.6895109554163]
Underlying the human ability to align goals with other agents is their ability to predict the intentions of others and actively update their own plans.
We propose hierarchical predictive planning (HPP), a model-based reinforcement learning method for decentralized multiagent rendezvous.
arXiv Detail & Related papers (2020-03-15T19:49:20Z) - Learning Structured Communication for Multi-agent Reinforcement Learning [104.64584573546524]
This work explores the large-scale multi-agent communication mechanism under a multi-agent reinforcement learning (MARL) setting.
We propose a novel framework termed as Learning Structured Communication (LSC) by using a more flexible and efficient communication topology.
arXiv Detail & Related papers (2020-02-11T07:19:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.