Multi-Agent Conditional Diffusion Model with Mean Field Communication as Wireless Resource Allocation Planner
- URL: http://arxiv.org/abs/2510.22969v1
- Date: Mon, 27 Oct 2025 03:42:18 GMT
- Title: Multi-Agent Conditional Diffusion Model with Mean Field Communication as Wireless Resource Allocation Planner
- Authors: Kechen Meng, Sinuo Zhang, Rongpeng Li, Xiangming Meng, Chan Wang, Ming Lei, Zhifeng Zhao
- Abstract summary: In wireless communication systems, efficient and adaptive resource allocation plays a crucial role in enhancing Quality of Service (QoS). In contrast to centralized frameworks, the Distributed Training with Decentralized Execution (DTDE) paradigm enables distributed learning and decision-making. We propose the Multi-Agent Conditional Diffusion Model Planner (MA-CDMP) for decentralized communication resource management.
- Score: 16.759740918605768
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In wireless communication systems, efficient and adaptive resource allocation plays a crucial role in enhancing overall Quality of Service (QoS). While centralized Multi-Agent Reinforcement Learning (MARL) frameworks rely on a central coordinator for policy training and resource scheduling, they suffer from scalability issues and privacy risks. In contrast, the Distributed Training with Decentralized Execution (DTDE) paradigm enables distributed learning and decision-making, but it struggles with non-stationarity and limited inter-agent cooperation, which can severely degrade system performance. To overcome these challenges, we propose the Multi-Agent Conditional Diffusion Model Planner (MA-CDMP) for decentralized communication resource management. Built upon the Model-Based Reinforcement Learning (MBRL) paradigm, MA-CDMP employs Diffusion Models (DMs) to capture environment dynamics and plan future trajectories, while an inverse dynamics model guides action generation, thereby alleviating the sample inefficiency and slow convergence of conventional DTDE methods. Moreover, to approximate large-scale agent interactions, a Mean-Field (MF) mechanism is introduced as an assistance to the classifier in DMs. This design mitigates inter-agent non-stationarity and enhances cooperation with minimal communication overhead in distributed settings. We further theoretically establish an upper bound on the distributional approximation error introduced by the MF-based diffusion generation, guaranteeing convergence stability and reliable modeling of multi-agent stochastic dynamics. Extensive experiments demonstrate that MA-CDMP consistently outperforms existing MARL baselines in terms of average reward and QoS metrics, showcasing its scalability and practicality for real-world wireless network optimization.
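To make the moving parts above concrete, below is a minimal PyTorch sketch of how classifier-guided diffusion planning with a mean-field conditioning term and inverse-dynamics action extraction can fit together. It is assembled only from the abstract: every name (`eps_model`, `value_classifier`, `inv_dyn`, `plan`), shape, and hyperparameter is a hypothetical stand-in, not the authors' implementation.

```python
# Minimal sketch (PyTorch) of classifier-guided diffusion planning with a
# mean-field conditioning term, assembled from the abstract above.
# All module names, shapes, and hyperparameters are illustrative assumptions,
# NOT the authors' implementation.
import torch

H, D, A, T = 16, 8, 2, 50            # horizon, state dim, action dim, diffusion steps
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bar = torch.cumprod(alphas, dim=0)

# Stand-ins for networks that would be trained offline on logged trajectories.
eps_model = torch.nn.Sequential(             # noise predictor over a flattened trajectory
    torch.nn.Linear(H * D + 1 + D, 128), torch.nn.ReLU(), torch.nn.Linear(128, H * D))
value_classifier = torch.nn.Sequential(      # predicts return/QoS from a trajectory
    torch.nn.Linear(H * D, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1))
inv_dyn = torch.nn.Sequential(               # inverse dynamics: (s_t, s_{t+1}) -> a_t
    torch.nn.Linear(2 * D, 64), torch.nn.ReLU(), torch.nn.Linear(64, A))

@torch.no_grad()
def plan(mean_field: torch.Tensor, guide_scale: float = 1.0) -> torch.Tensor:
    """Sample a state trajectory conditioned on the neighbors' mean-field
    statistic, then recover the first action via inverse dynamics."""
    x = torch.randn(1, H * D)                # start from pure noise
    for t in reversed(range(T)):
        t_in = torch.full((1, 1), float(t) / T)
        # The denoiser is conditioned on the mean-field summary of other agents.
        eps = eps_model(torch.cat([x, t_in, mean_field], dim=-1))
        # Classifier guidance: gradient of predicted return w.r.t. the sample.
        with torch.enable_grad():
            x_g = x.detach().requires_grad_(True)
            grad = torch.autograd.grad(value_classifier(x_g).sum(), x_g)[0]
        # Standard DDPM posterior mean, shifted by the guidance gradient.
        mean = (x - betas[t] / (1 - alpha_bar[t]).sqrt() * eps) / alphas[t].sqrt()
        mean = mean + guide_scale * betas[t] * grad
        x = mean + betas[t].sqrt() * torch.randn_like(x) if t > 0 else mean
    states = x.view(H, D)
    # Inverse dynamics turns the planned states into an executable action.
    return inv_dyn(torch.cat([states[0], states[1]], dim=-1))

action = plan(mean_field=torch.zeros(1, D))  # e.g., averaged neighbor state
print(action.shape)                          # torch.Size([2])
```

In the paper's setting, the `mean_field` argument would presumably carry the averaged neighbor statistic each agent exchanges, which is why the cooperation signal costs only minimal communication overhead.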
Related papers
- Diffusing to Coordinate: Efficient Online Multi-Agent Diffusion Policies [51.24079409973799]
Diffusion-based generative models are well-positioned to meet the needs of online Multi-Agent Reinforcement Learning (MARL). We propose one of the first online off-policy MARL frameworks that uses diffusion policies to orchestrate coordination. Our key innovation is a relaxed policy objective that maximizes scaled joint entropy, facilitating effective exploration without relying on tractable likelihoods.
arXiv Detail & Related papers (2026-02-20T15:38:02Z)
- Multi-Agent Deep Reinforcement Learning Under Constrained Communications [2.7126292487109005]
We present a distributed multi-agent reinforcement learning (MARL) framework that removes the need for centralized critics or global information. We develop a novel Distributed Graph Attention Network (D-GAT) that performs global state inference through multi-hop communication. We also develop the distributed graph-attention MAPPO (DG-MAPPO), a distributed MARL framework in which agents optimize local policies and value functions.
arXiv Detail & Related papers (2026-01-22T21:07:18Z)
- Strategic Coordination for Evolving Multi-agent Systems: A Hierarchical Reinforcement and Collective Learning Approach [0.0]
Reinforcement learning offers a way to model sequential decision-making. Agents select high-level strategies using MARL to group possible plans for action-space reduction, while a low-level collective learning layer ensures efficient and decentralized coordinated decisions.
arXiv Detail & Related papers (2025-09-22T17:58:45Z)
- HypeMARL: Multi-Agent Reinforcement Learning For High-Dimensional, Parametric, and Distributed Systems [3.072554747025686]
HypeMARL is a decentralized reinforcement learning algorithm tailored to the control of high-dimensional, parametric, and distributed systems. We show that HypeMARL can effectively control such systems through the collective behavior of the agents, outperforming state-of-the-art decentralized MARL.
arXiv Detail & Related papers (2025-09-20T14:42:09Z)
- Latent Diffusion Model Based Denoising Receiver for 6G Semantic Communication: From Stochastic Differential Theory to Application [11.385703484113552]
We propose a novel semantic communication framework empowered by generative artificial intelligence (GAI). The latent diffusion model (LDM)-based framework combines a variational autoencoder for semantic feature extraction. The proposed system is training-free, supports zero-shot generalization, and achieves superior performance under low-SNR and out-of-distribution conditions.
arXiv Detail & Related papers (2025-06-06T03:20:32Z)
- Revisiting Multi-Agent World Modeling from a Diffusion-Inspired Perspective [54.77404771454794]
We develop a flexible and robust world model for Multi-Agent Reinforcement Learning (MARL) using diffusion models. Our method, the Diffusion-Inspired Multi-Agent world model (DIMA), achieves state-of-the-art performance across multiple multi-agent control benchmarks.
arXiv Detail & Related papers (2025-05-27T09:11:38Z)
- Heterogeneous Multi-Agent Reinforcement Learning for Distributed Channel Access in WLANs [47.600901884970845]
This paper investigates the use of multi-agent reinforcement learning (MARL) to address distributed channel access in wireless local area networks. In particular, we consider the challenging yet more practical case where agents heterogeneously adopt value-based or policy-based reinforcement learning algorithms to train the model. We propose a heterogeneous MARL training framework, named QPMIX, which adopts a centralized-training-with-distributed-execution paradigm to enable heterogeneous agents to collaborate.
arXiv Detail & Related papers (2024-12-18T13:50:31Z)
- Decentralized Transformers with Centralized Aggregation are Sample-Efficient Multi-Agent World Models [106.35361897941898]
We propose a novel world model for Multi-Agent RL (MARL) that learns decentralized local dynamics for scalability. We also introduce a Perceiver Transformer as an effective solution for centralized representation aggregation. Results on the StarCraft Multi-Agent Challenge (SMAC) show that it outperforms strong model-free approaches and existing model-based methods in both sample efficiency and overall performance.
arXiv Detail & Related papers (2024-06-22T12:40:03Z)
- MADiff: Offline Multi-agent Learning with Diffusion Models [79.18130544233794]
MADiff is a diffusion-based multi-agent learning framework that works as both a decentralized policy and a centralized controller. Our experiments demonstrate that MADiff outperforms baseline algorithms across various multi-agent learning tasks.
arXiv Detail & Related papers (2023-05-27T02:14:09Z)
- Scalable Multi-Agent Model-Based Reinforcement Learning [1.95804735329484]
We propose a new method called MAMBA, which utilizes Model-Based Reinforcement Learning (MBRL) to further leverage centralized training in cooperative environments.
We argue that communication between agents is enough to sustain a world model for each agent during the execution phase, while imaginary rollouts can be used for training, removing the necessity to interact with the environment (see the brief sketch below).
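The following is a compact, hypothetical sketch of the "training in imagination" idea from this summary: a learned world model predicts next states and rewards so the policy can be improved without any environment interaction. All module names and shapes here are illustrative assumptions, not MAMBA's actual architecture.

```python
# Compact, hypothetical sketch of "training in imagination": a learned world
# model predicts next states and rewards so the policy can be improved without
# environment interaction. Names and shapes are illustrative, not MAMBA's.
import torch

D, A, HORIZON = 8, 2, 5                       # state dim, action dim, rollout length

world_model = torch.nn.Sequential(            # (s, a) -> (s', r)
    torch.nn.Linear(D + A, 64), torch.nn.ReLU(), torch.nn.Linear(64, D + 1))
policy = torch.nn.Sequential(
    torch.nn.Linear(D, 64), torch.nn.ReLU(), torch.nn.Linear(64, A), torch.nn.Tanh())
optim = torch.optim.Adam(policy.parameters(), lr=3e-4)

def imagined_return(s0: torch.Tensor) -> torch.Tensor:
    """Roll the policy forward inside the world model, summing predicted rewards."""
    s, ret = s0, torch.zeros(s0.shape[0])
    for _ in range(HORIZON):
        out = world_model(torch.cat([s, policy(s)], dim=-1))
        s, r = out[:, :D], out[:, D]          # predicted next state and reward
        ret = ret + r
    return ret

# One policy-improvement step on purely imaginary rollouts.
start_states = torch.randn(32, D)             # e.g., drawn from a replay buffer
loss = -imagined_return(start_states).mean()
optim.zero_grad()
loss.backward()
optim.step()
```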
arXiv Detail & Related papers (2022-05-25T08:35:00Z)
- Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search.
We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
arXiv Detail & Related papers (2020-03-19T13:10:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.