Related papers: Quantum-Inspired Multi Agent Reinforcement Learning for Exploration Exploitation Optimization in UAV-Assisted 6G Network Deployment

URL: http://arxiv.org/abs/2512.20624v1
Date: Tue, 25 Nov 2025 04:35:43 GMT
Title: Quantum-Inspired Multi Agent Reinforcement Learning for Exploration Exploitation Optimization in UAV-Assisted 6G Network Deployment
Authors: Mazyar Taghavi, Javad Vahidi
Abstract summary: This study introduces a quantum-inspired framework for optimizing the exploration-exploitation tradeoff in multiagent learning, applied to UAV-assisted 6G network deployment. We consider a cooperative scenario where ten intelligent UAVs coordinate autonomously to maximize signal coverage and support efficient network expansion under partial observability and dynamic conditions.
Score: 0.5729426778193399
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This study introduces a quantum-inspired framework for optimizing the exploration-exploitation tradeoff in multiagent reinforcement learning, applied to UAV-assisted 6G network deployment. We consider a cooperative scenario where ten intelligent UAVs autonomously coordinate to maximize signal coverage and support efficient network expansion under partial observability and dynamic conditions. The proposed approach integrates classical MARL algorithms with quantum-inspired optimization techniques, leveraging variational quantum circuits (VQCs) as the core structure and employing the Quantum Approximate Optimization Algorithm (QAOA) as a representative VQC-based method for combinatorial optimization. Complementary probabilistic modeling is incorporated through Bayesian inference, Gaussian processes, and variational inference to capture latent environmental dynamics. A centralized training with decentralized execution (CTDE) paradigm is adopted, where shared memory and local view grids enhance local observability among agents. Comprehensive experiments, including scalability tests, sensitivity analysis, and comparisons with PPO and DDPG baselines, demonstrate that the proposed framework improves sample efficiency, accelerates convergence, and enhances coverage performance while maintaining robustness. Radar chart and convergence analyses further show that QI-MARL achieves a superior balance between exploration and exploitation compared to classical methods. All implementation code and supplementary materials are publicly available on GitHub to ensure reproducibility.

Related papers

QoS-Aware Hierarchical Reinforcement Learning for Joint Link Selection and Trajectory Optimization in SAGIN-Supported UAV Mobility Management [52.15690855486153]
A space-air-ground integrated network (SAGIN) has emerged as an essential architecture for enabling ubiquitous UAV connectivity. This paper formulates UAV mobility management in SAGIN as a constrained multiobjective joint optimization problem.
arXiv Detail & Related papers (2025-12-17T06:22:46Z)
Sum Rate Maximization in STAR-RIS-UAV-Assisted Networks: A CA-DDPG Approach for Joint Optimization [12.38744459760065]
This paper introduces an unmanned aerial vehicle (UAV) to enhance system flexibility and proposes an optimization design for the spectrum efficiency of the STAR-RIS-UAV-assisted wireless communication system. We present a deep reinforcement learning (DRL) algorithm capable of iteratively optimizing beamforming, phase shifts, and UAV positioning to maximize the system's sum rate through continuous interactions with the environment.
arXiv Detail & Related papers (2025-12-01T02:36:00Z)
Heterogeneous Multi-agent Collaboration in UAV-assisted Mobile Crowdsensing Networks [6.226837215382989]
Unmanned aerial vehicle (UAV)-assisted mobile crowdsensing (MCS) has emerged as a promising paradigm for data collection. We tackle challenges such as spectrum scarcity, device computation, and user mobility issues that hinder efficient coordination of sensing, communication, and resource allocation.
arXiv Detail & Related papers (2025-09-28T02:13:19Z)
Pareto Actor-Critic for Communication and Computation Co-Optimization in Non-Cooperative Federated Learning Services [18.291028557265864]
We introduce PAC-MCoFL, a game-theoretic multi-agent reinforcement learning (MARL) framework where SPs act as agents to jointly optimize client assignment, adaptive quantization, and resource allocation. We develop PAC-MCoFL-p, a scalable variant featuring a parameterized conjecture generator that substantially reduces computational complexity with a provably bounded error.
arXiv Detail & Related papers (2025-08-22T02:09:48Z)
A Survey on Inference Optimization Techniques for Mixture of Experts Models [50.40325411764262]
Large-scale Mixture of Experts (MoE) models offer enhanced model capacity and computational efficiency through conditional computation. Deploying and running inference on these models presents significant challenges in computational resources, latency, and energy efficiency. This survey analyzes optimization techniques for MoE models across the entire system stack.
arXiv Detail & Related papers (2024-12-18T14:11:15Z)
Heterogeneous Multi-Agent Reinforcement Learning for Distributed Channel Access in WLANs [47.600901884970845]
This paper investigates the use of multi-agent reinforcement learning (MARL) to address distributed channel access in wireless local area networks. In particular, we consider the challenging yet more practical case where the agents heterogeneously adopt value-based or policy-based reinforcement learning algorithms to train the model. We propose a heterogeneous MARL training framework, named QPMIX, which adopts a centralized training with distributed execution paradigm to enable heterogeneous agents to collaborate.
arXiv Detail & Related papers (2024-12-18T13:50:31Z)
Cluster-Based Multi-Agent Task Scheduling for Space-Air-Ground Integrated Networks [60.085771314013044]
Low-altitude economy holds significant potential for development in areas such as communication and sensing. We propose a Clustering-based Multi-agent Deep Deterministic Policy Gradient (CMADDPG) algorithm to address the multi-UAV cooperative task scheduling challenges in SAGIN.
arXiv Detail & Related papers (2024-12-14T06:17:33Z)
Generative AI for O-RAN Slicing: A Semi-Supervised Approach with VAE and Contrastive Learning [5.1435595246496595]
This paper introduces a novel generative AI (GAI)-driven, unified semi-supervised learning architecture for optimizing resource allocation and network slicing in O-RAN. Termed Generative Semi-Supervised VAE-Contrastive Learning, our approach maximizes the weighted user equipment (UE) throughput and allocates physical resource blocks (PRBs) to enhance the quality of service for eMBB and URLLC services.
arXiv Detail & Related papers (2024-01-16T22:23:27Z)
Pointer Networks with Q-Learning for Combinatorial Optimization [55.2480439325792]
We introduce the Pointer Q-Network (PQN), a hybrid neural architecture that integrates model-free Q-value policy approximation with Pointer Networks (Ptr-Nets). Our empirical results demonstrate the efficacy of this approach, also testing the model in unstable environments.
arXiv Detail & Related papers (2023-11-05T12:03:58Z)
Federated Conditional Stochastic Optimization [110.513884892319]
Conditional optimization has found applications in a wide range of machine learning tasks, such as invariant learning tasks, AUPRC, and AML. This paper proposes algorithms for distributed federated learning.
arXiv Detail & Related papers (2023-10-04T01:47:37Z)
Decentralized MCTS via Learned Teammate Models [89.24858306636816]
We present a trainable online decentralized planning algorithm based on decentralized Monte Carlo Tree Search. We show that deep learning and convolutional neural networks can be employed to produce accurate policy approximators.
arXiv Detail & Related papers (2020-03-19T13:10:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.