Energy Efficient Sleep Mode Optimization in 5G mmWave Networks via Multi Agent Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2511.22105v1
- Date: Thu, 27 Nov 2025 04:49:36 GMT
- Title: Energy Efficient Sleep Mode Optimization in 5G mmWave Networks via Multi Agent Deep Reinforcement Learning
- Authors: Saad Masrur, Ismail Guvenc, David Lopez Perez
- Abstract summary: This paper proposes a multi-agent deep reinforcement learning framework using a Double Deep Q-Network (DDQN) for adaptive SMO in a 3D urban environment. A realistic BS power consumption model and beamforming are integrated to accurately quantify EE, while QoS is defined in terms of throughput. Simulations show that MARL-DDQN outperforms state-of-the-art strategies, including All On, iterative QoS-aware load-based (IT-QoS-LB), MARL-DDPG, and MARL-PPO.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dynamic sleep mode optimization (SMO) in millimeter-wave (mmWave) networks is essential for maximizing energy efficiency (EE) under stringent quality-of-service (QoS) constraints. However, existing optimization and reinforcement learning (RL) approaches rely on aggregated, static base station (BS) traffic models that fail to capture non-stationary traffic dynamics and suffer from large state-action spaces, limiting real-world deployment. To address these challenges, this paper proposes a multi-agent deep reinforcement learning (MARL) framework using a Double Deep Q-Network (DDQN), referred to as MARL-DDQN, for adaptive SMO in a 3D urban environment with a time-varying and community-based user equipment (UE) mobility model. Unlike conventional single-agent RL, MARL-DDQN enables scalable, distributed decision-making with minimal signaling overhead. A realistic BS power consumption model and beamforming are integrated to accurately quantify EE, while QoS is defined in terms of throughput. The method adapts SMO policies to maximize EE while mitigating inter-cell interference and ensuring throughput fairness. Simulations show that MARL-DDQN outperforms state-of-the-art strategies, including All On, iterative QoS-aware load-based (IT-QoS-LB), MARL-DDPG, and MARL-PPO, achieving up to 0.60 Mbit/Joule EE, 8.5 Mbps 10th-percentile throughput, and meeting QoS constraints 95% of the time under dynamic scenarios.
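As a rough illustration of the Double DQN idea named in the abstract (not the authors' implementation), the sketch below computes bootstrapped targets for a batch of transitions: the online network selects the next action and the target network evaluates it. The function name `ddqn_targets`, the array layout, and the default discount factor are assumptions made for this example only.

```python
import numpy as np


def ddqn_targets(rewards, next_q_online, next_q_target, dones, gamma=0.99):
    """Double DQN targets for a batch of transitions (illustrative sketch).

    rewards:        shape (B,)   reward per transition
    next_q_online:  shape (B, A) online-network Q-values at the next state
    next_q_target:  shape (B, A) target-network Q-values at the next state
    dones:          shape (B,)   1 if the episode ended, else 0
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)
    next_q_online = np.asarray(next_q_online, dtype=float)
    next_q_target = np.asarray(next_q_target, dtype=float)

    # Action selection uses the online network ...
    best_actions = np.argmax(next_q_online, axis=1)
    # ... while action evaluation uses the target network.
    chosen_q = next_q_target[np.arange(len(best_actions)), best_actions]

    # Terminal transitions do not bootstrap from the next state.
    return rewards + gamma * (1.0 - dones) * chosen_q
```

In a multi-agent setting such as the one the abstract describes, each agent would presumably compute targets like these from its own observations; how the paper defines states, actions, and rewards is not specified here.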
Related papers
- Meta Hierarchical Reinforcement Learning for Scalable Resource Management in O-RAN [9.290879387995401]
This paper proposes an adaptive Meta Hierarchical Reinforcement Learning framework, inspired by Model Agnostic Meta Learning (MAML). The framework integrates hierarchical control with meta learning to enable both global and local adaptation. It achieves up to 40% faster adaptation and consistent fairness, latency, and throughput performance as network scale increases.
arXiv Detail & Related papers (2025-12-08T08:16:27Z) - PASS-Enhanced MEC: Joint Optimization of Task Offloading and Uplink PASS Beamforming [67.78883135636657]
A pinching-antenna system (PASS)-enhanced mobile edge computing (MEC) architecture is investigated. PASS establishes short-distance line-of-sight (LoS) links while effectively mitigating the significant path loss and potential signal blockage. We formulate a network latency minimization problem to jointly optimize uplink PASS beamforming and task offloading.
arXiv Detail & Related papers (2025-10-27T03:04:46Z) - Federated Multi-Agent Reinforcement Learning for Privacy-Preserving and Energy-Aware Resource Management in 6G Edge Networks [0.0]
As sixth-generation (6G) networks move toward ultra-dense, intelligent edge environments, resource management under stringent privacy, mobility, and energy constraints becomes critical. This paper introduces a novel Federated Multi-Agent Reinforcement Learning framework.
arXiv Detail & Related papers (2025-09-12T11:41:40Z) - DRL Optimization Trajectory Generation via Wireless Network Intent-Guided Diffusion Models for Optimizing Resource Allocation [58.62766376631344]
We propose a customized wireless network intent (WNI-G) model to address different state variations of wireless communication networks.
Extensive simulations show greater stability in spectral efficiency than traditional DRL models and their variants in dynamic communication systems.
arXiv Detail & Related papers (2024-10-18T14:04:38Z) - Hyperdimensional Computing Empowered Federated Foundation Model over Wireless Networks for Metaverse [56.384390765357004]
We propose an integrated federated split learning and hyperdimensional computing framework for emerging foundation models.
This novel approach reduces communication costs, computation load, and privacy risks, making it suitable for resource-constrained edge devices in the Metaverse.
arXiv Detail & Related papers (2024-08-26T17:03:14Z) - Energy-Efficient Sleep Mode Optimization of 5G mmWave Networks Using Deep Contextual MAB [0.0]
An effective strategy to reduce energy consumption in mobile networks is the sleep mode optimization (SMO) of base stations (BSs).
In this paper, we propose a novel SMO approach for mmWave BSs in a 3D urban environment.
Our proposed method outperforms all other SM strategies in terms of the 10th percentile of user rate and average throughput.
arXiv Detail & Related papers (2024-05-15T17:37:28Z) - Adaptive Resource Allocation for Virtualized Base Stations in O-RAN with Online Learning [55.08287089554127]
Open Radio Access Network systems, with their virtualized base stations (vBSs), offer operators the benefits of increased flexibility, reduced costs, vendor diversity, and interoperability. We propose an online learning algorithm that balances the effective throughput and vBS energy consumption, even under unforeseeable and "challenging" environments. We prove the proposed solutions achieve sub-linear regret, providing zero average optimality gap even in challenging environments.
arXiv Detail & Related papers (2023-09-04T17:30:21Z) - Deep Reinforcement Learning in mmW-NOMA: Joint Power Allocation and
Hybrid Beamforming [0.0]
High data-rate demand can be met by the Non-Orthogonal Multiple Access (NOMA) approach in the millimetre-wave (mmW) frequency band.
Joint power allocation and hybrid beamforming of mmW-NOMA systems is addressed via recent advances in machine learning and control theory approaches.
arXiv Detail & Related papers (2022-05-13T07:55:48Z) - Collaborative Intelligent Reflecting Surface Networks with Multi-Agent
Reinforcement Learning [63.83425382922157]
Intelligent reflecting surfaces (IRSs) are envisioned to be widely applied in future wireless networks.
In this paper, we investigate a multi-user communication system assisted by cooperative IRS devices with the capability of energy harvesting.
arXiv Detail & Related papers (2022-03-26T20:37:14Z) - Modular Deep Reinforcement Learning for Continuous Motion Planning with
Temporal Logic [59.94347858883343]
This paper investigates the motion planning of autonomous dynamical systems modeled by Markov decision processes (MDP).
The novelty is to design an embedded product MDP (EP-MDP) between the LDGBA and the MDP.
The proposed LDGBA-based reward shaping and discounting schemes for the model-free reinforcement learning (RL) only depend on the EP-MDP states.
arXiv Detail & Related papers (2021-02-24T01:11:25Z)