Meta Hierarchical Reinforcement Learning for Scalable Resource Management in O-RAN
- URL: http://arxiv.org/abs/2512.13715v1
- Date: Mon, 08 Dec 2025 08:16:27 GMT
- Title: Meta Hierarchical Reinforcement Learning for Scalable Resource Management in O-RAN
- Authors: Fatemeh Lotfi, Fatemeh Afghah
- Abstract summary: This paper proposes an adaptive Meta Hierarchical Reinforcement Learning framework, inspired by Model-Agnostic Meta-Learning (MAML). The framework integrates hierarchical control with meta-learning to enable both global and local adaptation. It achieves up to 40% faster adaptation and consistent fairness, latency, and throughput performance as network scale increases.
- Score: 9.290879387995401
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The increasing complexity of modern applications demands wireless networks capable of real-time adaptability and efficient resource management. The Open Radio Access Network (O-RAN) architecture, with its RAN Intelligent Controller (RIC) modules, has emerged as a pivotal solution for dynamic resource management and network slicing. While artificial intelligence (AI)-driven methods have shown promise, most approaches struggle to maintain performance under unpredictable and highly dynamic conditions. This paper proposes an adaptive Meta Hierarchical Reinforcement Learning (Meta-HRL) framework, inspired by Model-Agnostic Meta-Learning (MAML), to jointly optimize resource allocation and network slicing in O-RAN. The framework integrates hierarchical control with meta-learning to enable both global and local adaptation: the high-level controller allocates resources across slices, while low-level agents perform intra-slice scheduling. The adaptive meta-update mechanism weights tasks by temporal-difference (TD) error variance, improving stability and prioritizing complex network scenarios. Theoretical analysis establishes sublinear convergence and regret guarantees for the two-level learning process. Simulation results demonstrate a 19.8% improvement in network management efficiency compared with baseline RL and meta-RL approaches, along with faster adaptation and higher QoS satisfaction across eMBB, URLLC, and mMTC slices. Additional ablation and scalability studies confirm the method's robustness, achieving up to 40% faster adaptation and consistent fairness, latency, and throughput performance as network scale increases.
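The abstract's core mechanism, a MAML-style meta-update in which each task's contribution is weighted by the variance of its TD errors, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `grad` oracle, the task dictionary layout, and the learning rates are all hypothetical placeholders.

```python
import numpy as np

def meta_update(theta, tasks, inner_lr=0.1, meta_lr=0.01):
    """Variance-weighted MAML-style meta-update (illustrative sketch).

    Each task supplies a gradient oracle `grad(theta)` and a list of
    recent TD errors; tasks with higher TD-error variance (i.e. harder,
    less-stable scenarios) receive larger weight in the meta-step.
    """
    adapted_grads, weights = [], []
    for task in tasks:
        # Inner loop: one gradient step of task-specific adaptation.
        theta_adapted = theta - inner_lr * task["grad"](theta)
        # Outer gradient, evaluated at the adapted parameters.
        adapted_grads.append(task["grad"](theta_adapted))
        # Weight the task by its TD-error variance.
        weights.append(np.var(task["td_errors"]))
    w = np.array(weights, dtype=float)
    w = w / w.sum() if w.sum() > 0 else np.full(len(tasks), 1.0 / len(tasks))
    meta_grad = sum(wi * g for wi, g in zip(w, adapted_grads))
    return theta - meta_lr * meta_grad

# Toy usage with quadratic-loss tasks (gradient of ||theta||^2 is 2*theta):
tasks = [
    {"grad": lambda th: 2 * th, "td_errors": [0.1, 0.3, 0.2]},
    {"grad": lambda th: 2 * th, "td_errors": [1.0, -1.0, 0.5]},
]
theta = meta_update(np.array([1.0, 2.0]), tasks)
```

In a real O-RAN setting the per-task gradients would come from the low-level slice agents' policy losses, and the meta-parameters would initialize the high-level controller; here both are stand-ins.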
Related papers
- Task Specific Sharpness Aware O-RAN Resource Management using Multi Agent Reinforcement Learning [8.26664397566735]
Next-generation networks utilize the Open Radio Access Network (O-RAN) architecture to enable dynamic resource management. Deep reinforcement learning models often struggle with robustness and generalizability in dynamic environments. This paper introduces a novel resource management approach that enhances the Soft Actor Critic (SAC) algorithm with Sharpness-Aware Minimization (SAM) in a distributed Multi-Agent RL (MARL) framework.
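The SAM component named in this entry admits a compact sketch: perturb the parameters toward the locally worst-case direction within a small L2 ball, then descend using the gradient taken at that perturbed point. This is a generic illustration under assumed names (`grad_fn`, `rho`, `lr`), not the paper's SAC-integrated version.

```python
import numpy as np

def sam_step(params, grad_fn, rho=0.05, lr=1e-3):
    """One Sharpness-Aware Minimization step (illustrative sketch).

    `grad_fn(params)` returns the loss gradient. SAM first ascends to
    the sharpest point within an L2 ball of radius `rho`, then applies
    the gradient computed there to the original parameters.
    """
    g = grad_fn(params)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # worst-case perturbation
    g_sharp = grad_fn(params + eps)              # gradient at perturbed point
    return params - lr * g_sharp

# Toy usage on a quadratic loss ||p||^2 (gradient 2*p):
p_next = sam_step(np.array([3.0]), lambda p: 2 * p)
```

Seeking flat minima this way is what the entry credits for improved robustness and generalization of the SAC agents.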
arXiv Detail & Related papers (2025-11-19T00:55:24Z)
- Power Grid Control with Graph-Based Distributed Reinforcement Learning [60.49805771047161]
This work advances a graph-based distributed reinforcement learning framework for real-time, scalable grid management. A Graph Neural Network (GNN) is employed to encode the network's topological information within the single low-level agent's observation. Experiments on the Grid2Op simulation environment show the effectiveness of the approach.
arXiv Detail & Related papers (2025-09-02T22:17:25Z)
- AIM: Adaptive Intra-Network Modulation for Balanced Multimodal Learning [55.56234913868664]
We propose Adaptive Intra-Network Modulation (AIM) to improve balanced modality learning. AIM accounts for differences in optimization state across parameters and depths within the network during modulation. We show that AIM outperforms state-of-the-art imbalanced modality learning methods across multiple benchmarks.
arXiv Detail & Related papers (2025-08-27T10:53:36Z)
- Dynamic Context-oriented Decomposition for Task-aware Low-rank Adaptation with Less Forgetting and Faster Convergence [131.41894248194995]
We propose context-oriented decomposition adaptation (CorDA), a novel method that initializes adapters in a task-aware manner. Thanks to the task awareness, our method enables two optional adaptation modes, knowledge-preserved mode (KPM) and instruction-previewed mode (IPM).
arXiv Detail & Related papers (2025-06-16T07:55:14Z)
- A Local Information Aggregation based Multi-Agent Reinforcement Learning for Robot Swarm Dynamic Task Allocation [4.144893164317513]
We introduce a novel framework using a decentralized partially observable Markov decision process (Dec_POMDP). At the core of our methodology is the Local Information Aggregation Multi-Agent Deep Deterministic Policy Gradient (LIA_MADDPG) algorithm. Our empirical evaluations show that the LIA module can be seamlessly integrated into various CTDE-based MARL methods.
arXiv Detail & Related papers (2024-11-29T07:53:05Z)
- Meta Reinforcement Learning Approach for Adaptive Resource Optimization in O-RAN [6.326120268549892]
Open Radio Access Network (O-RAN) addresses the variable demands of modern networks with unprecedented efficiency and adaptability.
This paper proposes a novel Meta Deep Reinforcement Learning (Meta-DRL) strategy, inspired by Model-Agnostic Meta-Learning (MAML), to advance resource block and downlink power allocation in O-RAN.
arXiv Detail & Related papers (2024-09-30T23:04:30Z)
- Enhancing Spectrum Efficiency in 6G Satellite Networks: A GAIL-Powered Policy Learning via Asynchronous Federated Inverse Reinforcement Learning [67.95280175998792]
A novel generative adversarial imitation learning (GAIL)-powered policy learning approach is proposed for optimizing beamforming, spectrum allocation, and remote user equipment (RUE) association in satellite networks.
We employ inverse RL (IRL) to automatically learn reward functions without manual tuning.
We show that the proposed MA-AL method outperforms traditional RL approaches, achieving a 14.6% improvement in convergence and reward value.
arXiv Detail & Related papers (2024-09-27T13:05:02Z)
- Fast Context Adaptation in Cost-Aware Continual Learning [10.515324071327903]
5G and Beyond networks require more complex learning agents and the learning process itself might end up competing with users for communication and computational resources.
This creates friction: on the one hand, the learning process needs resources to converge quickly to an effective strategy; on the other hand, the learning process needs to be efficient, i.e. take as few resources as possible from the user's data plane, so as not to throttle user traffic.
In this paper, we propose a dynamic strategy to balance the resources assigned to the data plane and those reserved for learning.
arXiv Detail & Related papers (2023-06-06T17:46:48Z)
- Evolutionary Deep Reinforcement Learning for Dynamic Slice Management in O-RAN [11.464582983164991]
The new open radio access network (O-RAN) was developed with distinguishing features such as a flexible design, disaggregated virtual and programmable components, and intelligent closed-loop control.
O-RAN slicing is being investigated as a critical strategy for ensuring network quality of service (QoS) in the face of changing circumstances.
This paper introduces a novel framework able to manage the network slices through provisioned resources intelligently.
arXiv Detail & Related papers (2022-08-30T17:00:53Z) - Adaptive Stochastic ADMM for Decentralized Reinforcement Learning in
Edge Industrial IoT [106.83952081124195]
Reinforcement learning (RL) has been widely investigated and shown to be a promising solution for decision-making and optimal control processes.
We propose an adaptive ADMM (asI-ADMM) algorithm and apply it to decentralized RL with edge-computing-empowered IIoT networks.
Experiment results show that our proposed algorithms outperform the state of the art in terms of communication costs and scalability, and can well adapt to complex IoT environments.
arXiv Detail & Related papers (2021-06-30T16:49:07Z) - Optimization-driven Machine Learning for Intelligent Reflecting Surfaces
Assisted Wireless Networks [82.33619654835348]
Intelligent reflecting surface (IRS) has been employed to reshape the wireless channels by controlling individual scattering elements' phase shifts.
Due to the large number of scattering elements, passive beamforming is typically challenged by high computational complexity.
In this article, we focus on machine learning (ML) approaches to performance optimization in IRS-assisted wireless networks.
arXiv Detail & Related papers (2020-08-29T08:39:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all generated summaries) and is not responsible for any consequences of its use.