Generalization in Reinforcement Learning for Radio Access Networks
- URL: http://arxiv.org/abs/2507.06602v2
- Date: Fri, 18 Jul 2025 13:46:26 GMT
- Title: Generalization in Reinforcement Learning for Radio Access Networks
- Authors: Burak Demirel, Yu Wang, Cristian Tatino, Pablo Soldati
- Abstract summary: We propose a generalization-centered RL framework for RAN control. It robustly reconstructs dynamically varying states from partial and noisy observations. It encodes static and semi-static information, such as radio nodes, cell attributes, and their topology, through graph representations.
- Score: 2.9822261313236513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern RANs operate in highly dynamic and heterogeneous environments, where hand-tuned, rule-based RRM algorithms often underperform. While RL can surpass such heuristics in constrained settings, the diversity of deployments and unpredictable radio conditions introduce major generalization challenges. Data-driven policies frequently overfit to training conditions, degrading performance in unseen scenarios. To address this, we propose a generalization-centered RL framework for RAN control that: (i) robustly reconstructs dynamically varying states from partial and noisy observations, while encoding static and semi-static information, such as radio nodes, cell attributes, and their topology, through graph representations; (ii) applies domain randomization to broaden the training distribution; and (iii) distributes data generation across multiple actors while centralizing training in a cloud-compatible architecture aligned with O-RAN principles. Although generalization increases computational and data-management complexity, our distributed design mitigates this by scaling data collection and training across diverse network conditions. Applied to downlink link adaptation in five 5G benchmarks, our policy improves average throughput and spectral efficiency by ~10% over an OLLA baseline (10% BLER target) in full-buffer MIMO/mMIMO and by >20% under high mobility. It matches specialized RL in full-buffer traffic and achieves up to 4- and 2-fold gains in eMBB and mixed-traffic benchmarks, respectively. In nine-cell deployments, GAT models offer 30% higher throughput than MLP baselines. These results, combined with our scalable architecture, offer a path toward AI-native 6G RAN using a single, generalizable RL agent.
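The OLLA (outer-loop link adaptation) baseline the abstract benchmarks against is the standard ACK/NACK-driven rule that nudges an SINR offset so the long-run BLER settles at the target. A minimal sketch, with illustrative class and variable names (not code from the paper):

```python
class OllaController:
    """Outer-loop link adaptation: adapts an SINR offset so the long-run
    block error rate (BLER) converges to a fixed target (here 10%)."""

    def __init__(self, target_bler: float = 0.10, delta_down_db: float = 0.5):
        # Steady state requires target * delta_down == (1 - target) * delta_up,
        # so ACK steps are small and the rarer NACK steps are large.
        self.delta_down_db = delta_down_db
        self.delta_up_db = delta_down_db * target_bler / (1.0 - target_bler)
        self.offset_db = 0.0

    def update(self, ack: bool) -> float:
        """Call once per first HARQ transmission with the decode outcome."""
        if ack:
            self.offset_db += self.delta_up_db    # decoded: push MCS up slightly
        else:
            self.offset_db -= self.delta_down_db  # decoding failed: back off
        return self.offset_db

    def effective_sinr_db(self, reported_sinr_db: float) -> float:
        # The MCS lookup table is indexed by the adjusted SINR.
        return reported_sinr_db + self.offset_db
```

At a 10% BLER target the up-step is one ninth of the down-step, so the offset reacts slowly to abrupt channel changes; this sluggishness is consistent with the larger gains the paper reports over OLLA under high mobility.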
Related papers
- GRAM: Generalization in Deep RL with a Robust Adaptation Module [62.662894174616895]
In this work, we present a framework for dynamics generalization in deep reinforcement learning. We introduce a robust adaptation module that provides a mechanism for identifying and reacting to both in-distribution and out-of-distribution environment dynamics. Our algorithm GRAM achieves strong generalization performance across in-distribution and out-of-distribution scenarios upon deployment.
arXiv Detail & Related papers (2024-12-05T16:39:01Z)
- FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression [55.992528247880685]
Decentralized training faces significant challenges regarding system design and efficiency.
We present FusionLLM, a decentralized training system designed and implemented for training large deep neural networks (DNNs).
We show that our system and method can achieve a 1.45x to 9.39x speedup over baseline methods while ensuring convergence.
arXiv Detail & Related papers (2024-10-16T16:13:19Z)
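The compression idea behind such systems can be illustrated with a generic top-k gradient sparsifier; this is only a sketch of the concept, not FusionLLM's actual adaptive scheme:

```python
import torch

def topk_sparsify(grad: torch.Tensor, keep_ratio: float = 0.01) -> torch.Tensor:
    """Keep only the largest-magnitude entries of a gradient tensor before
    sending it over a slow inter-site link; the rest are zeroed out."""
    flat = grad.flatten()
    k = max(1, int(keep_ratio * flat.numel()))
    _, idx = flat.abs().topk(k)          # indices of the dominant entries
    sparse = torch.zeros_like(flat)
    sparse[idx] = flat[idx]              # original signed values survive
    return sparse.view_as(grad)
```

In practice such schemes also track the dropped residual locally and fold it into the next step so the compression error does not accumulate.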
- Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining [49.730897226510095]
We introduce JOWA: Jointly-Optimized World-Action model, an offline model-based RL agent pretrained on Atari games with 6 billion tokens of data. Our largest agent, with 150 million parameters, achieves 78.9% human-level performance on pretrained games using only 10% subsampled offline data, outperforming existing state-of-the-art large-scale offline RL baselines by 31.6% on average.
arXiv Detail & Related papers (2024-10-01T10:25:03Z)
- Diffusion-based Episodes Augmentation for Offline Multi-Agent Reinforcement Learning [24.501511979962746]
Offline multi-agent reinforcement learning (MARL) is increasingly recognized as crucial for effectively deploying RL algorithms in environments where real-time interaction is impractical, risky, or costly.
We present EAQ, Episodes Augmentation guided by Q-total loss, a novel approach for offline MARL frameworks that utilizes diffusion models.
arXiv Detail & Related papers (2024-08-23T14:17:17Z)
- Safe and Accelerated Deep Reinforcement Learning-based O-RAN Slicing: A Hybrid Transfer Learning Approach [20.344810727033327]
We propose and design a hybrid transfer learning (TL)-aided approach to provide safe and accelerated convergence in DRL-based O-RAN slicing.
The proposed hybrid approach yields improvements of at least 7.7% in the average initial reward value and 20.7% in the percentage of converged scenarios.
arXiv Detail & Related papers (2023-09-13T18:58:34Z)
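One common ingredient of TL-aided DRL is a policy-reuse warm start: initialize the learner from a policy pretrained on a related slicing scenario rather than from scratch. Whether this matches the paper's hybrid design is an assumption; a minimal sketch:

```python
import copy
import torch.nn as nn

def warm_start(learner: nn.Module, pretrained: nn.Module) -> nn.Module:
    """Copy a pretrained policy's weights into the learner before DRL
    fine-tuning, which typically raises the initial reward and speeds
    convergence relative to random initialization."""
    learner.load_state_dict(copy.deepcopy(pretrained.state_dict()))
    return learner
```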
- How Does Forecasting Affect the Convergence of DRL Techniques in O-RAN Slicing? [20.344810727033327]
We propose a novel forecasting-aided DRL approach, together with a practical O-RAN deployment workflow, to enhance DRL convergence.
Our approach shows up to 22.8%, 86.3%, and 300% improvements in the average initial reward value, convergence rate, and number of converged scenarios, respectively.
arXiv Detail & Related papers (2023-09-01T14:30:04Z)
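The forecasting-aided idea amounts to letting the agent condition on predicted, not just observed, KPIs. A minimal sketch of the state augmentation; the forecaster interface here is an assumption, not the paper's API:

```python
import numpy as np

def forecast_augmented_state(kpi_window: np.ndarray, forecaster) -> np.ndarray:
    """Concatenate the recent KPI history with the forecaster's prediction of
    the next horizon, so the DRL agent can act on anticipated traffic."""
    predicted = forecaster.predict(kpi_window)   # assumed shape: (horizon, n_kpis)
    return np.concatenate([kpi_window.ravel(), predicted.ravel()])
```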
- Inter-Cell Network Slicing With Transfer Learning Empowered Multi-Agent Deep Reinforcement Learning [6.523367518762879]
Network slicing enables operators to efficiently support diverse applications on a common physical infrastructure.
The ever-increasing densification of network deployment leads to complex and non-trivial inter-cell interference.
We develop a DIRP algorithm with multiple deep reinforcement learning (DRL) agents that cooperatively optimize resource partitioning in individual cells.
arXiv Detail & Related papers (2023-06-20T14:14:59Z)
- Programmable and Customized Intelligence for Traffic Steering in 5G Networks Using Open RAN Architectures [16.48682480842328]
5G and beyond mobile networks will support heterogeneous use cases at an unprecedented scale.
Such fine-grained control of the Radio Access Network (RAN) is not possible with the current cellular architecture.
We propose an open architecture with abstractions that enable closed-loop control and provide data-driven, intelligent optimization of the RAN at the user level.
arXiv Detail & Related papers (2022-09-28T15:31:06Z)
- Behavioral Priors and Dynamics Models: Improving Performance and Domain Transfer in Offline RL [82.93243616342275]
We introduce Offline Model-based RL with Adaptive Behavioral Priors (MABE).
MABE is based on the finding that dynamics models, which support within-domain generalization, and behavioral priors, which support cross-domain generalization, are complementary.
In experiments that require cross-domain generalization, we find that MABE outperforms prior methods.
arXiv Detail & Related papers (2021-06-16T20:48:49Z)
- Reinforcement Learning for Datacenter Congestion Control [50.225885814524304]
Successful congestion control algorithms can dramatically improve latency and overall network throughput.
To date, no such learning-based algorithms have shown practical potential in this domain.
We devise an RL-based algorithm with the aim of generalizing to different configurations of real-world datacenter networks.
We show that this scheme outperforms alternative popular RL approaches, and generalizes to scenarios that were not seen during training.
arXiv Detail & Related papers (2021-02-18T13:49:28Z)
- Improving Generalization in Reinforcement Learning with Mixture Regularization [113.12412071717078]
We introduce a simple approach, named mixreg, which trains agents on a mixture of observations from different training environments.
Mixreg increases the data diversity more effectively and helps learn smoother policies.
Results show mixreg outperforms the well-established baselines on unseen testing environments by a large margin.
arXiv Detail & Related papers (2020-10-21T08:12:03Z)
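Mixreg's core operation is a mixup-style convex combination of observations collected in different training environments, with the supervision signals interpolated by the same coefficient. A minimal sketch consistent with that description; names are illustrative:

```python
import numpy as np

def mixreg(obs_a, obs_b, target_a, target_b, alpha=0.2, rng=None):
    """Blend two observation batches (and their targets, e.g. returns or
    value estimates) with lambda ~ Beta(alpha, alpha), drawn per sample."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha, size=obs_a.shape[0])
    lam_obs = lam.reshape((-1,) + (1,) * (obs_a.ndim - 1))  # broadcast over pixels
    mixed_obs = lam_obs * obs_a + (1.0 - lam_obs) * obs_b
    mixed_target = lam * target_a + (1.0 - lam) * target_b
    return mixed_obs, mixed_target
```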
- Dynamics Generalization via Information Bottleneck in Deep Reinforcement Learning [90.93035276307239]
We propose an information theoretic regularization objective and an annealing-based optimization method to achieve better generalization ability in RL agents.
We demonstrate the extreme generalization benefits of our approach in different domains ranging from maze navigation to robotic tasks.
This work provides a principled way to improve generalization in RL by gradually removing information that is redundant for task-solving.
arXiv Detail & Related papers (2020-08-03T02:24:20Z)
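The regularizer can be pictured as a stochastic state encoder penalized by an annealed KL term toward a unit Gaussian. The sketch below is a generic variational information bottleneck; the annealing schedule is an assumption about the paper's method, not taken from it:

```python
import torch
import torch.nn as nn

class BottleneckEncoder(nn.Module):
    """Maps a state to a Gaussian latent; the KL penalty toward N(0, I)
    gradually squeezes out information irrelevant to the task."""

    def __init__(self, state_dim: int, latent_dim: int):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, latent_dim)
        self.log_std = nn.Linear(128, latent_dim)

    def forward(self, state: torch.Tensor):
        h = self.backbone(state)
        mu, log_std = self.mu(h), self.log_std(h).clamp(-5.0, 2.0)
        z = mu + log_std.exp() * torch.randn_like(mu)  # reparameterization trick
        # Closed-form KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dims.
        kl = 0.5 * (mu.pow(2) + (2.0 * log_std).exp() - 2.0 * log_std - 1.0).sum(-1)
        return z, kl

def annealed_beta(step: int, warmup: int = 50_000, beta_max: float = 1e-3) -> float:
    # Ramp the bottleneck weight up over training, mirroring the idea of
    # gradually removing information that is redundant for task-solving.
    return beta_max * min(1.0, step / warmup)

# Training objective (sketch): loss = rl_loss + annealed_beta(step) * kl.mean()
```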
This list is automatically generated from the titles and abstracts of the papers on this site.