Offline Reinforcement Learning and Sequence Modeling for Downlink Link Adaptation
- URL: http://arxiv.org/abs/2410.23031v1
- Date: Wed, 30 Oct 2024 14:01:31 GMT
- Title: Offline Reinforcement Learning and Sequence Modeling for Downlink Link Adaptation
- Authors: Samuele Peri, Alessio Russo, Gabor Fodor, Pablo Soldati
- Abstract summary: Link adaptation (LA) is a challenging task in the presence of mobility, fast fading, and imperfect channel quality information.
We propose three LA designs based on batch-constrained deep Q-learning, conservative Q-learning, and decision transformers.
Offline RL algorithms can achieve the performance of state-of-the-art online RL methods when the data is collected with a proper behavioral policy.
- Score: 3.687363450234871
- Abstract: Contemporary radio access networks employ link adaptation (LA) algorithms to optimize the modulation and coding schemes to adapt to the prevailing propagation conditions and are near-optimal in terms of the achieved spectral efficiency. LA is a challenging task in the presence of mobility, fast fading, imperfect channel quality information, and limited knowledge of the receiver characteristics at the transmitter, which render model-based LA algorithms complex and suboptimal. Model-based LA is especially difficult as connected user equipment devices become increasingly heterogeneous in terms of receiver capabilities, antenna configurations, and hardware characteristics. Recognizing these difficulties, previous works have proposed reinforcement learning (RL) for LA, which faces deployment difficulties due to its potential negative impact on live network performance. To address this challenge, this paper considers offline RL to learn LA policies from data acquired in live networks with minimal or no intrusive effects on network operation. We propose three LA designs based on batch-constrained deep Q-learning, conservative Q-learning, and decision transformers, showing that offline RL algorithms can achieve the performance of state-of-the-art online RL methods when data is collected with a proper behavioral policy.
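To make the offline RL formulation concrete, the sketch below illustrates one of the three proposed designs, a conservative Q-learning (CQL) update for discrete MCS selection. The state features, network sizes, and hyperparameters are illustrative assumptions for exposition, not the configuration used by the authors.

```python
# Hedged sketch: conservative Q-learning (CQL) for discrete MCS selection.
# Feature set, dimensions, and hyperparameters are assumptions, not the paper's setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM = 6       # e.g. CQI, SINR estimate, HARQ feedback history (assumed features)
NUM_MCS = 28        # number of modulation-and-coding-scheme indices (assumed)
GAMMA, ALPHA = 0.95, 1.0  # discount factor and conservatism weight (assumed)

q_net = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(), nn.Linear(128, NUM_MCS))
target_q = nn.Sequential(nn.Linear(STATE_DIM, 128), nn.ReLU(), nn.Linear(128, NUM_MCS))
target_q.load_state_dict(q_net.state_dict())
opt = torch.optim.Adam(q_net.parameters(), lr=3e-4)

def cql_update(s, a, r, s_next, done):
    """One CQL step on a logged batch (s, a, r, s_next, done) from network data.

    s, s_next: float tensors (B, STATE_DIM); a: long tensor (B,) of MCS indices;
    r, done: float tensors (B,).
    """
    q_all = q_net(s)                                   # Q(s, .) over all MCS indices
    q_sa = q_all.gather(1, a.unsqueeze(1)).squeeze(1)  # Q(s, a) for the logged action
    with torch.no_grad():
        target = r + GAMMA * (1.0 - done) * target_q(s_next).max(dim=1).values
    bellman_loss = F.mse_loss(q_sa, target)
    # Conservative term: push down Q-values of actions rarely seen in the dataset,
    # push up the Q-value of the action actually taken.
    conservative = (torch.logsumexp(q_all, dim=1) - q_sa).mean()
    loss = bellman_loss + ALPHA * conservative
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

The conservative term penalizes Q-values of actions that are poorly covered by the logged data, which is what lets the policy be trained purely from network logs without online exploration.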
Related papers
- Resilient UAV Trajectory Planning via Few-Shot Meta-Offline Reinforcement Learning [5.771885923067511]
This work proposes a novel, resilient, few-shot meta-offline RL algorithm combining offline RL and model-agnostic meta-learning.
We show that the proposed few-shot meta-offline RL algorithm converges faster than baseline schemes.
It is the only algorithm that can achieve optimal joint AoI and transmission power using an offline dataset.
arXiv Detail & Related papers (2025-02-03T11:39:12Z) - Continual Task Learning through Adaptive Policy Self-Composition [54.95680427960524]
CompoFormer is a structure-based continual transformer model that adaptively composes previous policies via a meta-policy network.
Our experiments reveal that CompoFormer outperforms conventional continual learning (CL) methods, particularly in longer task sequences.
arXiv Detail & Related papers (2024-11-18T08:20:21Z) - Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning [62.984693936073974]
Value-based reinforcement learning can learn effective policies for a wide range of multi-turn problems.
Current value-based RL methods have proven particularly challenging to scale to the setting of large language models.
We propose a novel offline RL algorithm that addresses these drawbacks, casting Q-learning as a modified supervised fine-tuning problem.
arXiv Detail & Related papers (2024-11-07T21:36:52Z) - DRL Optimization Trajectory Generation via Wireless Network Intent-Guided Diffusion Models for Optimizing Resource Allocation [58.62766376631344]
We propose a customized wireless network intent (WNI-G) model to address different state variations of wireless communication networks.
Extensive simulations show greater stability in spectral efficiency than traditional DRL models in dynamic communication systems.
arXiv Detail & Related papers (2024-10-18T14:04:38Z) - Closed-form congestion control via deep symbolic regression [1.5961908901525192]
Reinforcement Learning (RL) algorithms can handle challenges in ultra-low-latency and high throughput scenarios.
The adoption of neural network models in real deployments still poses some challenges regarding real-time inference and interpretability.
This paper proposes a methodology to deal with such challenges while maintaining the performance and generalization capabilities.
arXiv Detail & Related papers (2024-03-28T14:31:37Z) - Advancing RAN Slicing with Offline Reinforcement Learning [15.259182716723496]
This paper introduces offline Reinforcement Learning to solve the RAN slicing problem.
We show how offline RL can effectively learn near-optimal policies from sub-optimal datasets.
We also present empirical evidence of the efficacy of offline RL in adapting to various service-level requirements.
arXiv Detail & Related papers (2023-12-16T22:09:50Z) - ENOTO: Improving Offline-to-Online Reinforcement Learning with Q-Ensembles [52.34951901588738]
We propose a novel framework called ENsemble-based Offline-To-Online (ENOTO) RL.
By increasing the number of Q-networks, we seamlessly bridge offline pre-training and online fine-tuning without degrading performance (a generic Q-ensemble target is sketched after this list).
Experimental results demonstrate that ENOTO can substantially improve the training stability, learning efficiency, and final performance of existing offline RL methods.
arXiv Detail & Related papers (2023-06-12T05:10:10Z) - Learning to Control Autonomous Fleets from Observation via Offline Reinforcement Learning [3.9121134770873733]
We propose to formalize the control of Autonomous Mobility-on-Demand systems through the lens of offline reinforcement learning.
We show that offline RL is a promising paradigm for the application of RL-based solutions within economically-critical systems.
arXiv Detail & Related papers (2023-02-28T18:31:07Z) - A Unified Framework for Alternating Offline Model Training and Policy Learning [62.19209005400561]
In offline model-based reinforcement learning, we learn a dynamics model from historically collected data and use the learned model together with the fixed dataset for policy learning.
We develop an iterative offline MBRL framework, where we maximize a lower bound of the true expected return.
With the proposed unified model-policy learning framework, we achieve competitive performance on a wide range of continuous-control offline reinforcement learning datasets.
arXiv Detail & Related papers (2022-10-12T04:58:51Z) - MOORe: Model-based Offline-to-Online Reinforcement Learning [26.10368749930102]
We propose a model-based Offline-to-Online Reinforcement learning (MOORe) algorithm.
Experiment results show that our algorithm smoothly transfers from offline to online stages while enabling sample-efficient online adaptation.
arXiv Detail & Related papers (2022-01-25T03:14:57Z) - Behavioral Priors and Dynamics Models: Improving Performance and Domain Transfer in Offline RL [82.93243616342275]
We introduce Offline Model-based RL with Adaptive Behavioral Priors (MABE).
MABE is based on the finding that dynamics models, which support within-domain generalization, and behavioral priors, which support cross-domain generalization, are complementary.
In experiments that require cross-domain generalization, we find that MABE outperforms prior methods.
arXiv Detail & Related papers (2021-06-16T20:48:49Z)
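The ENOTO entry above relies on an ensemble of Q-networks to bridge offline pre-training and online fine-tuning. A minimal, generic sketch of such an ensemble bootstrap target is given below; the ensemble size, the use of the ensemble minimum for offline pessimism, and all names are assumptions for illustration rather than ENOTO's exact design.

```python
# Minimal sketch of a Q-ensemble bootstrap target; pessimism via the ensemble minimum
# during offline training is an assumption for illustration.
import torch
import torch.nn as nn

NUM_CRITICS, STATE_DIM, ACTION_DIM = 5, 8, 2   # assumed sizes
critics = nn.ModuleList(
    nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(), nn.Linear(64, 1))
    for _ in range(NUM_CRITICS)
)

def ensemble_target(s_next, a_next, r, done, gamma=0.99, pessimistic=True):
    """Bootstrap target from the critic ensemble.

    Offline phase: take the minimum over critics (conservative).
    Online fine-tuning: switch to the ensemble mean (less conservative).
    """
    with torch.no_grad():
        x = torch.cat([s_next, a_next], dim=-1)
        q_values = torch.stack([c(x).squeeze(-1) for c in critics])  # (NUM_CRITICS, B)
        q_next = q_values.min(dim=0).values if pessimistic else q_values.mean(dim=0)
        return r + gamma * (1.0 - done) * q_next
```

Switching `pessimistic` from True to False is a crude stand-in for relaxing conservatism when moving from offline pre-training to online fine-tuning.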