Learning RL-Policies for Joint Beamforming Without Exploration: A Batch
Constrained Off-Policy Approach
- URL: http://arxiv.org/abs/2310.08660v2
- Date: Sat, 11 Nov 2023 14:32:12 GMT
- Title: Learning RL-Policies for Joint Beamforming Without Exploration: A Batch
Constrained Off-Policy Approach
- Authors: Heasung Kim and Sravan Kumar Ankireddy
- Abstract summary: We consider the problem of network parameter optimization for rate maximization.
We show that discrete batch-constrained deep Q-learning (BCQ) can achieve performance similar to DQN with only a fraction of the data and without exploration.
- Score: 1.0080317855851213
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we consider the problem of network parameter optimization for
rate maximization. We frame this as a joint optimization problem of power
control, beam forming, and interference cancellation. We consider the setting
where multiple Base Stations (BSs) communicate with multiple user equipment
(UEs). Because of the exponential computational complexity of brute force
search, we instead solve this nonconvex optimization problem using deep
reinforcement learning (RL) techniques. Modern communication systems are
notorious for their difficulty in exactly modeling their behavior. This limits
us in using RL-based algorithms as interaction with the environment is needed
for the agent to explore and learn efficiently. Further, it is ill-advised to
deploy the algorithm in the real world for exploration and learning because of
the high cost of failure. In contrast to the previous RL-based solutions
proposed, such as deep-Q network (DQN) based control, we suggest an offline
model-based approach. We specifically consider discrete batch-constrained deep
Q-learning (BCQ) and show that performance similar to DQN can be achieved with
only a fraction of the data without exploring. This maximizes sample efficiency
and minimizes risk in deploying a new algorithm to commercial networks. We
provide the entire project resource, including code and data, at the following
link: https://github.com/Heasung-Kim/safe-rl-deployment-for-5g.
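To illustrate the batch-constrained idea the abstract refers to, here is a minimal sketch of the action-selection rule usually associated with discrete BCQ: the agent takes the highest-valued action, but only among actions that the estimated behavior policy considers sufficiently likely relative to its most likely action. This sketch is not taken from the authors' repository. The function name `bcq_select_action`, its arguments, and the default threshold of 0.3 are assumptions chosen for illustration.

```python
def bcq_select_action(q_values, behavior_probs, threshold=0.3):
    """Pick an action with a discrete-BCQ-style constrained argmax.

    q_values: estimated Q(s, a) for each discrete action (hypothetical input).
    behavior_probs: estimated probability of each action under the policy
        that collected the batch (hypothetical input).
    threshold: relative-likelihood cutoff; an assumed illustrative value.
    """
    if len(q_values) != len(behavior_probs):
        raise ValueError("q_values and behavior_probs must have equal length")

    max_prob = max(behavior_probs)
    if max_prob <= 0:
        raise ValueError("behavior_probs must contain a positive entry")

    # Keep only actions whose likelihood, relative to the most likely
    # action, exceeds the threshold. The most likely action has ratio 1,
    # so the allowed set is non-empty whenever threshold < 1.
    allowed = [
        action
        for action, prob in enumerate(behavior_probs)
        if prob / max_prob > threshold
    ]

    # Greedy choice restricted to the allowed actions.
    return max(allowed, key=lambda action: q_values[action])


if __name__ == "__main__":
    q = [1.0, 5.0, 2.0]
    probs = [0.6, 0.05, 0.35]
    # Action 1 has the highest Q-value but is rarely seen in the batch,
    # so it is excluded when the threshold is 0.3.
    print(bcq_select_action(q, probs, threshold=0.3))
    # With threshold 0, every action with positive probability is allowed.
    print(bcq_select_action(q, probs, threshold=0.0))
```

With a threshold of 0 the rule reduces to the ordinary greedy choice over all actions seen with positive probability. As the threshold approaches 1 it approaches imitation of the most likely batch action.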
Related papers
- Edge Intelligence Optimization for Large Language Model Inference with Batching and Quantization [20.631476379056892]
Large Language Models (LLMs) are at the forefront of this movement.
LLMs require cloud hosting, which raises issues regarding privacy, latency, and usage limitations.
We present an edge intelligence optimization problem tailored for LLM inference.
arXiv Detail & Related papers (2024-05-12T02:38:58Z)
- Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
arXiv Detail & Related papers (2023-10-18T06:07:10Z)
- Multi Agent DeepRL based Joint Power and Subchannel Allocation in IAB networks [0.0]
Integrated Access and Backhauling (IAB) is a viable approach for meeting the unprecedented need for higher data rates of future generations.
In this paper, we show how we can use Deep Q-Learning Network to handle problems with huge action spaces associated with fractional nodes.
arXiv Detail & Related papers (2023-08-31T21:30:25Z)
- MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z)
- Implementing Reinforcement Learning Datacenter Congestion Control in NVIDIA NICs [64.26714148634228]
Congestion control (CC) algorithms become extremely difficult to design.
It is currently not possible to deploy AI models on network devices due to their limited computational capabilities.
We build a computationally-light solution based on a recent reinforcement learning CC algorithm.
arXiv Detail & Related papers (2022-07-05T20:42:24Z)
- Hyperparameter Tuning for Deep Reinforcement Learning Applications [0.3553493344868413]
We propose a distributed variable-length genetic algorithm framework to tune hyperparameters for various RL applications.
Our results show that, with more generations, the framework finds optimal solutions that require fewer training episodes and are computationally cheap while being more robust for deployment.
arXiv Detail & Related papers (2022-01-26T20:43:13Z)
- RAPID-RL: A Reconfigurable Architecture with Preemptive-Exits for Efficient Deep-Reinforcement Learning [7.990007201671364]
We propose a reconfigurable architecture with preemptive exits for efficient deep RL (RAPID-RL).
RAPID-RL enables conditional activation of preemptive layers based on the difficulty level of inputs.
We show that RAPID-RL incurs 0.34x (0.25x) number of operations (OPS) while maintaining performance above 0.88x (0.91x) on Atari (Drone navigation) tasks.
arXiv Detail & Related papers (2021-09-16T21:30:40Z)
- Learning Dexterous Manipulation from Suboptimal Experts [69.8017067648129]
Relative Entropy Q-Learning (REQ) is a simple policy iteration algorithm that combines ideas from successful offline and conventional RL algorithms.
We show how REQ is also effective for general off-policy RL, offline RL, and RL from demonstrations.
arXiv Detail & Related papers (2020-10-16T18:48:49Z)
- Resource Allocation via Model-Free Deep Learning in Free Space Optical Communications [119.81868223344173]
The paper investigates the general problem of resource allocation for mitigating channel fading effects in Free Space Optical (FSO) communications.
Under this framework, we propose two algorithms that solve FSO resource allocation problems.
arXiv Detail & Related papers (2020-07-27T17:38:51Z)
- SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration.
arXiv Detail & Related papers (2020-07-09T17:08:44Z)