Reinforcement Learning Based Cooperative Coded Caching under Dynamic
Popularities in Ultra-Dense Networks
- URL: http://arxiv.org/abs/2003.03758v1
- Date: Sun, 8 Mar 2020 10:45:45 GMT
- Authors: Shen Gao, Peihao Dong, Zhiwen Pan, Geoffrey Ye Li
- Abstract summary: The caching strategy at small base stations (SBSs) is critical to meet massive high data rate requests.
We exploit reinforcement learning (RL) to design a cooperative caching strategy with maximum-distance separable (MDS) coding.
- Score: 38.44125997148742
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For ultra-dense networks with wireless backhaul, caching strategy at small
base stations (SBSs), usually with limited storage, is critical to meet massive
high data rate requests. Since the content popularity profile varies with time
in an unknown way, we exploit reinforcement learning (RL) to design a
cooperative caching strategy with maximum-distance separable (MDS) coding. We
model the MDS coding based cooperative caching as a Markov decision process to
capture the popularity dynamics and maximize the long-term expected cumulative
traffic load served directly by the SBSs without accessing the macro base
station. For the formulated problem, we first find the optimal solution for a
small-scale system by embedding the cooperative MDS coding into Q-learning. To
cope with the large-scale case, we approximate the state-action value function
heuristically. The approximated function includes only a small number of
learnable parameters and enables us to propose a fast and efficient
action-selection approach, which dramatically reduces the complexity. Numerical
results verify the optimality/near-optimality of the proposed RL based
algorithms and show the superiority compared with the baseline schemes. They
also exhibit good robustness to different environments.
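As a rough illustration of the small-scale formulation above, embedding the caching decision into tabular Q-learning can be sketched as follows. The state encoding, the discretized action set (which fraction of a file's MDS coded packets to cache), and the reward are simplified assumptions for illustration, not the paper's exact cooperative MDS model.

```python
import random
from collections import defaultdict

# Hypothetical toy setup: 3 files, actions = (file, fraction of its MDS coded
# packets to cache), discretized. The reward stands in for the traffic load
# served directly by the SBSs, which the paper's MDP maximizes in expectation.
ACTIONS = [(f, frac) for f in range(3) for frac in (0.0, 0.5, 1.0)]
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1  # learning rate, discount, exploration

Q = defaultdict(float)  # state-action value table, default value 0.0

def choose_action(state):
    """Epsilon-greedy selection over the discrete caching actions."""
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def q_update(state, action, reward, next_state):
    """Standard Q-learning backup toward long-term served traffic."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```

The table grows with the number of states, which is why the paper replaces it with a compact parameterized approximation of the state-action value function in the large-scale case.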
Related papers
- A Bayesian Framework of Deep Reinforcement Learning for Joint O-RAN/MEC
Orchestration [12.914011030970814]
Multi-access Edge Computing (MEC) can be implemented together with Open Radio Access Network (O-RAN) over commodity platforms to offer low-cost deployment.
In this paper, a joint O-RAN/MEC orchestration using a Bayesian deep reinforcement learning (RL)-based framework is proposed.
arXiv Detail & Related papers (2023-12-26T18:04:49Z) - Learning RL-Policies for Joint Beamforming Without Exploration: A Batch
Constrained Off-Policy Approach [1.0080317855851213]
We consider the problem of network parameter optimization for beamforming.
We show that an algorithm can be deployed in the real world and learn from previously collected data, without online exploration.
arXiv Detail & Related papers (2023-10-12T18:36:36Z) - A Meta-Learning Based Precoder Optimization Framework for Rate-Splitting
Multiple Access [53.191806757701215]
We propose the use of a meta-learning based precoder optimization framework to directly optimize the Rate-Splitting Multiple Access (RSMA) precoders with partial Channel State Information at the Transmitter (CSIT).
By exploiting the overfitting of the compact neural network to maximize the explicit Average Sum-Rate (ASR) expression, we effectively bypass the need for any other training data while minimizing the total running time.
Numerical results reveal that the meta-learning based solution achieves similar ASR performance to conventional precoder optimization in medium-scale scenarios, and significantly outperforms sub-optimal low complexity precoder algorithms in large-scale scenarios.
arXiv Detail & Related papers (2023-07-17T20:31:41Z) - Combining Multi-Objective Bayesian Optimization with Reinforcement Learning for TinyML [4.2019872499238256]
We propose a novel strategy for deploying Deep Neural Networks on microcontrollers (TinyML) based on Multi-Objective Bayesian optimization (MOBOpt)
Our methodology aims at efficiently finding tradeoffs between a DNN's predictive accuracy, memory consumption on a given target system, and computational complexity.
arXiv Detail & Related papers (2023-05-23T14:31:52Z) - MARLIN: Soft Actor-Critic based Reinforcement Learning for Congestion
Control in Real Networks [63.24965775030673]
We propose a novel Reinforcement Learning (RL) approach to design generic Congestion Control (CC) algorithms.
Our solution, MARLIN, uses the Soft Actor-Critic algorithm to maximize both entropy and return.
We trained MARLIN on a real network with varying background traffic patterns to overcome the sim-to-real mismatch.
arXiv Detail & Related papers (2023-02-02T18:27:20Z) - SDQ: Stochastic Differentiable Quantization with Mixed Precision [46.232003346732064]
We present a novel Stochastic Differentiable Quantization (SDQ) method that can automatically learn the mixed-precision quantization (MPQ) strategy.
After the optimal MPQ strategy is acquired, we train our network with entropy-aware bin regularization and knowledge distillation.
SDQ outperforms all state-of-the-art mixed or single precision quantization methods with a lower bitwidth.
arXiv Detail & Related papers (2022-06-09T12:38:18Z) - Collaborative Intelligent Reflecting Surface Networks with Multi-Agent
Reinforcement Learning [63.83425382922157]
Intelligent reflecting surface (IRS) is envisioned to be widely applied in future wireless networks.
In this paper, we investigate a multi-user communication system assisted by cooperative IRS devices with the capability of energy harvesting.
arXiv Detail & Related papers (2022-03-26T20:37:14Z) - Learning from Images: Proactive Caching with Parallel Convolutional
Neural Networks [94.85780721466816]
A novel framework for proactive caching is proposed in this paper.
It combines model-based optimization with data-driven techniques by transforming an optimization problem into a grayscale image.
Numerical results show that the proposed scheme can reduce 71.6% computation time with only 0.8% additional performance cost.
arXiv Detail & Related papers (2021-08-15T21:32:47Z) - Caching Placement and Resource Allocation for Cache-Enabling UAV NOMA
Networks [87.6031308969681]
This article investigates the cache-enabling unmanned aerial vehicle (UAV) cellular networks with massive access capability supported by non-orthogonal multiple access (NOMA).
We formulate the long-term caching placement and resource allocation optimization problem for content delivery delay minimization as a Markov decision process (MDP).
We propose a Q-learning based caching placement and resource allocation algorithm, where the UAV learns and selects actions with a soft $\varepsilon$-greedy strategy to search for the optimal match between actions and states.
arXiv Detail & Related papers (2020-08-12T08:33:51Z)
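The soft $\varepsilon$-greedy selection in the last entry can be read as greedy exploitation mixed with non-uniform exploration. A minimal sketch is below; interpreting "soft" as softmax (Boltzmann) exploration, and the temperature parameter, are assumptions, not the cited paper's stated definition.

```python
import math
import random

def soft_eps_greedy(q_values, eps=0.1, temperature=1.0):
    """Illustrative soft epsilon-greedy: exploit the best-valued action with
    probability 1 - eps; otherwise explore by sampling actions with softmax
    (Boltzmann) probabilities over Q-values rather than uniformly."""
    actions = list(q_values)
    if random.random() >= eps:
        return max(actions, key=lambda a: q_values[a])
    weights = [math.exp(q_values[a] / temperature) for a in actions]
    return random.choices(actions, weights=weights, k=1)[0]
```

With eps=0 this reduces to pure greedy selection; raising the temperature flattens the exploration distribution toward uniform.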
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.