Controlled Deep Reinforcement Learning for Optimized Slice Placement
- URL: http://arxiv.org/abs/2108.01544v1
- Date: Tue, 3 Aug 2021 14:54:00 GMT
- Authors: Jose Jurandir Alves Esteves, Amina Boubendir, Fabrice Guillemin,
Pierre Sens
- Score: 0.8459686722437155
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a hybrid ML-heuristic approach that we name "Heuristically
Assisted Deep Reinforcement Learning (HA-DRL)" to solve the problem of Network
Slice Placement Optimization. The proposed approach leverages recent works on
Deep Reinforcement Learning (DRL) for slice placement and Virtual Network
Embedding (VNE) and uses a heuristic function to optimize the exploration of
the action space by giving priority to reliable actions indicated by an
efficient heuristic algorithm. The evaluation results show that the proposed
HA-DRL algorithm can accelerate the learning of an efficient slice placement
policy, improving the slice acceptance ratio compared with state-of-the-art
approaches based only on reinforcement learning.
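To illustrate the general idea (a minimal sketch, not the authors' implementation), a heuristic can guide a DRL agent's exploration by mixing the policy's action distribution with heuristic scores, so reliable actions are sampled more often early in training. The capacity-based heuristic, `beta` mixing weight, and `state` layout below are all hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def heuristic_scores(state):
    # Hypothetical heuristic: prefer placement nodes with more free capacity.
    free = state["free_capacity"]
    return free / free.sum()

def ha_action(policy_probs, state, beta=0.5):
    """Sample an action from a mixture of the DRL policy and the
    heuristic distribution, biasing exploration toward actions the
    heuristic considers reliable."""
    mixed = (1 - beta) * policy_probs + beta * heuristic_scores(state)
    mixed /= mixed.sum()  # renormalize against rounding error
    return rng.choice(len(mixed), p=mixed)

state = {"free_capacity": np.array([10.0, 40.0, 50.0])}
policy = np.array([0.6, 0.3, 0.1])  # e.g. an untrained policy
action = ha_action(policy, state)
```

As training progresses, `beta` would typically be annealed toward zero so the learned policy takes over from the heuristic.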
Related papers
- Provably Efficient Iterated CVaR Reinforcement Learning with Function
Approximation and Human Feedback [57.6775169085215]
Risk-sensitive reinforcement learning aims to optimize policies that balance the expected reward and risk.
We present a novel framework that employs an Iterated Conditional Value-at-Risk (CVaR) objective under both linear and general function approximations.
We propose provably sample-efficient algorithms for this Iterated CVaR RL and provide rigorous theoretical analysis.
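For concreteness, the CVaR objective at level alpha is the expected return over the worst alpha-fraction of outcomes. A minimal empirical estimator (my illustration, not the paper's algorithm) looks like:

```python
import numpy as np

def cvar(returns, alpha=0.1):
    """Conditional Value-at-Risk: the mean of the worst
    alpha-fraction of the observed returns."""
    r = np.sort(np.asarray(returns, dtype=float))
    k = max(1, int(np.ceil(alpha * len(r))))  # number of tail samples
    return r[:k].mean()

returns = [10, -5, 3, 8, -20, 7, 1, 4, 6, 2]
print(cvar(returns, alpha=0.2))  # mean of the two worst returns: -12.5
```

A risk-sensitive agent optimizing this objective trades some expected reward for protection against the low tail.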
arXiv Detail & Related papers (2023-07-06T08:14:54Z) - Provable Reward-Agnostic Preference-Based Reinforcement Learning [61.39541986848391]
Preference-based Reinforcement Learning (PbRL) is a paradigm in which an RL agent learns to optimize a task using pair-wise preference-based feedback over trajectories.
We propose a theoretical reward-agnostic PbRL framework where exploratory trajectories that enable accurate learning of hidden reward functions are acquired.
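PbRL methods commonly fit a reward model to pairwise preferences via a Bradley-Terry likelihood. The sketch below (a generic illustration, assuming hypothetical scalar trajectory returns `r_hat_a` and `r_hat_b` from a reward model) shows the per-pair loss:

```python
import numpy as np

def bradley_terry_loss(r_hat_a, r_hat_b, pref_a):
    """Negative log-likelihood of the Bradley-Terry preference model:
    P(A preferred over B) = sigmoid(r_hat_a - r_hat_b).
    pref_a is 1 if trajectory A was preferred, else 0."""
    p_a = 1.0 / (1.0 + np.exp(-(r_hat_a - r_hat_b)))
    return -(pref_a * np.log(p_a) + (1 - pref_a) * np.log(1 - p_a))

# With A preferred (pref_a=1), the loss drops as the model scores A above B.
low = bradley_terry_loss(2.0, 0.0, 1)
high = bradley_terry_loss(0.0, 2.0, 1)
```

Minimizing this loss over collected preference pairs recovers a reward function up to scale, which the RL agent then optimizes.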
arXiv Detail & Related papers (2023-05-29T15:00:09Z) - Deep Black-Box Reinforcement Learning with Movement Primitives [15.184283143878488]
We present a new algorithm for deep reinforcement learning (RL)
It is based on differentiable trust region layers, a successful on-policy deep RL algorithm.
We compare our ERL algorithm to state-of-the-art step-based algorithms in many complex simulated robotic control tasks.
arXiv Detail & Related papers (2022-10-18T06:34:52Z) - Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, based on minimizing the population loss, that are more suitable for active learning than the metric used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z) - On the Robustness of Controlled Deep Reinforcement Learning for Slice
Placement [0.8459686722437155]
We compare two Deep Reinforcement Learning algorithms: a pure DRL-based algorithm and a hybrid DRL-heuristic algorithm.
The evaluation results show that the proposed hybrid DRL-heuristic approach is more robust and reliable in case of unpredictable network load changes than pure DRL.
arXiv Detail & Related papers (2021-08-05T10:24:33Z) - On Multi-objective Policy Optimization as a Tool for Reinforcement
Learning: Case Studies in Offline RL and Finetuning [24.264618706734012]
We show how to develop novel and more effective deep reinforcement learning algorithms.
We focus on offline RL and finetuning as case studies.
We introduce Distillation of a Mixture of Experts (DiME).
We demonstrate that for offline RL, DiME leads to a simple new algorithm that outperforms state-of-the-art.
arXiv Detail & Related papers (2021-06-15T14:59:14Z) - A Heuristically Assisted Deep Reinforcement Learning Approach for
Network Slice Placement [0.7885276250519428]
We introduce a hybrid placement solution based on Deep Reinforcement Learning (DRL) and a dedicated optimization based on the Power of Two Choices principle.
The proposed Heuristically-Assisted DRL (HA-DRL) accelerates the learning process and improves resource usage when compared against other state-of-the-art approaches.
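The Power of Two Choices principle itself is simple to state: sample two candidate nodes uniformly at random and place on the less loaded one, which sharply reduces load imbalance compared with a single random choice. A generic sketch (not the paper's placement algorithm):

```python
import random

def p2c_place(loads, rng=random):
    """Power of Two Choices: draw two distinct candidate nodes
    uniformly at random and return the less loaded one."""
    i, j = rng.sample(range(len(loads)), 2)
    return i if loads[i] <= loads[j] else j

random.seed(0)
loads = [5, 1, 9, 3]       # hypothetical per-node load
chosen = p2c_place(loads)
loads[chosen] += 1          # commit the placement
```

In the hybrid scheme, a heuristic of this kind supplies cheap, reliable placement candidates that assist the DRL agent's exploration.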
arXiv Detail & Related papers (2021-05-14T10:04:17Z) - Learning Sampling Policy for Faster Derivative Free Optimization [100.27518340593284]
We propose a new reinforcement-learning-based ZO algorithm (ZO-RL) that learns the sampling policy for generating the perturbations in ZO optimization instead of using random sampling.
Our results show that ZO-RL can effectively reduce the variance of the ZO gradient by learning a sampling policy, and converges faster than existing ZO algorithms in different scenarios.
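The core of ZO optimization is estimating gradients from function evaluations alone. A standard two-point Gaussian estimator, shown here with random directions (precisely the sampling step that ZO-RL would replace with a learned policy), can be sketched as:

```python
import numpy as np

rng = np.random.default_rng(1)

def zo_gradient(f, x, mu=1e-3, n_samples=2000):
    """Two-point zeroth-order gradient estimator:
    averaging (f(x + mu*u) - f(x)) / mu * u over random Gaussian
    directions u approximates the gradient of f at x."""
    g = np.zeros_like(x)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - f(x)) / mu * u
    return g / n_samples

f = lambda x: np.sum(x ** 2)   # true gradient is 2x
x = np.array([1.0, -2.0])
g = zo_gradient(f, x)
```

The estimator's variance depends heavily on how the directions `u` are drawn, which is why learning the sampling distribution can speed convergence.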
arXiv Detail & Related papers (2021-04-09T14:50:59Z) - Progressive extension of reinforcement learning action dimension for
asymmetric assembly tasks [7.4642148614421995]
In this paper, a progressive extension of action dimension (PEAD) mechanism is proposed to optimize the convergence of RL algorithms.
The results demonstrate that the PEAD method enhances the data-efficiency and time-efficiency of RL algorithms as well as increasing the stable reward.
arXiv Detail & Related papers (2021-04-06T11:48:54Z) - Optimization-driven Deep Reinforcement Learning for Robust Beamforming
in IRS-assisted Wireless Communications [54.610318402371185]
Intelligent reflecting surface (IRS) is a promising technology to assist downlink information transmissions from a multi-antenna access point (AP) to a receiver.
We minimize the AP's transmit power by a joint optimization of the AP's active beamforming and the IRS's passive beamforming.
We propose a deep reinforcement learning (DRL) approach that can adapt the beamforming strategies from past experiences.
arXiv Detail & Related papers (2020-05-25T01:42:55Z) - Discrete Action On-Policy Learning with Action-Value Critic [72.20609919995086]
Reinforcement learning (RL) in discrete action space is ubiquitous in real-world applications, but its complexity grows exponentially with the action-space dimension.
We construct a critic to estimate action-value functions, apply it to correlated actions, and combine these critic-estimated action values to control the variance of gradient estimation.
These efforts result in a new discrete action on-policy RL algorithm that empirically outperforms related on-policy algorithms relying on variance control techniques.
arXiv Detail & Related papers (2020-02-10T04:23:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.