Structure-Enhanced Deep Reinforcement Learning for Optimal Transmission
Scheduling
- URL: http://arxiv.org/abs/2211.10827v1
- Date: Sun, 20 Nov 2022 00:13:35 GMT
- Title: Structure-Enhanced Deep Reinforcement Learning for Optimal Transmission
Scheduling
- Authors: Jiazheng Chen, Wanchun Liu, Daniel E. Quevedo, Yonghui Li and Branka
Vucetic
- Abstract summary: We develop a structure-enhanced deep reinforcement learning framework for optimal scheduling of a multi-sensor remote estimation system.
In particular, we propose a structure-enhanced action selection method, which tends to select actions that obey the policy structure.
Our numerical results show that the proposed structure-enhanced DRL algorithms reduce training time by 50% and the remote estimation MSE by 10% to 25%.
- Score: 47.29474858956844
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Remote state estimation of large-scale distributed dynamic processes plays an
important role in Industry 4.0 applications. In this paper, by leveraging the
theoretical results of structural properties of optimal scheduling policies, we
develop a structure-enhanced deep reinforcement learning (DRL) framework for
optimal scheduling of a multi-sensor remote estimation system to achieve the
minimum overall estimation mean-square error (MSE). In particular, we propose a
structure-enhanced action selection method, which tends to select actions that
obey the policy structure. This explores the action space more effectively and
enhances the learning efficiency of DRL agents. Furthermore, we introduce a
structure-enhanced loss function to add penalty to actions that do not follow
the policy structure. The new loss function guides the DRL to converge to the
optimal policy structure quickly. Our numerical results show that the proposed
structure-enhanced DRL algorithms reduce training time by 50% and the remote
estimation MSE by 10% to 25%, compared to benchmark DRL algorithms.
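The two mechanisms described above can be illustrated with a minimal sketch. This is not the authors' implementation: the `obeys_structure` predicate stands in for whatever theoretical policy structure (e.g., a threshold/switching-type rule) is known for the scheduling problem, and the penalty weight is a made-up hyperparameter.

```python
import numpy as np

def structure_enhanced_select(q_values, obeys_structure, state,
                              epsilon=0.1, rng=None):
    """Epsilon-greedy selection biased toward structure-obeying actions.

    q_values: (n_actions,) array of Q estimates for the current state.
    obeys_structure(state, action) -> bool: theoretical structure check
    (hypothetical; supplied by the known structural results).
    """
    rng = rng or np.random.default_rng()
    n = len(q_values)
    # Restrict both exploration and greedy choice to the structured set;
    # fall back to all actions if the set is empty.
    allowed = [a for a in range(n) if obeys_structure(state, a)] or list(range(n))
    if rng.random() < epsilon:
        return int(rng.choice(allowed))
    return int(max(allowed, key=lambda a: q_values[a]))

def structure_penalty(chosen_actions, states, obeys_structure, weight=1.0):
    """Extra loss term: average indicator of structure violations, scaled
    by a penalty weight. Added to the usual TD loss during training."""
    violations = [0.0 if obeys_structure(s, a) else 1.0
                  for s, a in zip(states, chosen_actions)]
    return weight * float(np.mean(violations))
```

In a DQN-style loop, `structure_enhanced_select` would replace plain epsilon-greedy selection, and `structure_penalty` would be added to the TD loss, steering the learned policy toward the known structure.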
Related papers
- Adversarial Style Transfer for Robust Policy Optimization in Deep
Reinforcement Learning [13.652106087606471]
This paper proposes an algorithm that aims to improve generalization for reinforcement learning agents by removing overfitting to confounding features.
A policy network updates its parameters to minimize the effect of such perturbations, thus staying robust while maximizing the expected future reward.
We evaluate our approach on Procgen and Distracting Control Suite for generalization and sample efficiency.
arXiv Detail & Related papers (2023-08-29T18:17:35Z)
- Provable Reward-Agnostic Preference-Based Reinforcement Learning [61.39541986848391]
Preference-based Reinforcement Learning (PbRL) is a paradigm in which an RL agent learns to optimize a task using pair-wise preference-based feedback over trajectories.
We propose a theoretical reward-agnostic PbRL framework where exploratory trajectories that enable accurate learning of hidden reward functions are acquired.
arXiv Detail & Related papers (2023-05-29T15:00:09Z)
- Diverse Policy Optimization for Structured Action Space [59.361076277997704]
We propose Diverse Policy Optimization (DPO) to model the policies in structured action space as energy-based models (EBMs).
A novel and powerful generative model, GFlowNet, is introduced as an efficient, diverse EBM-based policy sampler.
Experiments on ATSC and Battle benchmarks demonstrate that DPO can efficiently discover surprisingly diverse policies.
arXiv Detail & Related papers (2023-02-23T10:48:09Z)
- Structure-Enhanced DRL for Optimal Transmission Scheduling [43.801422320012286]
This paper focuses on the transmission scheduling problem of a remote estimation system.
We develop a structure-enhanced deep reinforcement learning framework for optimal scheduling of the system.
In particular, we propose a structure-enhanced action selection method, which tends to select actions that obey the policy structure.
arXiv Detail & Related papers (2022-12-24T10:18:38Z)
- Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality [141.89413461337324]
Deployment efficiency is an important criterion for many real-world applications of reinforcement learning (RL).
We propose a theoretical formulation for deployment-efficient RL (DE-RL) from an "optimization with constraints" perspective.
arXiv Detail & Related papers (2022-02-14T01:31:46Z)
- Progressive extension of reinforcement learning action dimension for asymmetric assembly tasks [7.4642148614421995]
In this paper, a progressive extension of action dimension (PEAD) mechanism is proposed to optimize the convergence of RL algorithms.
The results demonstrate that the PEAD method enhances the data-efficiency and time-efficiency of RL algorithms and increases the stable reward.
arXiv Detail & Related papers (2021-04-06T11:48:54Z)
- RL-Controller: a reinforcement learning framework for active structural control [0.0]
We present a novel RL-based approach for designing active controllers by introducing RL-Controller, a flexible and scalable simulation environment.
We show that the proposed framework is easily trainable for a five-story benchmark building, achieving 65% reductions on average in inter-story drifts.
In a comparative study with the LQG active control method, we demonstrate that the proposed model-free algorithm learns more effective actuator forcing strategies.
arXiv Detail & Related papers (2021-03-13T04:42:13Z)
- Implementation Matters in Deep Policy Gradients: A Case Study on PPO and TRPO [90.90009491366273]
We study the roots of algorithmic progress in deep policy gradient algorithms through a case study on two popular algorithms.
Specifically, we investigate the consequences of "code-level optimizations."
Our results show that they (a) are responsible for most of PPO's gain in cumulative reward over TRPO, and (b) fundamentally change how RL methods function.
arXiv Detail & Related papers (2020-05-25T16:24:59Z)
- Population-Guided Parallel Policy Search for Reinforcement Learning [17.360163137926]
A new population-guided parallel learning scheme is proposed to enhance the performance of off-policy reinforcement learning (RL).
In the proposed scheme, multiple identical learners with their own value-functions and policies share a common experience replay buffer, and search a good policy in collaboration with the guidance of the best policy information.
arXiv Detail & Related papers (2020-01-09T10:13:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.