One-Step Two-Critic Deep Reinforcement Learning for Inverter-based
Volt-Var Control in Active Distribution Networks
- URL: http://arxiv.org/abs/2203.16289v1
- Date: Wed, 30 Mar 2022 13:29:28 GMT
- Authors: Qiong Liu, Ye Guo, Lirong Deng, Haotian Liu, Dongyu Li, Hongbin Sun,
Wenqi Huang
- Abstract summary: A one-step two-critic deep reinforcement learning (OSTC-DRL) approach for inverter-based volt-var control (IB-VVC) in active distribution networks is proposed in this paper.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper proposes a one-step two-critic deep reinforcement learning (OSTC-DRL) approach for inverter-based volt-var control (IB-VVC) in active distribution networks. First, since IB-VVC can be formulated as a single-period optimization problem, we formulate IB-VVC as a one-step Markov decision process rather than a standard Markov decision process, which simplifies the DRL learning task. We then design a one-step actor-critic DRL scheme, a simplified version of recent DRL algorithms, which avoids the issue of Q-value overestimation. Furthermore, considering the two objectives of VVC, minimizing power loss and eliminating voltage violations, we use two critics to approximate the rewards of the two objectives separately. This simplifies each critic's approximation task and avoids interaction effects between the two objectives during critic learning. The OSTC-DRL approach integrates the one-step actor-critic scheme with the two-critic technique. Based on OSTC-DRL, we design two centralized DRL algorithms. We further extend OSTC-DRL to multi-agent OSTC-DRL for decentralized IB-VVC and design two multi-agent DRL algorithms. Simulations demonstrate that the proposed OSTC-DRL achieves a faster convergence rate and better control performance, and that the multi-agent OSTC-DRL works well for decentralized IB-VVC problems.
Related papers
- Multiobjective Vehicle Routing Optimization with Time Windows: A Hybrid Approach Using Deep Reinforcement Learning and NSGA-II
This paper proposes a weight-aware deep reinforcement learning (WADRL) approach designed to address the multiobjective vehicle routing problem with time windows (MOVRPTW).
The non-dominated sorting genetic algorithm-II (NSGA-II) method is then employed to optimize the outcomes produced by the WADRL.
arXiv Detail & Related papers (2024-07-18T02:46:06Z)
- Multistep Criticality Search and Power Shaping in Microreactors with Reinforcement Learning
We introduce the use of reinforcement learning (RL) algorithms for intelligent control in nuclear microreactors.
The RL agent is trained using proximal policy optimization (PPO) and advantage actor-critic (A2C).
Results demonstrate the excellent performance of PPO in identifying optimal drum positions.
arXiv Detail & Related papers (2024-06-22T20:14:56Z)
- Safe and Accelerated Deep Reinforcement Learning-based O-RAN Slicing: A Hybrid Transfer Learning Approach
We propose and design a hybrid TL-aided approach to provide safe and accelerated convergence in DRL-based O-RAN slicing.
The proposed hybrid approach shows improvements of at least 7.7% in the average initial reward value and 20.7% in the percentage of converged scenarios.
arXiv Detail & Related papers (2023-09-13T18:58:34Z)
- DL-DRL: A double-level deep reinforcement learning approach for large-scale task scheduling of multi-UAV
We propose a double-level deep reinforcement learning (DL-DRL) approach based on a divide-and-conquer framework (DCF).
In particular, we design an encoder-decoder structured policy network in our upper-level DRL model to allocate tasks to different UAVs.
We also exploit another attention-based policy network in our lower-level DRL model to construct the route for each UAV, with the objective of maximizing the number of executed tasks.
arXiv Detail & Related papers (2022-08-04T04:35:53Z)
- Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning
Contrastive self-supervised learning has been successfully integrated into the practice of (deep) reinforcement learning (RL).
We study how RL can be empowered by contrastive learning in a class of Markov decision processes (MDPs) and Markov games (MGs) with low-rank transitions.
Under the online setting, we propose novel upper confidence bound (UCB)-type algorithms that incorporate such a contrastive loss with online RL algorithms for MDPs or MGs.
arXiv Detail & Related papers (2022-07-29T17:29:08Z)
- Optimization for Master-UAV-powered Auxiliary-Aerial-IRS-assisted IoT Networks: An Option-based Multi-agent Hierarchical Deep Reinforcement Learning Approach
This paper investigates a master unmanned aerial vehicle (MUAV)-powered Internet of Things (IoT) network.
We propose using a rechargeable auxiliary UAV (AUAV) equipped with an intelligent reflecting surface (IRS) to enhance the communication signals from the MUAV.
Under the proposed model, we investigate the optimal collaboration strategy of these energy-limited UAVs to maximize the accumulated throughput of the IoT network.
arXiv Detail & Related papers (2021-12-20T15:45:28Z)
- URLB: Unsupervised Reinforcement Learning Benchmark
We introduce the Unsupervised Reinforcement Learning Benchmark (URLB).
URLB consists of two phases: reward-free pre-training and downstream task adaptation with extrinsic rewards.
We provide twelve continuous control tasks from three domains for evaluation and open-source code for eight leading unsupervised RL methods.
arXiv Detail & Related papers (2021-10-28T15:07:01Z)
- DRL-based Slice Placement Under Non-Stationary Conditions
We consider online learning for optimal network slice placement under the assumption that slice requests arrive according to a non-stationary process.
We specifically propose two pure-DRL algorithms and two families of hybrid DRL-heuristic algorithms.
We show that the proposed hybrid DRL-heuristic algorithms require three orders of magnitude fewer learning episodes than pure DRL to achieve convergence.
arXiv Detail & Related papers (2021-08-05T10:05:12Z)
- Adaptive Stochastic ADMM for Decentralized Reinforcement Learning in Edge Industrial IoT
Reinforcement learning (RL) has been widely investigated and shown to be a promising solution for decision-making and optimal control processes.
We propose an adaptive ADMM (asI-ADMM) algorithm and apply it to decentralized RL with edge-computing-empowered IIoT networks.
Experiment results show that our proposed algorithms outperform the state of the art in terms of communication costs and scalability, and can adapt well to complex IoT environments.
arXiv Detail & Related papers (2021-06-30T16:49:07Z)
- Bi-level Off-policy Reinforcement Learning for Volt/VAR Control Involving Continuous and Discrete Devices
In Volt/VAR control, both slow-timescale discrete devices (STDDs) and fast-timescale continuous devices (FTCDs) are involved.
Traditional optimization methods rely heavily on accurate system models but are sometimes impractical because of the prohibitive modelling effort.
In this paper, a novel bi-level off-policy reinforcement learning (RL) algorithm is proposed to solve this problem in a model-free manner.
arXiv Detail & Related papers (2021-04-13T02:22:43Z)