Bi-level Off-policy Reinforcement Learning for Volt/VAR Control
Involving Continuous and Discrete Devices
- URL: http://arxiv.org/abs/2104.05902v1
- Date: Tue, 13 Apr 2021 02:22:43 GMT
- Title: Bi-level Off-policy Reinforcement Learning for Volt/VAR Control
Involving Continuous and Discrete Devices
- Authors: Haotian Liu, Wenchuan Wu
- Abstract summary: In Volt/VAR control, both slow timescale discrete devices (STDDs) and fast timescale continuous devices (FTCDs) are involved.
Traditional optimization methods rely heavily on accurate system models, which are sometimes impractical to obtain because of the prohibitive modelling effort.
In this paper, a novel bi-level off-policy reinforcement learning (RL) algorithm is proposed to solve this problem in a model-free manner.
- Score: 2.079959811127612
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In Volt/VAR control (VVC) of active distribution networks
(ADNs), both slow timescale discrete devices (STDDs) and fast timescale
continuous devices (FTCDs) are involved. STDDs such as on-load tap changers
(OLTCs) and FTCDs such as distributed generators should be coordinated in
time sequence. Such VVC is formulated as a two-timescale optimization
problem to jointly optimize FTCDs and STDDs in ADNs. Traditional
optimization methods rely heavily on accurate system models, which are
sometimes impractical to obtain because of the prohibitive modelling effort.
In this paper, a novel bi-level off-policy reinforcement learning (RL)
algorithm is proposed to solve this problem in a model-free manner. A
bi-level Markov decision process (BMDP) is defined to describe the
two-timescale VVC problem, and separate agents are set up for the slow and
fast timescale sub-problems. For the fast timescale sub-problem, we adopt
soft actor-critic (SAC), an off-policy RL method with high sample
efficiency. For the slow one, we develop an off-policy multi-discrete soft
actor-critic (MDSAC) algorithm to address the curse of dimensionality caused
by the various STDDs. To mitigate the non-stationarity in the two agents'
learning processes, we propose a multi-timescale off-policy correction
(MTOPC) method based on the importance sampling technique. Comprehensive
numerical studies demonstrate that the proposed method achieves stable and
satisfactory optimization of both STDDs and FTCDs without any model
information, and that it outperforms existing two-timescale VVC methods.
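
As a rough illustration of the bi-level structure described above, the
sketch below interleaves a slow multi-discrete agent (one categorical head
per STDD, in the spirit of MDSAC) with a fast continuous agent (a SAC-style
squashed-Gaussian actor) on a toy surrogate environment. All dimensions, the
surrogate dynamics, and every function name are illustrative assumptions,
not the paper's implementation; the final comment marks where an MTOPC-style
importance-sampling correction would enter.

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative sizes (assumed, not from the paper): 3 discrete devices
    # (e.g. OLTC taps) with 5 positions each on the slow timescale, and 4
    # continuous var setpoints in [-1, 1] on the fast timescale.
    N_DISCRETE, N_TAPS = 3, 5
    N_CONTINUOUS = 4
    SLOW_PERIOD = 24   # fast steps per slow step (e.g. hourly vs daily)
    STATE_DIM = 8

    def fast_policy(state):
        """Stand-in for a SAC actor: squashed Gaussian over FTCD setpoints."""
        mean = np.tanh(state[:N_CONTINUOUS])
        return np.clip(mean + 0.1 * rng.standard_normal(N_CONTINUOUS), -1.0, 1.0)

    def slow_policy(state):
        """Stand-in for an MDSAC actor: one categorical head per STDD, so the
        joint space of N_TAPS ** N_DISCRETE combinations is never enumerated."""
        logits = rng.standard_normal((N_DISCRETE, N_TAPS))  # placeholder net
        probs = np.exp(logits - logits.max(axis=1, keepdims=True))
        probs /= probs.sum(axis=1, keepdims=True)
        taps = np.array([rng.choice(N_TAPS, p=p) for p in probs])
        return taps, probs[np.arange(N_DISCRETE), taps]

    def env_step(state, taps, setpoints):
        """Toy surrogate for the ADN; reward penalizes voltage-like error."""
        drift = 0.05 * (taps.sum() / (N_DISCRETE * (N_TAPS - 1)) - 0.5)
        state = 0.9 * state + drift + 0.1 * rng.standard_normal(STATE_DIM)
        reward = -np.abs(state[:N_CONTINUOUS] - setpoints).sum()
        return state, reward

    state = rng.standard_normal(STATE_DIM)
    replay = []                          # shared buffer for both agents
    for t in range(3 * SLOW_PERIOD):
        if t % SLOW_PERIOD == 0:         # slow timescale: reposition STDDs
            taps, taps_prob = slow_policy(state)
        setpoints = fast_policy(state)   # fast timescale: FTCD dispatch
        next_state, reward = env_step(state, taps, setpoints)
        # MTOPC idea: keep the behaviour probabilities so that replayed
        # transitions can be reweighted by pi_new(a|s) / pi_behaviour(a|s).
        replay.append((state, taps, taps_prob, setpoints, reward))
        state = next_state

Factorizing the discrete policy this way is what keeps the slow agent
tractable: the buffer stores per-device probabilities, so an off-policy
correction weight becomes a product of per-device ratios rather than a
lookup over the full combinatorial action space.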
Related papers
- BiT-MamSleep: Bidirectional Temporal Mamba for EEG Sleep Staging [9.727831971306058]
BiT-MamSleep is a novel architecture that integrates the Triple-Resolution CNN (TRCNN) for efficient multi-scale feature extraction.
BiT-MamSleep incorporates an Adaptive Feature Recalibration (AFR) module and a temporal enhancement block to dynamically refine feature importance.
Experiments on four public datasets demonstrate that BiT-MamSleep significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2024-11-03T14:49:11Z)
- Two-Timescale Model Caching and Resource Allocation for Edge-Enabled AI-Generated Content Services [55.0337199834612]
Generative AI (GenAI) has emerged as a transformative technology, enabling customized and personalized AI-generated content (AIGC) services.
These services require executing GenAI models with billions of parameters, posing significant obstacles to the resource-limited wireless edge.
We formulate joint model caching and resource allocation for AIGC services to balance the trade-off between AIGC quality and latency.
arXiv Detail & Related papers (2024-11-03T07:01:13Z)
- Augmented Lagrangian-Based Safe Reinforcement Learning Approach for Distribution System Volt/VAR Control [1.1059341532498634]
This paper formulates the Volt/VAR control problem as a constrained Markov decision process (CMDP).
A novel safe off-policy reinforcement learning (RL) approach is proposed in this paper to solve the CMDP.
A two-stage strategy is adopted for offline training and online execution, so the accurate distribution system model is no longer needed.
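
For context only: a CMDP augments the usual return objective with a bound on
an expected cumulative cost, and augmented-Lagrangian treatments typically
alternate a policy update on a penalized objective with dual ascent on the
multiplier. The snippet below is a generic sketch of that dual update under
assumed names and coefficients, not the cited paper's algorithm.

    def dual_ascent_step(lmbda, cost_return, budget, lr=0.05, rho=0.1):
        """Generic augmented-Lagrangian dual update for a CMDP (illustrative).

        The policy would maximize return minus this penalty; the multiplier
        grows while the expected cost exceeds its budget and is projected
        back onto lmbda >= 0 otherwise.
        """
        violation = cost_return - budget
        penalty = lmbda * violation + 0.5 * rho * violation ** 2
        lmbda = max(0.0, lmbda + lr * violation)
        return lmbda, penalty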
arXiv Detail & Related papers (2024-10-19T19:45:09Z)
- Temporal Prototype-Aware Learning for Active Voltage Control on Power Distribution Networks [28.630650305620197]
Active Voltage Control (AVC) on the Power Distribution Networks (PDNs) aims to stabilize the voltage levels to ensure efficient and reliable operation of power systems.
We propose a novel temporal prototype-aware learning method, abbreviated as TPA, to learn time-adaptive dependencies under short-term training trajectories.
arXiv Detail & Related papers (2024-06-25T08:07:00Z)
- Adaptive Multi-Scale Decomposition Framework for Time Series Forecasting [26.141054975797868]
We propose a novel Adaptive Multi-Scale Decomposition (AMD) framework for time series forecasting (TSF).
Our framework decomposes time series into distinct temporal patterns at multiple scales, leveraging the Multi-Scale Decomposable Mixing (MDM) block.
Our approach effectively models both temporal and channel dependencies and utilizes autocorrelation to refine multi-scale data integration.
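
As a generic illustration only (not the paper's MDM block), multi-scale
decomposition of this kind can be pictured as extracting progressively
smoother views of a series and a residual; the helper name and scales below
are assumptions.

    import numpy as np

    def multi_scale_decompose(x, scales=(2, 4, 8)):
        """Illustrative multi-scale trend extraction via moving averages.

        Returns one smoothed copy of the series per scale, plus the
        residual left over after removing the coarsest trend.
        """
        trends = [np.convolve(x, np.ones(s) / s, mode="same") for s in scales]
        residual = x - trends[-1]
        return trends, residual

    # Example: a noisy sinusoid separates into smooth trends and noise.
    x = np.sin(np.linspace(0, 8 * np.pi, 256)) + 0.1 * np.random.randn(256)
    trends, residual = multi_scale_decompose(x)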
arXiv Detail & Related papers (2024-06-06T05:27:33Z)
- When to Sense and Control? A Time-adaptive Approach for Continuous-Time RL [37.58940726230092]
Reinforcement learning (RL) excels at optimizing policies for discrete-time Markov decision processes (MDPs).
We formalize an RL framework, Time-adaptive Control & Sensing (TaCoS), that tackles this challenge.
We demonstrate that state-of-the-art RL algorithms trained with TaCoS drastically reduce the number of interactions compared with their discrete-time counterparts.
arXiv Detail & Related papers (2024-06-03T09:57:18Z)
- Two-Stage ML-Guided Decision Rules for Sequential Decision Making under Uncertainty [55.06411438416805]
Sequential Decision Making under Uncertainty (SDMU) is ubiquitous in many domains such as energy, finance, and supply chains.
Some SDMU problems are naturally modeled as Multistage Problems (MSPs), but the resulting optimizations are notoriously challenging from a computational standpoint.
This paper introduces a novel approach, Two-Stage General Decision Rules (TS-GDR), to generalize the policy space beyond linear functions.
The effectiveness of TS-GDR is demonstrated through an instantiation using deep recurrent neural networks, named Two-Stage Deep Decision Rules (TS-LDR).
arXiv Detail & Related papers (2024-05-23T18:19:47Z)
- Gait Recognition in the Wild with Multi-hop Temporal Switch [81.35245014397759]
Gait recognition in the wild is a more practical problem that has attracted attention from the multimedia and computer vision communities.
This paper presents a novel multi-hop temporal switch method to achieve effective temporal modeling of gait patterns in real-world scenes.
arXiv Detail & Related papers (2022-09-01T10:46:09Z)
- An Accelerated Doubly Stochastic Gradient Method with Faster Explicit Model Identification [97.28167655721766]
We propose a novel accelerated doubly stochastic gradient descent (ADSGD) method for sparsity regularized loss minimization problems.
We first prove that ADSGD can achieve a linear convergence rate and lower overall computational complexity.
arXiv Detail & Related papers (2022-08-11T22:27:22Z) - Dynamic Network-Assisted D2D-Aided Coded Distributed Learning [59.29409589861241]
We propose a novel device-to-device (D2D)-aided coded federated learning method (D2D-CFL) for load balancing across devices.
We derive an optimal compression rate for achieving minimum processing time and establish its connection with the convergence time.
Our proposed method is beneficial for real-time collaborative applications, where the users continuously generate training data.
arXiv Detail & Related papers (2021-11-26T18:44:59Z)
- Combining Deep Learning and Optimization for Security-Constrained Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling automatic primary response (APR) within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z)