Optimal Scheduling in IoT-Driven Smart Isolated Microgrids Based on Deep
Reinforcement Learning
- URL: http://arxiv.org/abs/2305.00127v1
- Date: Fri, 28 Apr 2023 23:52:50 GMT
- Title: Optimal Scheduling in IoT-Driven Smart Isolated Microgrids Based on Deep
Reinforcement Learning
- Authors: Jiaju Qi, Lei Lei, Kan Zheng, Simon X. Yang, Xuemin (Sherman) Shen
- Abstract summary: We investigate the scheduling issue of diesel generators (DGs) in an Internet of Things (IoT)-driven isolated microgrid (MG) by deep reinforcement learning (DRL).
The DRL agent learns an optimal policy from historical renewable and load data of previous days.
The goal is to reduce operating cost while ensuring supply-demand balance.
- Score: 10.924928763380624
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we investigate the scheduling issue of diesel generators (DGs)
in an Internet of Things (IoT)-Driven isolated microgrid (MG) by deep
reinforcement learning (DRL). The renewable energy is fully exploited under the
uncertainty of renewable generation and load demand. The DRL agent learns an
optimal policy from historical renewable and load data of previous days, where the
policy can generate real-time decisions based on observations of past renewable
and load data of previous hours collected by connected sensors. The goal is to
reduce operating cost while ensuring supply-demand balance.
Specifically, a novel finite-horizon partially observable Markov decision process
(POMDP) model is formulated that accounts for the spinning reserve. In order to
overcome the challenge of discrete-continuous hybrid action space due to the
binary DG switching decision and continuous energy dispatch (ED) decision, a
DRL algorithm, namely the hybrid action finite-horizon RDPG (HAFH-RDPG), is
proposed. HAFH-RDPG seamlessly integrates two classical DRL algorithms, i.e.,
deep Q-network (DQN) and recurrent deterministic policy gradient (RDPG), based
on a finite-horizon dynamic programming (DP) framework. Extensive experiments
are performed with real-world data in an IoT-driven MG to evaluate the
capability of the proposed algorithm in handling the uncertainty due to
inter-hour and inter-day power fluctuation and to compare its performance with
those of the benchmark algorithms.
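The hybrid action space mentioned in the abstract (a binary DG switching decision plus a continuous energy dispatch decision) can be illustrated with a small sketch. This is not the paper's HAFH-RDPG: it has no recurrent network, no training, and no spinning reserve. It only shows one common way to pick a discrete-continuous action pair, by letting a continuous "actor" propose a dispatch and a "critic" score each switch choice. All names (`select_hybrid_action`, `toy_actor`, `toy_critic`), the observation layout, and the cost coefficients are assumptions made for illustration.

```python
from typing import Callable, Tuple

Obs = Tuple[float, float]  # (load_kw, renewable_kw): a hypothetical observation layout


def select_hybrid_action(
    obs: Obs,
    actor: Callable[[Obs], float],
    critic: Callable[[Obs, int, float], float],
) -> Tuple[int, float]:
    """Score both switch choices with the critic and return the better (switch, dispatch) pair."""
    candidates = []
    for switch in (0, 1):
        # A switched-off DG cannot dispatch; otherwise ask the actor for a continuous output.
        dispatch = actor(obs) if switch == 1 else 0.0
        candidates.append((critic(obs, switch, dispatch), switch, dispatch))
    _, switch, dispatch = max(candidates, key=lambda c: c[0])
    return switch, dispatch


def toy_actor(obs: Obs) -> float:
    """Hand-written stand-in for a learned actor: cover the net load with the DG."""
    load_kw, renewable_kw = obs
    return max(load_kw - renewable_kw, 0.0)


def toy_critic(obs: Obs, switch: int, dispatch: float) -> float:
    """Hand-written stand-in for a learned critic: negative one-step operating cost."""
    load_kw, renewable_kw = obs
    fuel_cost = 0.2 * dispatch            # assumed fuel cost per kW
    running_cost = 5.0 * switch           # assumed fixed cost while the DG is on
    unmet_kw = max(load_kw - renewable_kw - dispatch, 0.0)
    return -(fuel_cost + running_cost + 10.0 * unmet_kw)  # assumed shortage penalty


if __name__ == "__main__":
    print(select_hybrid_action((50.0, 20.0), toy_actor, toy_critic))
    print(select_hybrid_action((20.0, 50.0), toy_actor, toy_critic))
```

In this toy setup the DG is switched on only when renewable generation falls short of the load, because the assumed shortage penalty outweighs the assumed fuel and running costs. In a learned setting the two stand-in functions would be replaced by trained networks.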
Related papers
- Two-Stage ML-Guided Decision Rules for Sequential Decision Making under Uncertainty [55.06411438416805]
Sequential Decision Making under Uncertainty (SDMU) is ubiquitous in many domains such as energy, finance, and supply chains.
Some SDMU are naturally modeled as Multistage Problems (MSPs) but the resulting optimizations are notoriously challenging from a computational standpoint.
This paper introduces a novel approach Two-Stage General Decision Rules (TS-GDR) to generalize the policy space beyond linear functions.
The effectiveness of TS-GDR is demonstrated through an instantiation using Deep Recurrent Neural Networks named Two-Stage Deep Decision Rules (TS-DDR).
arXiv Detail & Related papers (2024-05-23T18:19:47Z)
- Lyapunov-Driven Deep Reinforcement Learning for Edge Inference Empowered by Reconfigurable Intelligent Surfaces [30.1512069754603]
We propose a novel algorithm for energy-efficient, low-latency, accurate inference at the wireless edge.
We consider a scenario where new data are continuously generated/collected by a set of devices and are handled through a dynamic queueing system.
arXiv Detail & Related papers (2023-05-18T12:46:42Z)
- Differentially Private Deep Q-Learning for Pattern Privacy Preservation in MEC Offloading [76.0572817182483]
Attackers may eavesdrop on the offloading decisions to infer the edge server's (ES's) queue information and users' usage patterns.
We propose an offloading strategy which jointly minimizes the latency, ES's energy consumption, and task dropping rate, while preserving pattern privacy (PP).
We develop a Differential Privacy Deep Q-learning based Offloading (DP-DQO) algorithm to solve this problem while addressing the PP issue by injecting noise into the generated offloading decisions.
arXiv Detail & Related papers (2023-02-09T12:50:18Z)
- Optimal Planning of Hybrid Energy Storage Systems using Curtailed Renewable Energy through Deep Reinforcement Learning [0.0]
We propose a deep reinforcement learning (DRL) methodology with a policy-based algorithm to plan energy storage systems (ESS).
A quantitative performance comparison showed that the DRL agent outperforms the scenario-based optimization (SO) algorithm.
The results indicate that the DRL agent learns to act much as a human expert would, suggesting that the proposed methodology can be applied reliably.
arXiv Detail & Related papers (2022-12-12T02:24:50Z)
- Distributed Energy Management and Demand Response in Smart Grids: A Multi-Agent Deep Reinforcement Learning Framework [53.97223237572147]
This paper presents a multi-agent Deep Reinforcement Learning (DRL) framework for autonomous control and integration of renewable energy resources into smart power grid systems.
In particular, the proposed framework jointly considers demand response (DR) and distributed energy management (DEM) for residential end-users.
arXiv Detail & Related papers (2022-11-29T01:18:58Z)
- Joint Energy Dispatch and Unit Commitment in Microgrids Based on Deep Reinforcement Learning [6.708717040312532]
In this paper, deep reinforcement learning (DRL) is applied to learn an optimal policy for making joint energy dispatch (ED) and unit commitment (UC) decisions in an isolated microgrid.
We propose a DRL algorithm, i.e., the hybrid action finite-horizon DDPG (HAFH-DDPG), that seamlessly integrates two classical DRL algorithms.
A diesel generator (DG) selection strategy is presented to support a simplified action space for reducing the computation complexity of this algorithm.
arXiv Detail & Related papers (2022-06-03T16:22:03Z)
- Deep Reinforcement Learning Based Multidimensional Resource Management for Energy Harvesting Cognitive NOMA Communications [64.1076645382049]
Combination of energy harvesting (EH), cognitive radio (CR), and non-orthogonal multiple access (NOMA) is a promising solution to improve energy efficiency.
In this paper, we study the spectrum, energy, and time resource management for deterministic-CR-NOMA IoT systems.
arXiv Detail & Related papers (2021-09-17T08:55:48Z)
- Optimizing the Long-Term Average Reward for Continuing MDPs: A Technical Report [117.23323653198297]
We strike a balance between the information freshness experienced by users and the energy consumed by sensors.
We cast the corresponding status update procedure as a continuing Markov Decision Process (MDP).
To circumvent the curse of dimensionality, we have established a methodology for designing deep reinforcement learning (DRL) algorithms.
arXiv Detail & Related papers (2021-04-13T12:29:55Z)
- Modular Deep Reinforcement Learning for Continuous Motion Planning with Temporal Logic [59.94347858883343]
This paper investigates the motion planning of autonomous dynamical systems modeled by Markov decision processes (MDP).
The novelty is to design an embedded product MDP (EP-MDP) between the LDGBA and the MDP.
The proposed LDGBA-based reward shaping and discounting schemes for the model-free reinforcement learning (RL) only depend on the EP-MDP states.
arXiv Detail & Related papers (2021-02-24T01:11:25Z)
- Dynamic Energy Dispatch Based on Deep Reinforcement Learning in IoT-Driven Smart Isolated Microgrids [8.623472323825556]
Microgrids (MGs) are small, local power grids that can operate independently from the larger utility grid.
This paper focuses on deep reinforcement learning (DRL)-based energy dispatch for IoT-driven smart isolated MGs.
Two novel DRL algorithms are proposed to derive energy dispatch policies with and without fully observable state information.
arXiv Detail & Related papers (2020-02-07T01:44:18Z)
- Stacked Auto Encoder Based Deep Reinforcement Learning for Online Resource Scheduling in Large-Scale MEC Networks [44.40722828581203]
An online resource scheduling framework is proposed for minimizing the sum of weighted task latency for all the Internet of things (IoT) users.
A deep reinforcement learning (DRL) based solution is proposed, which includes the following components.
A preserved and prioritized experience replay (2p-ER) is introduced to assist the DRL to train the policy network and find the optimal offloading policy.
arXiv Detail & Related papers (2020-01-24T23:01:15Z)