Delayed Q-update: A novel credit assignment technique for deriving an
optimal operation policy for the Grid-Connected Microgrid
- URL: http://arxiv.org/abs/2006.16659v3
- Date: Tue, 20 Oct 2020 10:18:13 GMT
- Title: Delayed Q-update: A novel credit assignment technique for deriving an
optimal operation policy for the Grid-Connected Microgrid
- Authors: Hyungjun Park, Daiki Min, Jong-hyun Ryu, Dong Gu Choi
- Abstract summary: We propose an approach for deriving a desirable microgrid operation policy using the proposed novel credit assignment technique, delayed-Q update.
The technique employs novel features such as the ability to tackle and resolve the delayed effective property of the microgrid.
It supports the search for a near-optimal operation policy under a sophisticatedly controlled microgrid environment.
- Score: 3.3754780158324564
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A microgrid is an innovative system that integrates distributed energy
resources to supply electricity demand within electrical boundaries. This study
proposes an approach for deriving a desirable microgrid operation policy that
enables sophisticated controls in the microgrid system using the proposed novel
credit assignment technique, delayed-Q update. The technique employs novel
features such as the ability to tackle and resolve the delayed effective
property of the microgrid, which prevents learning agents from deriving a
well-fitted policy under sophisticated controls. The proposed technique tracks
the history of the charging period and retroactively assigns an adjusted value
to the ESS charging control. The operation policy derived using the proposed
approach is well-fitted for the real effects of ESS operation because of the
process of the technique. Therefore, it supports the search for a near-optimal
operation policy under a sophisticatedly controlled microgrid environment. To
validate our technique, we simulate the operation policy under a real-world
grid-connected microgrid system and demonstrate the convergence to a
near-optimal policy by comparing the performance measures of our policy with
those of a benchmark policy and an optimal policy.
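The abstract does not give the update rule itself, so the following is only a minimal sketch of the general idea of delayed credit assignment, not the authors' algorithm. It assumes a tabular Q-learner, a hypothetical three-action set (charge, idle, discharge), and a hypothetical adjusted reward for charging equal to the discharge price minus the charge price. The class name `DelayedCreditQ` and all parameters are illustrative.

```python
from collections import defaultdict, deque

# Hypothetical action set for an ESS controller (not taken from the paper).
CHARGE, IDLE, DISCHARGE = "charge", "idle", "discharge"
ACTIONS = (CHARGE, IDLE, DISCHARGE)


class DelayedCreditQ:
    """Tabular Q-learning that defers the update for charging actions.

    Illustrative assumption: the value of charging is only known once the
    stored energy is discharged, so charge events wait in a history queue.
    """

    def __init__(self, alpha=0.1, gamma=0.95):
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor
        self.q = defaultdict(float)   # Q[(state, action)], defaults to 0.0
        self.pending = deque()        # charging history awaiting credit

    def _td_update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

    def observe(self, state, action, price, next_state, reward=0.0):
        if action == CHARGE:
            # Do not update yet: record the charge and wait for its effect.
            self.pending.append((state, price, next_state))
            return
        if action == DISCHARGE and self.pending:
            # Retroactively credit the oldest recorded charge with an
            # adjusted reward (assumed here: price spread between the two).
            c_state, c_price, c_next = self.pending.popleft()
            self._td_update(c_state, CHARGE, price - c_price, c_next)
        # Idle and discharge actions are updated immediately.
        self._td_update(state, action, reward, next_state)


agent = DelayedCreditQ(alpha=0.5, gamma=0.0)
agent.observe("night", CHARGE, 10.0, "morning")    # deferred, Q unchanged
agent.observe("peak", DISCHARGE, 30.0, "evening")  # credits the earlier charge
print(agent.q[("night", CHARGE)])
```

In this sketch the charging action receives no update when it is taken; its Q-value changes only when a later discharge reveals what the stored energy was worth. How the paper actually adjusts the value and tracks the charging period may differ.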
Related papers
- A novel ANROA based control approach for grid-tied multi-functional
solar energy conversion system [0.0]
An adaptive control approach for a three-phase grid-interfaced solar photovoltaic system is proposed and discussed.
This method incorporates an Adaptive Neuro-fuzzy Inference System (ANFIS) with a Rain Optimization Algorithm (ROA).
Avoiding power quality problems including voltage fluctuations, harmonics, and flickers as well as unbalanced loads and reactive power usage is the major goal.
arXiv Detail & Related papers (2024-01-26T09:12:39Z)
- Towards Optimal Pricing of Demand Response -- A Nonparametric
Constrained Policy Optimization Approach [2.345728642535161]
Demand response (DR) has been demonstrated to be an effective method for reducing peak load and mitigating uncertainties on the supply and demand sides of the electricity market.
One critical question for DR research is how to appropriately adjust electricity prices in order to shift electrical load from peak to off-peak hours.
We propose an innovative nonparametric constrained policy optimization approach that improves optimality while ensuring stability of the policy update.
arXiv Detail & Related papers (2023-06-24T20:07:51Z)
- Policy Search for Model Predictive Control with Application to Agile
Drone Flight [56.24908013905407]
We propose a policy-search-for-model-predictive-control framework for MPC.
Specifically, we formulate the MPC as a parameterized controller, where the hard-to-optimize decision variables are represented as high-level policies.
Experiments show that our controller achieves robust and real-time control performance in both simulation and the real world.
arXiv Detail & Related papers (2021-12-07T17:39:24Z)
- Action Set Based Policy Optimization for Safe Power Grid Management [8.156111849078439]
Reinforcement learning (RL) has been employed to provide sequential decision-making in power grid management.
We propose a novel method for this problem, which builds on top of the search-based planning algorithm.
In NeurIPS 2020 Learning to Run Power Network (L2RPN) competition, our solution safely managed the power grid and ranked first in both tracks.
arXiv Detail & Related papers (2021-06-29T09:36:36Z)
- Enforcing Policy Feasibility Constraints through Differentiable
Projection for Energy Optimization [57.88118988775461]
We propose PROjected Feasibility (PROF) to enforce convex operational constraints within neural policies.
We demonstrate PROF on two applications: energy-efficient building operation and inverter control.
arXiv Detail & Related papers (2021-05-19T01:58:10Z)
- MPC-based Reinforcement Learning for Economic Problems with Application
to Battery Storage [0.0]
We focus on policy approximations based on Model Predictive Control (MPC).
We observe that the policy gradient method can struggle to produce meaningful steps in the policy parameters when the policy has a (nearly) bang-bang structure.
We propose a homotopy strategy based on the interior-point method, providing a relaxation of the policy during the learning.
arXiv Detail & Related papers (2021-04-06T10:37:14Z)
- Non-stationary Online Learning with Memory and Non-stochastic Control [71.14503310914799]
We study the problem of Online Convex Optimization (OCO) with memory, which allows loss functions to depend on past decisions.
In this paper, we introduce dynamic policy regret as the performance measure to design algorithms robust to non-stationary environments.
We propose a novel algorithm for OCO with memory that provably enjoys an optimal dynamic policy regret in terms of time horizon, non-stationarity measure, and memory length.
arXiv Detail & Related papers (2021-02-07T09:45:15Z)
- Demand Responsive Dynamic Pricing Framework for Prosumer Dominated
Microgrids using Multiagent Reinforcement Learning [59.28219519916883]
This paper proposes a new multiagent Reinforcement Learning based decision-making environment for implementing a Real-Time Pricing (RTP) DR technique in a prosumer dominated microgrid.
The proposed technique addresses several shortcomings common to traditional DR methods and provides significant economic benefits to the grid operator and prosumers.
arXiv Detail & Related papers (2020-09-23T01:44:57Z)
- Off-policy Learning for Remote Electrical Tilt Optimization [68.8204255655161]
We address the problem of Remote Electrical Tilt (RET) optimization using off-policy Contextual Multi-Armed-Bandit (CMAB) techniques.
We propose CMAB learning algorithms to extract optimal tilt update policies from the data.
Our policies show consistent improvements over the rule-based logging policy used to collect the data.
arXiv Detail & Related papers (2020-05-21T11:30:31Z)
- Learning Constrained Adaptive Differentiable Predictive Control Policies
With Guarantees [1.1086440815804224]
We present differentiable predictive control (DPC), a method for learning constrained neural control policies for linear systems.
We employ automatic differentiation to obtain direct policy gradients by backpropagating the model predictive control (MPC) loss function and constraints penalties through a differentiable closed-loop system dynamics model.
arXiv Detail & Related papers (2020-04-23T14:24:44Z)
- Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot
Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained proximal policy optimization (CPPO).
We show that guided constrained RL offers faster convergence close to the desired optimum resulting in an optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.