Enforcing Policy Feasibility Constraints through Differentiable
Projection for Energy Optimization
- URL: http://arxiv.org/abs/2105.08881v1
- Date: Wed, 19 May 2021 01:58:10 GMT
- Title: Enforcing Policy Feasibility Constraints through Differentiable
Projection for Energy Optimization
- Authors: Bingqing Chen, Priya Donti, Kyri Baker, J. Zico Kolter, Mario Berges
- Abstract summary: We propose PROjected Feasibility (PROF) to enforce convex operational constraints within neural policies.
We demonstrate PROF on two applications: energy-efficient building operation and inverter control.
- Score: 57.88118988775461
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While reinforcement learning (RL) is gaining popularity in energy systems
control, its real-world applications are limited due to the fact that the
actions from learned policies may not satisfy functional requirements or be
feasible for the underlying physical system. In this work, we propose PROjected
Feasibility (PROF), a method to enforce convex operational constraints within
neural policies. Specifically, we incorporate a differentiable projection layer
within a neural network-based policy to enforce that all learned actions are
feasible. We then update the policy end-to-end by propagating gradients through
this differentiable projection layer, making the policy cognizant of the
operational constraints. We demonstrate our method on two applications:
energy-efficient building operation and inverter control. In the building
operation setting, we show that PROF maintains thermal comfort requirements
while improving energy efficiency by 4% over state-of-the-art methods. In the
inverter control setting, PROF perfectly satisfies voltage constraints on the
IEEE 37-bus feeder system, as it learns to curtail as little renewable energy
as possible within its safety set.
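The projection mechanism is compact enough to sketch. The following is a minimal, hypothetical illustration rather than the authors' implementation: a neural policy whose raw output is projected onto a convex feasible set {x : Gx <= h} by a differentiable quadratic program, built here with cvxpylayers, so that training gradients flow through the projection. The network sizes and the constraint matrices G and h are placeholder assumptions.

```python
# Minimal sketch of a policy with a differentiable projection layer,
# in the spirit of PROF. G, h, and all dimensions are hypothetical.
import cvxpy as cp
import numpy as np
import torch
import torch.nn as nn
from cvxpylayers.torch import CvxpyLayer

n_obs, n_act = 8, 4

# Feasible set {x : Gx <= h}: a box standing in for real operational limits.
G = np.vstack([np.eye(n_act), -np.eye(n_act)])
h = np.ones(2 * n_act)

# Projection QP: argmin_x ||x - a||^2  s.t.  Gx <= h.
a = cp.Parameter(n_act)  # raw (possibly infeasible) action from the network
x = cp.Variable(n_act)   # projected, feasible action
problem = cp.Problem(cp.Minimize(cp.sum_squares(x - a)), [G @ x <= h])
project = CvxpyLayer(problem, parameters=[a], variables=[x])

class ProjectedPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_obs, 64), nn.ReLU(), nn.Linear(64, n_act))

    def forward(self, obs):
        raw = self.net(obs)
        feasible, = project(raw)  # differentiable projection onto {Gx <= h}
        return feasible

policy = ProjectedPolicy()
action = policy(torch.randn(n_obs))  # satisfies G @ action <= h by construction
action.sum().backward()              # gradients pass through the projection
```

Because the projection is the final layer, every emitted action is feasible by construction, while backpropagation through the layer keeps the upstream network aware of the constraint boundary.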
Related papers
- A Safe Reinforcement Learning Algorithm for Supervisory Control of Power
Plants [7.1771300511732585]
Model-free reinforcement learning (RL) has emerged as a promising solution for control tasks.
We propose a chance-constrained RL algorithm based on Proximal Policy Optimization for supervisory control.
Our approach achieves the smallest violation distance and violation rate in a load-follow maneuver for an advanced nuclear power plant design.
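To make "chance-constrained" concrete, the generic formulation (not necessarily the paper's exact objective) maximizes expected return subject to a bound on the probability of violating the plant's operating limits:

\[
\max_{\theta}\; \mathbb{E}_{\pi_\theta}\Big[\sum_t \gamma^t r_t\Big]
\quad \text{s.t.} \quad
\Pr\big(g(s_t, a_t) > 0\big) \le \epsilon,
\]

where g encodes the operational constraints and \epsilon is the tolerated violation probability.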
arXiv Detail & Related papers (2024-01-23T17:52:49Z)
- An adaptive safety layer with hard constraints for safe reinforcement learning in multi-energy management systems [0.0]
Safe reinforcement learning with hard constraint guarantees is a promising optimal control direction for multi-energy management systems.
We present two novel advancements: (I) combining the OptLayer and SafeFallback methods, named OptLayerPolicy, to increase the initial utility, and (II) introducing GreyOptLayerPolicy to increase the utility after training.
In a simulated multi-energy system case study, the initial utility increases to 92.4% (OptLayerPolicy) from 86.1% (OptLayer), and the post-training utility increases to 104.9% (GreyOptLayerPolicy) from 103.4% (OptLayer).
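As a rough, hypothetical sketch of the combined idea (the actual OptLayerPolicy formulation is more involved): project the agent's action onto the hard constraints where possible, and fall back to a known-safe action otherwise. All bounds and checks below are invented for illustration.

```python
# Conceptual safety layer with a fallback, loosely in the spirit of
# OptLayerPolicy. All names, bounds, and checks are hypothetical.
import numpy as np

A_LOW, A_HIGH = np.zeros(3), np.ones(3)  # hypothetical actuator bounds
P_MAX = 2.0                              # hypothetical total power budget
SAFE_FALLBACK = np.zeros(3)              # known-safe default action

def is_feasible(a):
    """Hard constraints: box bounds plus a total power budget."""
    return np.all(a >= A_LOW) and np.all(a <= A_HIGH) and a.sum() <= P_MAX + 1e-9

def safety_layer(raw_action):
    # "OptLayer" role: project onto the box, then rescale into the budget.
    projected = np.clip(raw_action, A_LOW, A_HIGH)
    if projected.sum() > P_MAX:
        projected *= P_MAX / projected.sum()
    # "SafeFallback" role: use the default action if projection still fails.
    return projected if is_feasible(projected) else SAFE_FALLBACK

print(safety_layer(np.array([1.5, 0.9, 0.8])))  # rescaled into the budget
```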
arXiv Detail & Related papers (2023-04-18T10:52:16Z)
- Diverse Policy Optimization for Structured Action Space [59.361076277997704]
We propose Diverse Policy Optimization (DPO) to model policies in structured action spaces as energy-based models (EBMs).
A novel and powerful generative model, GFlowNet, is introduced as an efficient, diverse EBM-based policy sampler.
Experiments on ATSC and Battle benchmarks demonstrate that DPO can efficiently discover surprisingly diverse policies.
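"Policies as energy-based models" can be read as the standard EBM parameterization (a general form, not a claim about DPO's exact loss):

\[
\pi_\theta(a \mid s) = \frac{\exp\big(-E_\theta(s, a)\big)}{\sum_{a' \in \mathcal{A}(s)} \exp\big(-E_\theta(s, a')\big)},
\]

where the normalizing sum over a structured action space \mathcal{A}(s) is intractable to evaluate directly, which is why a sampler such as GFlowNet, drawing actions with probability proportional to the unnormalized density, is attractive.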
arXiv Detail & Related papers (2023-02-23T10:48:09Z)
- Data-Driven Stochastic AC-OPF using Gaussian Processes [54.94701604030199]
Integrating a significant amount of renewables into a power grid is probably the most effective way to reduce carbon emissions from power grids and slow down climate change.
This paper presents an alternative data-driven approach based on the AC power flow equations that can incorporate uncertain inputs.
The Gaussian process (GP) approach learns a simple yet unconstrained data-driven model that closes the gap to the AC power flow equations.
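As a hedged illustration of the data-driven ingredient (not the paper's model or data), a Gaussian process can be fit to map controllable injections to a monitored quantity such as a bus voltage magnitude, yielding both a prediction and an uncertainty estimate. Everything below is synthetic.

```python
# Hypothetical sketch: fit a GP from power injections to a bus voltage
# magnitude on synthetic data; not the paper's actual model or dataset.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 3))  # injections at 3 buses (p.u.)
y = 1.0 + 0.05 * X.sum(axis=1) + rng.normal(0.0, 1e-3, 200)  # toy voltage (p.u.)

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X, y)

# Predictive mean and standard deviation, usable inside a
# chance-constrained OPF formulation.
v_mean, v_std = gp.predict(rng.uniform(0.0, 1.0, (5, 3)), return_std=True)
print(v_mean, v_std)
```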
arXiv Detail & Related papers (2022-07-21T23:02:35Z)
- Adversarially Robust Learning for Security-Constrained Optimal Power Flow [55.816266355623085]
We tackle the problem of N-k security-constrained optimal power flow (SCOPF).
N-k SCOPF is a core problem for the operation of electrical grids.
Inspired by methods in adversarially robust training, we frame N-k SCOPF as a minimax optimization problem.
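Schematically (with symbols of my own choosing rather than the paper's), the minimax framing reads:

\[
\min_{x \in \mathcal{X}} \; \max_{c \in \mathcal{C}_k} \; \phi(x, c),
\]

where x is the base-case dispatch, \mathcal{C}_k is the set of contingencies with up to k simultaneous component outages, and \phi(x, c) measures the constraint violation under contingency c; adversarially robust training alternates updates of x with searches for the worst-case c.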
arXiv Detail & Related papers (2021-11-12T22:08:10Z)
- Action Set Based Policy Optimization for Safe Power Grid Management [8.156111849078439]
Reinforcement learning (RL) has been employed to provide sequential decision-making in power grid management.
We propose a novel method for this problem, which builds on top of a search-based planning algorithm.
In NeurIPS 2020 Learning to Run Power Network (L2RPN) competition, our solution safely managed the power grid and ranked first in both tracks.
arXiv Detail & Related papers (2021-06-29T09:36:36Z)
- Safe RAN control: A Symbolic Reinforcement Learning Approach [62.997667081978825]
We present a Symbolic Reinforcement Learning (SRL) based architecture for safety control of Radio Access Network (RAN) applications.
We provide a purely automated procedure in which a user can specify high-level logical safety specifications for a given cellular network topology.
We introduce a user interface (UI) developed to help a user set intent specifications for the system and inspect the difference in agent-proposed actions.
arXiv Detail & Related papers (2021-06-03T16:45:40Z)
- Delayed Q-update: A novel credit assignment technique for deriving an optimal operation policy for the Grid-Connected Microgrid [3.3754780158324564]
We propose an approach for deriving a desirable microgrid operation policy using the proposed novel credit assignment technique, delayed Q-update.
The technique employs novel features such as the ability to handle the delayed-effect property of the microgrid.
It supports the search for a near-optimal operation policy in a sophisticated, controlled microgrid environment.
arXiv Detail & Related papers (2020-06-30T10:30:15Z)
- Deep Constrained Q-learning [15.582910645906145]
In many real-world applications, reinforcement learning agents have to optimize multiple objectives while following certain rules or satisfying a set of constraints.
We propose Constrained Q-learning, a novel off-policy reinforcement learning framework restricting the action space directly in the Q-update to learn the optimal Q-function for the induced constrained MDP and the corresponding safe policy.
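The mechanical difference from vanilla Q-learning fits in a few lines: the max in the bootstrap target ranges only over constraint-satisfying actions. The safe-action oracle below is a hypothetical stand-in.

```python
# Tabular sketch of a constrained Q-update: the bootstrap target maximizes
# only over safe actions. `safe_actions` is a hypothetical constraint oracle.
import numpy as np

n_states, n_actions, alpha, gamma = 10, 4, 0.1, 0.99
Q = np.zeros((n_states, n_actions))

def safe_actions(state):
    """Stand-in constraint check; here, action 3 is unsafe in state 0."""
    return [a for a in range(n_actions) if not (state == 0 and a == 3)]

def constrained_q_update(s, a, r, s_next):
    # Restrict the action space directly in the Q-update.
    target = r + gamma * max(Q[s_next, a2] for a2 in safe_actions(s_next))
    Q[s, a] += alpha * (target - Q[s, a])

constrained_q_update(s=0, a=1, r=1.0, s_next=0)  # never bootstraps on action 3
```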
arXiv Detail & Related papers (2020-03-20T17:26:03Z)
- Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained proximal policy optimization (CPPO).
We show that guided constrained RL offers faster convergence close to the desired optimum, resulting in optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.