A Reinforcement Learning Approach to Parameter Selection for Distributed
Optimization in Power Systems
- URL: http://arxiv.org/abs/2110.11991v1
- Date: Fri, 22 Oct 2021 18:17:32 GMT
- Title: A Reinforcement Learning Approach to Parameter Selection for Distributed
Optimization in Power Systems
- Authors: Sihan Zeng, Alyssa Kody, Youngdae Kim, Kibaek Kim, Daniel K. Molzahn
- Abstract summary: We develop an adaptive penalty parameter selection policy for the AC optimal power flow (ACOPF) problem solved via ADMM.
We show that our RL policy demonstrates promise for generalizability, performing well under unseen loading schemes as well as under unseen losses of lines and generators.
This work thus provides a proof-of-concept for using RL for parameter selection in ADMM for power systems applications.
- Score: 1.1199585259018459
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the increasing penetration of distributed energy resources, distributed
optimization algorithms have attracted significant attention for power systems
applications due to their potential for superior scalability, privacy, and
robustness to a single point-of-failure. The Alternating Direction Method of
Multipliers (ADMM) is a popular distributed optimization algorithm; however,
its convergence performance is highly dependent on the selection of penalty
parameters, which are usually chosen heuristically. In this work, we use
reinforcement learning (RL) to develop an adaptive penalty parameter selection
policy for the AC optimal power flow (ACOPF) problem solved via ADMM with the
goal of minimizing the number of iterations until convergence. We train our RL
policy using deep Q-learning, and show that this policy can result in
significantly accelerated convergence (up to a 59% reduction in the number of
iterations compared to existing, curvature-informed penalty parameter selection
methods). Furthermore, we show that our RL policy demonstrates promise for
generalizability, performing well under unseen loading schemes as well as under
unseen losses of lines and generators (up to a 50% reduction in iterations).
This work thus provides a proof-of-concept for using RL for parameter selection
in ADMM for power systems applications.
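To make the role of the penalty parameter concrete, below is a minimal sketch (not the authors' implementation) of consensus ADMM on a toy least-squares splitting in which the penalty rho is re-selected at every iteration by a pluggable policy. In the paper, that policy is a deep Q-network trained to minimize the number of iterations to convergence for ACOPF; here a hypothetical residual-balancing rule stands in so the loop runs end-to-end. The function names, the toy problem, and the policy are illustrative assumptions, not taken from the paper.

```python
# Sketch only: consensus ADMM with a per-iteration, policy-selected penalty parameter.
import numpy as np

def solve_consensus_admm(A1, b1, A2, b2, policy, rho0=1.0, max_iter=500, tol=1e-6):
    """Minimize 0.5||A1 x - b1||^2 + 0.5||A2 z - b2||^2  subject to  x = z."""
    n = A1.shape[1]
    x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)  # u is the scaled dual variable
    rho = rho0
    for k in range(max_iter):
        # x- and z-updates: ridge-regularized least squares, solved in closed form
        x = np.linalg.solve(A1.T @ A1 + rho * np.eye(n), A1.T @ b1 + rho * (z - u))
        z_old = z
        z = np.linalg.solve(A2.T @ A2 + rho * np.eye(n), A2.T @ b2 + rho * (x + u))
        u = u + x - z                                   # dual (scaled) ascent step
        r_primal = np.linalg.norm(x - z)                # consensus violation
        r_dual = rho * np.linalg.norm(z - z_old)        # dual residual
        if r_primal < tol and r_dual < tol:
            return x, k + 1
        # Adaptive penalty: the policy observes the residuals and returns a new rho.
        rho_new = policy(r_primal, r_dual, rho)
        if rho_new != rho:
            u = u * rho / rho_new                       # rescale the scaled dual variable
            rho = rho_new
    return x, max_iter

def residual_balancing_policy(r_primal, r_dual, rho, mu=10.0, tau=2.0):
    # Hypothetical stand-in for the trained RL policy: the classic residual-balancing
    # heuristic, acting on the same state (primal/dual residuals) an RL agent would observe.
    if r_primal > mu * r_dual:
        return rho * tau
    if r_dual > mu * r_primal:
        return rho / tau
    return rho

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A1, A2 = rng.normal(size=(30, 10)), rng.normal(size=(30, 10))
    b1, b2 = rng.normal(size=30), rng.normal(size=30)
    x, iters = solve_consensus_admm(A1, b1, A2, b2, residual_balancing_policy)
    print(f"converged in {iters} iterations")
```

Replacing residual_balancing_policy with a trained Q-network that maps the residual state to a discrete set of rho adjustments would recover the structure of the approach described in the abstract.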
Related papers
- Reinforcement learning for anisotropic p-adaptation and error estimation in high-order solvers [0.37109226820205005]
We present a novel approach to automate and optimize anisotropic p-adaptation in high-order h/p solvers using Reinforcement Learning (RL).
We develop an offline training approach, decoupled from the main solver, which shows minimal overhead when performing simulations.
We derive an inexpensive RL-based error estimation approach that enables the quantification of local discretization errors.
arXiv Detail & Related papers (2024-07-26T17:55:23Z)
- Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer [52.09480867526656]
We identify the source of misalignment as a form of distributional shift and uncertainty in learning human preferences.
To mitigate overoptimization, we first propose a theoretical algorithm that chooses the best policy for an adversarially chosen reward model.
Using the equivalence between reward models and the corresponding optimal policy, the algorithm features a simple objective that combines a preference optimization loss and a supervised learning loss.
arXiv Detail & Related papers (2024-05-26T05:38:50Z)
- Two-Stage ML-Guided Decision Rules for Sequential Decision Making under Uncertainty [55.06411438416805]
Sequential Decision Making under Uncertainty (SDMU) is ubiquitous in many domains such as energy, finance, and supply chains.
Some SDMU problems are naturally modeled as Multistage Problems (MSPs), but the resulting optimizations are notoriously challenging from a computational standpoint.
This paper introduces a novel approach, Two-Stage General Decision Rules (TS-GDR), to generalize the policy space beyond linear functions.
The effectiveness of TS-GDR is demonstrated through an instantiation using Deep Recurrent Neural Networks, named Two-Stage Deep Decision Rules (TS-DDR).
arXiv Detail & Related papers (2024-05-23T18:19:47Z)
- Assessment of Reinforcement Learning Algorithms for Nuclear Power Plant Fuel Optimization [0.0]
This work presents a first-of-a-kind approach to utilize deep RL to solve the loading pattern problem and could be leveraged for any engineering design optimization.
arXiv Detail & Related papers (2023-05-09T23:51:24Z)
- Offline Policy Optimization in RL with Variance Regularization [142.87345258222942]
We propose variance regularization for offline RL algorithms, using stationary distribution corrections.
We show that by using Fenchel duality, we can avoid double sampling issues for computing the gradient of the variance regularizer.
The proposed algorithm for offline variance regularization (OVAR) can be used to augment any existing offline policy optimization algorithms.
arXiv Detail & Related papers (2022-12-29T18:25:01Z)
- Learning Regionally Decentralized AC Optimal Power Flows with ADMM [16.843799157160063]
This paper studies how machine learning may help in speeding up the convergence of ADMM for solving AC-OPF.
It proposes a novel decentralized machine-learning approach, namely ML-ADMM, where each agent uses deep learning to learn the consensus parameters on the coupling branches.
arXiv Detail & Related papers (2022-05-08T05:30:35Z)
- False Correlation Reduction for Offline Reinforcement Learning [115.11954432080749]
We propose falSe COrrelation REduction (SCORE) for offline RL, a practically effective and theoretically provable algorithm.
We empirically show that SCORE achieves SoTA performance with a 3.1x acceleration on various tasks in a standard benchmark (D4RL).
arXiv Detail & Related papers (2021-10-24T15:34:03Z)
- OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation [59.469401906712555]
We present an offline reinforcement learning algorithm that prevents overestimation in a more principled way.
Our algorithm, OptiDICE, directly estimates the stationary distribution corrections of the optimal policy.
We show that OptiDICE performs competitively with the state-of-the-art methods.
arXiv Detail & Related papers (2021-06-21T00:43:30Z)
- A Reinforcement Learning Formulation of the Lyapunov Optimization: Application to Edge Computing Systems with Queue Stability [12.693545159861857]
A deep reinforcement learning (DRL)-based approach to the Lyapunov optimization is considered to minimize the time-average penalty while maintaining queue stability.
The proposed DRL-based approach is applied to resource allocation in edge computing systems with queue stability, and numerical results demonstrate its successful operation.
arXiv Detail & Related papers (2020-12-14T05:55:26Z)
- Optimization-driven Deep Reinforcement Learning for Robust Beamforming in IRS-assisted Wireless Communications [54.610318402371185]
Intelligent reflecting surface (IRS) is a promising technology to assist downlink information transmissions from a multi-antenna access point (AP) to a receiver.
We minimize the AP's transmit power by a joint optimization of the AP's active beamforming and the IRS's passive beamforming.
We propose a deep reinforcement learning (DRL) approach that can adapt the beamforming strategies from past experiences.
arXiv Detail & Related papers (2020-05-25T01:42:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.