Distributional Reinforcement Learning-based Energy Arbitrage Strategies
in Imbalance Settlement Mechanism
- URL: http://arxiv.org/abs/2401.00015v1
- Date: Sat, 23 Dec 2023 15:38:31 GMT
- Title: Distributional Reinforcement Learning-based Energy Arbitrage Strategies
in Imbalance Settlement Mechanism
- Authors: Seyed Soroush Karimi Madahi, Bert Claessens, Chris Develder
- Abstract summary: Growth in the penetration of renewable energy sources makes supply more uncertain and leads to an increase in the system imbalance.
We propose a battery control framework based on distributional reinforcement learning (DRL).
Our proposed control framework takes a risk-sensitive perspective, allowing BRPs to adjust their risk preferences.
- Score: 6.520803851931361
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Growth in the penetration of renewable energy sources makes supply more
uncertain and leads to an increase in the system imbalance. This trend,
together with the single imbalance pricing, opens an opportunity for balance
responsible parties (BRPs) to perform energy arbitrage in the imbalance
settlement mechanism. To this end, we propose a battery control framework based
on distributional reinforcement learning (DRL). Our proposed control framework
takes a risk-sensitive perspective, allowing BRPs to adjust their risk
preferences: we aim to optimize a weighted sum of the arbitrage profit and a
risk measure while constraining the daily number of cycles for the battery. We
assess the performance of our proposed control framework using the Belgian
imbalance prices of 2022 and compare two state-of-the-art RL methods, deep Q
learning and soft actor-critic. Results reveal that the distributional soft
actor-critic method can outperform other methods. Moreover, we note that our
fully risk-averse agent appropriately learns to hedge against the risk related
to the unknown imbalance price by (dis)charging the battery only when the agent
is more certain about the price.
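The risk-sensitive objective described in the abstract (a weighted sum of arbitrage profit and a risk measure) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the risk measure, so this sketch assumes conditional value-at-risk (CVaR) computed from equally weighted quantile estimates of the return, as a distributional critic might provide. The names `beta`, `alpha`, `risk_sensitive_scores`, `select_action`, and the example numbers are all hypothetical, and the daily cycle constraint is left out.

```python
import numpy as np


def cvar_from_quantiles(quantiles, alpha):
    """Mean of the worst alpha-fraction of equally weighted return quantiles."""
    q = np.sort(np.asarray(quantiles, dtype=float), axis=-1)
    # Keep at least one quantile so the tail is never empty.
    k = max(1, int(np.ceil(alpha * q.shape[-1])))
    return q[..., :k].mean(axis=-1)


def risk_sensitive_scores(quantiles, beta, alpha=0.25):
    """Weighted sum of expected return and CVaR for each action.

    beta = 0 is risk-neutral (expected profit only);
    beta = 1 is fully risk-averse (CVaR only). Assumed weighting scheme.
    """
    q = np.asarray(quantiles, dtype=float)
    expected = q.mean(axis=-1)
    risk = cvar_from_quantiles(q, alpha)
    return (1.0 - beta) * expected + beta * risk


def select_action(quantiles, beta, alpha=0.25):
    """Index of the action with the highest risk-sensitive score."""
    return int(np.argmax(risk_sensitive_scores(quantiles, beta, alpha)))


# Hypothetical return quantiles for three battery actions.
example_quantiles = np.array([
    [-40.0, -10.0, 30.0, 80.0],  # action 0: discharge (high mean, wide spread)
    [0.0, 0.0, 0.0, 0.0],        # action 1: idle (no profit, no risk)
    [-60.0, -20.0, 5.0, 15.0],   # action 2: charge
])

risk_neutral_choice = select_action(example_quantiles, beta=0.0)
risk_averse_choice = select_action(example_quantiles, beta=1.0)
```

With these made-up numbers, the risk-neutral setting picks the discharge action because of its higher mean, while the fully risk-averse setting stays idle because the price outcome is too uncertain. This mirrors the hedging behaviour the abstract reports for the fully risk-averse agent.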
Related papers
- Predicting and Publishing Accurate Imbalance Prices Using Monte Carlo Tree Search [4.950434218152639]
We propose a Monte Carlo Tree Search method that publishes accurate imbalance prices while accounting for potential response actions.
Our approach models the system dynamics using a neural network forecaster and a cluster of virtual batteries controlled by reinforcement learning agents.
arXiv Detail & Related papers (2024-11-06T15:49:28Z)
- Control Policy Correction Framework for Reinforcement Learning-based Energy Arbitrage Strategies [4.950434218152639]
We propose a new RL-based control framework for batteries to obtain a safe energy arbitrage strategy in the imbalance settlement mechanism.
We use the Belgian imbalance price of 2023 to evaluate the performance of our proposed framework.
arXiv Detail & Related papers (2024-04-29T16:03:21Z)
- Probabilistic forecasting of power system imbalance using neural network-based ensembles [4.573008040057806]
We propose an ensemble of C-VSNs, which are our adaptation of variable selection networks (VSNs).
Each minute, our model predicts the imbalance of the current and upcoming two quarter-hours, along with uncertainty estimations on these forecasts.
For high imbalance magnitude situations, our model outperforms the state-of-the-art by 23.4%.
arXiv Detail & Related papers (2024-04-23T08:42:35Z)
- Risk-Sensitive RL with Optimized Certainty Equivalents via Reduction to Standard RL [48.1726560631463]
We study Risk-Sensitive Reinforcement Learning with the Optimized Certainty Equivalent (OCE) risk.
We propose two general meta-algorithms via reductions to standard RL.
We show that it learns the optimal risk-sensitive policy while prior algorithms provably fail.
arXiv Detail & Related papers (2024-03-10T21:45:12Z)
- Deep Reinforcement Learning for Community Battery Scheduling under Uncertainties of Load, PV Generation, and Energy Prices [5.694872363688119]
This paper presents a deep reinforcement learning (RL) strategy to schedule a community battery system in the presence of uncertainties.
We position the community battery to play a versatile role, in integrating local PV energy, reducing peak load, and exploiting energy price fluctuations for arbitrage.
arXiv Detail & Related papers (2023-12-04T13:45:17Z)
- Risk-Controlling Model Selection via Guided Bayesian Optimization [35.53469358591976]
We find a configuration that adheres to user-specified limits on certain risks while being useful with respect to other conflicting metrics.
Our method identifies a set of optimal configurations residing in a designated region of interest.
We demonstrate the effectiveness of our approach on a range of tasks with multiple desiderata, including low error rates, equitable predictions, handling spurious correlations, managing rate and distortion in generative models, and reducing computational costs.
arXiv Detail & Related papers (2023-12-04T07:29:44Z)
- Safe Deployment for Counterfactual Learning to Rank with Exposure-Based Risk Minimization [63.93275508300137]
We introduce a novel risk-aware Counterfactual Learning To Rank method with theoretical guarantees for safe deployment.
Our experimental results demonstrate the efficacy of our proposed method, which is effective at avoiding initial periods of bad performance when little data is available.
arXiv Detail & Related papers (2023-04-26T15:54:23Z)
- Efficient Risk-Averse Reinforcement Learning [79.61412643761034]
In risk-averse reinforcement learning (RL), the goal is to optimize some risk measure of the returns.
We prove that under certain conditions this inevitably leads to a local-optimum barrier, and propose a soft risk mechanism to bypass it.
We demonstrate improved risk aversion in maze navigation, autonomous driving, and resource allocation benchmarks.
arXiv Detail & Related papers (2022-05-10T19:40:52Z)
- Monotonic Improvement Guarantees under Non-stationarity for Decentralized PPO [66.5384483339413]
We present a new monotonic improvement guarantee for optimizing decentralized policies in cooperative Multi-Agent Reinforcement Learning (MARL).
We show that a trust region constraint can be effectively enforced in a principled way by bounding independent ratios based on the number of agents in training.
arXiv Detail & Related papers (2022-01-31T20:39:48Z)
- Off-policy Reinforcement Learning with Optimistic Exploration and Distribution Correction [73.77593805292194]
We train a separate exploration policy to maximize an approximate upper confidence bound of the critics in an off-policy actor-critic framework.
To mitigate the off-policy-ness, we adapt the recently introduced DICE framework to learn a distribution correction ratio for off-policy actor-critic training.
arXiv Detail & Related papers (2021-10-22T22:07:51Z)
- Adaptive Control and Regret Minimization in Linear Quadratic Gaussian (LQG) Setting [91.43582419264763]
We propose LqgOpt, a novel reinforcement learning algorithm based on the principle of optimism in the face of uncertainty.
LqgOpt efficiently explores the system dynamics, estimates the model parameters up to their confidence interval, and deploys the controller of the most optimistic model.
arXiv Detail & Related papers (2020-03-12T19:56:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.