Optimized cost function for demand response coordination of multiple EV
charging stations using reinforcement learning
- URL: http://arxiv.org/abs/2203.01654v1
- Date: Thu, 3 Mar 2022 11:22:27 GMT
- Title: Optimized cost function for demand response coordination of multiple EV
charging stations using reinforcement learning
- Authors: Manu Lahariya, Nasrin Sadeghianpourhamami and Chris Develder
- Abstract summary: We build on previous research on RL, based on a Markov decision process (MDP) to simultaneously coordinate multiple charging stations.
We propose an improved cost function that essentially forces the learned control policy to always fulfill any charging demand that does not offer flexibility.
We rigorously compare the newly proposed batch RL fitted Q-iteration implementation with the original (costly) one, using real-world data.
- Score: 6.37470346908743
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Electric vehicle (EV) charging stations represent a substantial load with
significant flexibility. The exploitation of that flexibility in demand
response (DR) algorithms becomes increasingly important to manage and balance
demand and supply in power grids. Model-free DR based on reinforcement learning
(RL) is an attractive approach to balance such EV charging load. We build on
previous research on RL, based on a Markov decision process (MDP) to
simultaneously coordinate multiple charging stations. However, we note that the
computationally expensive cost function adopted in the previous research leads
to large training times, which limits the feasibility and practicality of the
approach. We, therefore, propose an improved cost function that essentially
forces the learned control policy to always fulfill any charging demand that
does not offer any flexibility. We rigorously compare the newly proposed batch
RL fitted Q-iteration implementation with the original (costly) one, using
real-world data. Specifically, for the case of load flattening, we compare the
two approaches in terms of (i) the processing time to learn the RL-based
charging policy, as well as (ii) the overall performance of the policy
decisions in terms of meeting the target load for unseen test data. The
performance is analyzed for different training periods and varying training
sample sizes. In addition to both RL policies' performance results, we provide
performance bounds in terms of both (i) an optimal all-knowing strategy, and
(ii) a simple heuristic that spreads individual EV charging uniformly over time.
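The abstract does not give the exact functional form of the improved cost, so the following is only a minimal sketch of the idea for the load-flattening case: a quadratic flattening term plus a large penalty on any inflexible charging demand left unserved, together with the uniform-spreading heuristic used as a performance bound. The function names, the quadratic form, and the penalty weight are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only; the exact cost used in the paper is not given in the abstract.

def load_flattening_cost(total_power, target_load, unmet_inflexible_demand,
                         penalty=1e6):
    """Cost of one control step when coordinating multiple charging stations.

    total_power:             aggregate charging power chosen by the policy (kW)
    target_load:             desired (flattened) aggregate load (kW)
    unmet_inflexible_demand: energy that had to be delivered this step but was
                             not (kWh); the large penalty effectively forces
                             the learned policy to always serve demand that
                             offers no flexibility.
    """
    flattening_term = (total_power - target_load) ** 2
    return flattening_term + penalty * unmet_inflexible_demand


def uniform_spreading_baseline(energy_needed, arrival, departure):
    """Heuristic bound: spread each EV's required energy uniformly over its
    connection time, giving a constant charging power in kW."""
    connected_hours = max(departure - arrival, 1e-9)
    return energy_needed / connected_hours
```

Under this sketch, any policy that postpones non-flexible demand incurs a prohibitive cost, which mirrors the behaviour the abstract describes for the improved cost function.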
Related papers
- Cost-Sensitive Multi-Fidelity Bayesian Optimization with Transfer of Learning Curve Extrapolation [55.75188191403343]
We introduce a utility function, predefined by each user, that describes the trade-off between the cost and performance of BO.
We validate our algorithm on various LC datasets and find that it outperforms all previous multi-fidelity BO and transfer-BO baselines we consider.
arXiv Detail & Related papers (2024-05-28T07:38:39Z) - Stochastic Q-learning for Large Discrete Action Spaces [79.1700188160944]
In complex environments with discrete action spaces, effective decision-making is critical in reinforcement learning (RL).
We present value-based RL approaches which, as opposed to optimizing over the entire set of $n$ actions, only consider a variable set of actions, possibly as small as $\mathcal{O}(\log(n))$.
The presented value-based RL methods include, among others, Q-learning, StochDQN, StochDDQN, all of which integrate this approach for both value-function updates and action selection.
arXiv Detail & Related papers (2024-05-16T17:58:44Z) - Learning and Optimization for Price-based Demand Response of Electric Vehicle Charging [0.9124662097191375]
We propose a new decision-focused end-to-end framework for PBDR modeling.
We evaluate the effectiveness of our method on a simulation of charging station operation with synthetic PBDR patterns of EV customers.
arXiv Detail & Related papers (2024-04-16T06:39:30Z) - Train Once, Get a Family: State-Adaptive Balances for Offline-to-Online
Reinforcement Learning [71.02384943570372]
Family Offline-to-Online RL (FamO2O) is a framework that empowers existing algorithms to determine state-adaptive improvement-constraint balances.
FamO2O offers a statistically significant improvement over various existing methods, achieving state-of-the-art performance on the D4RL benchmark.
arXiv Detail & Related papers (2023-10-27T08:30:54Z) - Hybrid Reinforcement Learning for Optimizing Pump Sustainability in
Real-World Water Distribution Networks [55.591662978280894]
This article addresses the pump-scheduling optimization problem to enhance real-time control of real-world water distribution networks (WDNs).
Our primary objectives are to adhere to physical operational constraints while reducing energy consumption and operational costs.
Traditional optimization techniques, such as evolution-based and genetic algorithms, often fall short due to their lack of convergence guarantees.
arXiv Detail & Related papers (2023-10-13T21:26:16Z) - Combined Peak Reduction and Self-Consumption Using Proximal Policy
Optimization [0.2867517731896504]
Residential demand response programs aim to activate demand flexibility at the household level.
New RL algorithms, such as proximal policy optimisation (PPO), have tried to increase data efficiency.
We show that our adapted version of PPO, combined with transfer learning, reduces cost by 14.51% compared to a regular controller.
arXiv Detail & Related papers (2022-11-27T13:53:52Z) - Computationally efficient joint coordination of multiple electric
vehicle charging points using reinforcement learning [6.37470346908743]
A major challenge in today's power grid is to manage the increasing load from electric vehicle (EV) charging.
We propose a single-step solution that jointly coordinates multiple charging points at once.
We show that our new RL solutions still improve the performance of charging demand coordination by 40-50% compared to a business-as-usual policy.
arXiv Detail & Related papers (2022-03-26T13:42:57Z) - An Experimental Design Perspective on Model-Based Reinforcement Learning [73.37942845983417]
In practical applications of RL, it is expensive to observe state transitions from the environment.
We propose an acquisition function that quantifies how much information a state-action pair would provide about the optimal solution to a Markov decision process.
arXiv Detail & Related papers (2021-12-09T23:13:57Z) - On Effective Scheduling of Model-based Reinforcement Learning [53.027698625496015]
In this paper, we first theoretically analyze the role of real data in policy training, which suggests that gradually increasing the ratio of real data yields better performance.
We then propose a framework named AutoMBPO to automatically schedule the real data ratio.
arXiv Detail & Related papers (2021-11-16T15:24:59Z) - Learning to Operate an Electric Vehicle Charging Station Considering
Vehicle-grid Integration [4.855689194518905]
We propose a novel centralized allocation and decentralized execution (CADE) reinforcement learning (RL) framework to maximize the charging station's profit.
In the centralized allocation process, EVs are allocated to either the waiting or charging spots. In the decentralized execution process, each charger makes its own charging/discharging decision while learning the action-value functions from a shared replay memory.
Numerical results show that the proposed CADE framework is both computationally efficient and scalable, and significantly outperforms the baseline model predictive control (MPC).
arXiv Detail & Related papers (2021-11-01T23:10:28Z) - Efficient Representation for Electric Vehicle Charging Station
Operations using Reinforcement Learning [5.815007821143811]
We develop aggregation schemes that are based on the urgency of EV charging, namely the laxity value.
A least-laxity-first (LLF) rule is adopted to consider only the total charging power of the EV charging station (EVCS); a minimal sketch of this rule appears after this list.
In addition, we propose an equivalent state aggregation that is guaranteed to attain the same optimal policy.
arXiv Detail & Related papers (2021-08-07T00:34:48Z)
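As a companion to the least-laxity-first (LLF) rule mentioned in the last entry above, here is a minimal sketch of how laxity-based dispatch is commonly implemented. The EV fields, the time-step length, and the shared power budget are illustrative assumptions, not code from the cited paper.

```python
from dataclasses import dataclass

@dataclass
class EV:
    remaining_energy: float   # energy still to be delivered (kWh)
    time_to_departure: float  # hours until the EV leaves
    max_power: float          # charger power limit (kW)

def laxity(ev: EV) -> float:
    """Slack time: how long charging can still be postponed.
    Zero (or negative) laxity means the EV must charge at full power now."""
    return ev.time_to_departure - ev.remaining_energy / ev.max_power

def llf_dispatch(evs, station_power_budget: float, dt: float = 0.25):
    """Least-laxity-first dispatch of a shared station power budget
    over one time step of length dt hours (illustrative sketch)."""
    schedule = {}
    remaining_budget = station_power_budget
    # Serve the most urgent (lowest-laxity) vehicles first.
    for i, ev in sorted(enumerate(evs), key=lambda kv: laxity(kv[1])):
        power = min(ev.max_power, ev.remaining_energy / dt, remaining_budget)
        schedule[i] = max(power, 0.0)
        remaining_budget -= schedule[i]
    return schedule  # charger index -> power (kW) for this step
```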