Reducing Action Space: Reference-Model-Assisted Deep Reinforcement
Learning for Inverter-based Volt-Var Control
- URL: http://arxiv.org/abs/2210.07360v1
- Date: Mon, 10 Oct 2022 02:55:16 GMT
- Title: Reducing Action Space: Reference-Model-Assisted Deep Reinforcement
Learning for Inverter-based Volt-Var Control
- Authors: Qiong Liu, Ye Guo, Lirong Deng, Haotian Liu, Dongyu Li, Hongbin Sun
- Abstract summary: Reference-model-assisted deep reinforcement learning (DRL) for inverter-based Volt-Var Control (IB-VVC) in active distribution networks is proposed.
To reduce the action space of DRL, a reference-model-assisted approach is designed that learns residual actions relative to a reference model rather than the optimal actions directly.
The smaller action space reduces the learning difficulty of DRL and improves its optimization performance.
- Score: 15.755809730271327
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reference-model-assisted deep reinforcement learning (DRL) for inverter-based
Volt-Var Control (IB-VVC) in active distribution networks is proposed. We
observe that a large action space increases the learning difficulty of DRL and
degrades its optimization performance, both when generating data and when
training the neural networks. To reduce the action space of DRL, we design a
reference-model-assisted DRL approach. We introduce definitions of the
reference model, reference-model-based optimization, and reference actions. The
reference-model-assisted DRL learns the residual actions between the reference
actions and the optimal actions, rather than learning the optimal actions
directly. Since the residual actions are considerably smaller than the optimal
actions for a reasonable reference model, we can design a smaller action space
for the reference-model-assisted DRL. This reduces the learning difficulty of
DRL and improves the optimization performance of the reference-model-assisted
DRL approach. Notably, the approach is compatible with any policy-gradient DRL
algorithm for continuous action problems. This work takes the soft actor-critic
algorithm as an example and designs a reference-model-assisted soft
actor-critic algorithm. Simulations show that 1) a large action space degrades
the performance of DRL throughout the training stage, and 2)
reference-model-assisted DRL requires fewer iterations and achieves better
optimization performance.
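To make the residual-action idea concrete, the snippet below is a minimal Python sketch under stated assumptions: a simple droop controller stands in for the reference model, the DRL policy outputs only a bounded residual, and the sum of the two is the reactive-power setpoint applied to the inverters. All names, gains, and limits are illustrative rather than taken from the paper.

```python
import numpy as np

# Minimal sketch of the residual-action idea described in the abstract.
# The droop-style reference controller, names, and numeric bounds are
# illustrative assumptions, not the authors' implementation.

Q_MAX = 1.0           # assumed per-unit reactive-power limit of each inverter
RESIDUAL_SCALE = 0.2  # reduced action space: residuals cover only 20% of Q_MAX


def reference_action(voltages, droop_gain=2.0):
    """Hypothetical reference model: a droop controller that injects or
    absorbs reactive power to push bus voltages back toward 1.0 p.u."""
    q_ref = -droop_gain * (voltages - 1.0)
    return np.clip(q_ref, -Q_MAX, Q_MAX)


def apply_control(voltages, residual_policy):
    """Compose the reference action with the learned residual action.

    residual_policy maps the observation to values in [-1, 1] (e.g. a
    tanh-squashed actor); because it only has to cover the residual range
    RESIDUAL_SCALE * Q_MAX, its action space is much smaller than the full
    reactive-power range a plain DRL agent would have to explore.
    """
    q_ref = reference_action(voltages)
    residual = RESIDUAL_SCALE * Q_MAX * residual_policy(voltages)
    return np.clip(q_ref + residual, -Q_MAX, Q_MAX)


if __name__ == "__main__":
    # Stand-in "policy" producing random residuals for three inverter buses.
    rng = np.random.default_rng(0)
    voltages = np.array([1.03, 0.97, 1.05])
    dummy_policy = lambda obs: rng.uniform(-1.0, 1.0, size=obs.shape)
    print(apply_control(voltages, dummy_policy))
```

In a reference-model-assisted soft actor-critic, only residual_policy would be trained; the fixed reference model supplies the bulk of the control action, which is why the learned action space can be kept small.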
Related papers
- The Impact of Quantization and Pruning on Deep Reinforcement Learning Models [1.5252729367921107]
Deep reinforcement learning (DRL) has achieved remarkable success across various domains, such as video games, robotics, and, recently, large language models.
However, the computational costs and memory requirements of DRL models often limit their deployment in resource-constrained environments.
Our study investigates the impact of two prominent compression methods, quantization and pruning, on DRL models (a minimal quantization/pruning sketch follows this list).
arXiv Detail & Related papers (2024-07-05T18:21:17Z)
- Reflect-RL: Two-Player Online RL Fine-Tuning for LMs [38.5495318990769]
We propose Reflect-RL, a system to fine-tune language models (LMs) using online reinforcement learning (RL) and supervised fine-tuning (SFT).
Test results indicate GPT-2 XL 1.56B fine-tuned with Reflect-RL outperforms larger open-source LMs, such as Mistral 7B.
arXiv Detail & Related papers (2024-02-20T01:04:21Z)
- Hybrid Reinforcement Learning for Optimizing Pump Sustainability in Real-World Water Distribution Networks [55.591662978280894]
This article addresses the pump-scheduling optimization problem to enhance real-time control of real-world water distribution networks (WDNs).
Our primary objectives are to adhere to physical operational constraints while reducing energy consumption and operational costs.
Traditional optimization techniques, such as evolution-based and genetic algorithms, often fall short due to their lack of convergence guarantees.
arXiv Detail & Related papers (2023-10-13T21:26:16Z)
- Provable Reward-Agnostic Preference-Based Reinforcement Learning [61.39541986848391]
Preference-based Reinforcement Learning (PbRL) is a paradigm in which an RL agent learns to optimize a task using pair-wise preference-based feedback over trajectories.
We propose a theoretical reward-agnostic PbRL framework that acquires exploratory trajectories enabling accurate learning of the hidden reward function (see the preference-loss sketch after this list).
arXiv Detail & Related papers (2023-05-29T15:00:09Z)
- Reinforcement Learning with Partial Parametric Model Knowledge [3.3598755777055374]
We adapt reinforcement learning methods for continuous control to bridge the gap between complete ignorance and perfect knowledge of the environment.
Our method, Partial Knowledge Least Squares Policy Iteration (PLSPI), takes inspiration from both model-free RL and model-based control.
arXiv Detail & Related papers (2023-04-26T01:04:35Z)
- Learning a model is paramount for sample efficiency in reinforcement learning control of PDEs [5.488334211013093]
We show that learning an actuated model in parallel to training the RL agent significantly reduces the total amount of required data sampled from the real system.
We also show that iteratively updating the model is of major importance to avoid biases in the RL training.
arXiv Detail & Related papers (2023-02-14T16:14:39Z)
- Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective [142.36200080384145]
We propose a single objective that jointly optimizes a latent-space model and policy to achieve high returns while remaining self-consistent.
We demonstrate that the resulting algorithm matches or improves the sample-efficiency of the best prior model-based and model-free RL methods.
arXiv Detail & Related papers (2022-09-18T03:51:58Z)
- Pessimistic Model Selection for Offline Deep Reinforcement Learning [56.282483586473816]
Deep Reinforcement Learning (DRL) has demonstrated great potential in solving sequential decision making problems in many applications.
One main barrier is the over-fitting issue that leads to poor generalizability of the policy learned by DRL.
We propose a pessimistic model selection (PMS) approach for offline DRL with a theoretical guarantee (a lower-confidence-bound sketch of the pessimism idea follows this list).
arXiv Detail & Related papers (2021-11-29T06:29:49Z)
- POAR: Efficient Policy Optimization via Online Abstract State Representation Learning [6.171331561029968]
State Representation Learning (SRL) is proposed to specifically learn to encode task-relevant features from complex sensory data into low-dimensional states.
We introduce a new SRL prior called domain resemblance to leverage expert demonstration to improve SRL interpretations.
We empirically verify that POAR efficiently handles tasks in high dimensions and facilitates training real-life robots directly from scratch.
arXiv Detail & Related papers (2021-09-17T16:52:03Z)
- Behavioral Priors and Dynamics Models: Improving Performance and Domain Transfer in Offline RL [82.93243616342275]
We introduce Offline Model-based RL with Adaptive Behavioral Priors (MABE).
MABE is based on the finding that dynamics models, which support within-domain generalization, and behavioral priors, which support cross-domain generalization, are complementary.
In experiments that require cross-domain generalization, we find that MABE outperforms prior methods.
arXiv Detail & Related papers (2021-06-16T20:48:49Z)
- Learning to Reweight Imaginary Transitions for Model-Based Reinforcement Learning [58.66067369294337]
When the model is inaccurate or biased, imaginary trajectories may be deleterious for training the action-value and policy functions.
We adaptively reweight the imaginary transitions to reduce the negative effects of poorly generated trajectories (a simplified weighted-loss sketch follows this list).
Our method outperforms state-of-the-art model-based and model-free RL algorithms on multiple tasks.
arXiv Detail & Related papers (2021-04-09T03:13:35Z)
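For the quantization-and-pruning entry above, the snippet below is a small illustrative example of the two compression methods applied to a DRL actor network with standard PyTorch utilities; the network shape is arbitrary and this is not the study's code.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Arbitrary small actor network standing in for a trained DRL policy.
actor = nn.Sequential(
    nn.Linear(8, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 2), nn.Tanh(),
)

# Pruning: zero out the 50% smallest-magnitude weights of the first layer,
# then make the pruning permanent so the layer holds an ordinary weight again.
prune.l1_unstructured(actor[0], name="weight", amount=0.5)
prune.remove(actor[0], "weight")

# Quantization: post-training dynamic quantization of all Linear layers to int8.
quantized_actor = torch.ao.quantization.quantize_dynamic(
    actor, {nn.Linear}, dtype=torch.qint8
)

obs = torch.randn(1, 8)
with torch.no_grad():
    print("fp32/pruned action:", actor(obs))
    print("int8 action:       ", quantized_actor(obs))
```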
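For the preference-based RL entry, the snippet below sketches the pairwise trajectory-preference feedback that PbRL learns from, using a standard Bradley-Terry style loss over a learned reward model; it illustrates the feedback format only and is not the paper's provable reward-agnostic algorithm.

```python
import torch
import torch.nn.functional as F


def preference_loss(reward_model, traj_a, traj_b, prefer_a):
    """Bradley-Terry style loss: the preferred trajectory should receive the
    higher cumulative predicted reward.

    traj_a, traj_b: (batch, horizon, obs_dim) trajectory observations.
    prefer_a:       (batch,) with 1 where trajectory a was preferred.
    """
    r_a = reward_model(traj_a).sum(dim=1).squeeze(-1)  # summed per-step rewards
    r_b = reward_model(traj_b).sum(dim=1).squeeze(-1)
    logits = r_a - r_b                                 # P(a > b) = sigmoid(logits)
    return F.binary_cross_entropy_with_logits(logits, prefer_a.float())


# Toy usage with an arbitrary reward network and random "preferences".
reward_model = torch.nn.Sequential(
    torch.nn.Linear(4, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1)
)
traj_a, traj_b = torch.randn(16, 50, 4), torch.randn(16, 50, 4)
prefer_a = torch.randint(0, 2, (16,))
preference_loss(reward_model, traj_a, traj_b, prefer_a).backward()
```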
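For the pessimistic model selection entry, the snippet below sketches only the general pessimism principle (choose the candidate whose offline value estimate is best under a lower confidence bound); the paper's PMS procedure and its theoretical guarantee are more involved.

```python
import numpy as np


def pessimistic_select(value_estimates, value_stderrs, z=1.645):
    """Pick the candidate policy with the highest lower confidence bound on
    its estimated value (estimates and standard errors could come from, e.g.,
    bootstrapped off-policy evaluation on the offline dataset)."""
    lcb = np.asarray(value_estimates) - z * np.asarray(value_stderrs)
    return int(np.argmax(lcb))


# Candidate 1 has the highest point estimate but is far too uncertain,
# so the pessimistic rule picks candidate 0 instead.
print(pessimistic_select([10.0, 12.0, 9.5], [0.5, 4.0, 0.3]))  # -> 0
```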
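For the reweighting entry, the snippet below is a simplified sketch of down-weighting model-generated ("imaginary") transitions in a TD loss; the disagreement-based weighting rule is an assumption for illustration, whereas the paper reweights adaptively.

```python
import torch


def weighted_td_loss(q_pred, td_target, model_disagreement, temperature=1.0):
    """TD loss over imaginary transitions in which samples whose dynamics
    model disagrees more (e.g. across an ensemble) contribute less.

    q_pred, td_target, model_disagreement: (batch,) tensors.
    """
    # Higher disagreement -> lower weight; rescale so weights average to 1.
    weights = torch.softmax(-model_disagreement / temperature, dim=0) * q_pred.numel()
    return (weights.detach() * (q_pred - td_target.detach()) ** 2).mean()


# Toy usage: the transition with large disagreement barely affects the loss.
q_pred = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
td_target = torch.tensor([1.5, 1.5, 1.5])
disagreement = torch.tensor([0.1, 0.1, 5.0])
weighted_td_loss(q_pred, td_target, disagreement).backward()
```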
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the accuracy of this information and is not responsible for any consequences of its use.