Model-predictive control and reinforcement learning in multi-energy
system case studies
- URL: http://arxiv.org/abs/2104.09785v1
- Date: Tue, 20 Apr 2021 06:51:50 GMT
- Title: Model-predictive control and reinforcement learning in multi-energy
system case studies
- Authors: Glenn Ceusters, Román Cantú Rodríguez, Alberte Bouso García,
Rüdiger Franke, Geert Deconinck, Lieve Helsen, Ann Nowé, Maarten
Messagie, Luis Ramirez Camargo
- Abstract summary: We present an on- and off-policy multi-objective reinforcement learning (RL) approach, benchmarked against a linear model-predictive control (LMPC).
We show that a twin delayed deep deterministic policy gradient (TD3) RL agent offers potential to match and outperform the perfect foresight LMPC benchmark (101.5%).
In a more complex MES configuration, the RL agent's performance is generally lower (94.6%), yet still better than that of the realistic LMPC (88.9%).
- Score: 0.2810625954925815
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Model-predictive-control (MPC) offers an optimal control technique to
establish and ensure that the total operation cost of multi-energy systems
remains at a minimum while fulfilling all system constraints. However, this
method presumes an adequate model of the underlying system dynamics, which is
prone to modelling errors and is not necessarily adaptive. This has an
associated initial and ongoing project-specific engineering cost. In this
paper, we present an on- and off-policy multi-objective reinforcement learning
(RL) approach that does not assume a model a priori, benchmarking it against
a linear MPC (LMPC - to reflect current practice, though non-linear MPC
performs better) - both derived from the general optimal control problem,
highlighting their differences and similarities. In a simple multi-energy
system (MES) configuration case study, we show that a twin delayed deep
deterministic policy gradient (TD3) RL agent offers potential to match and
outperform the perfect foresight LMPC benchmark (101.5%), whereas the
realistic LMPC, i.e. with imperfect predictions, achieves only 98%. In a more
complex MES configuration, the RL agent's performance is generally lower
(94.6%), yet still better than that of the realistic LMPC (88.9%). In both case
studies, the RL agents outperformed the realistic LMPC after a training period
of 2 years of quarter-hourly interactions with the environment. We conclude that
reinforcement learning is a viable optimal control technique for multi-energy
systems given adequate constraint handling and pre-training, to avoid unsafe
interactions and long training periods, as is proposed in fundamental future
work.
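For context, the general finite-horizon optimal control problem from which both the LMPC and RL controllers are derived can be sketched as follows; the notation (stage cost \ell, dynamics f, constraint sets \mathcal{X}, \mathcal{U}) is generic and not taken from the paper itself:

\begin{aligned}
\min_{u_0,\dots,u_{N-1}} \quad & \sum_{k=0}^{N-1} \ell(x_k, u_k) \\
\text{s.t.} \quad & x_{k+1} = f(x_k, u_k), \qquad k = 0,\dots,N-1, \\
& x_k \in \mathcal{X}, \quad u_k \in \mathcal{U}, \qquad x_0 = x(t).
\end{aligned}

An LMPC solves this problem over a receding horizon with f and \ell restricted to linear (affine) forms, while a model-free RL agent such as TD3 minimises the same cumulative operation cost from interaction data alone, without an explicit model f.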
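As a rough illustration of the RL side of the comparison (not the authors' implementation), the sketch below trains a TD3 agent with the stable-baselines3 library on Gymnasium's Pendulum-v1 as a stand-in for a multi-energy-system environment; the timestep budget assumes the quarter-hourly interaction interval and 2-year training period mentioned in the abstract.

import numpy as np
import gymnasium as gym
from stable_baselines3 import TD3
from stable_baselines3.common.noise import NormalActionNoise

# Stand-in continuous-control environment; a real study would plug in a
# multi-energy-system simulation exposing the same Gymnasium interface.
env = gym.make("Pendulum-v1")

# Gaussian exploration noise on the continuous action space, as is typical for TD3.
n_actions = env.action_space.shape[-1]
action_noise = NormalActionNoise(mean=np.zeros(n_actions), sigma=0.1 * np.ones(n_actions))

model = TD3("MlpPolicy", env, action_noise=action_noise, verbose=1)

# Roughly two simulated years of quarter-hourly control steps: 2 * 365 * 96 = 70,080.
model.learn(total_timesteps=2 * 365 * 96)

# Deterministic policy evaluation after training, over one simulated day.
obs, _ = env.reset()
for _ in range(96):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, _ = env.step(action)
    if terminated or truncated:
        obs, _ = env.reset()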
Related papers
- Comparison of Model Predictive Control and Proximal Policy Optimization for a 1-DOF Helicopter System [0.7499722271664147]
This study conducts a comparative analysis of Model Predictive Control (MPC) and Proximal Policy Optimization (PPO), a Deep Reinforcement Learning (DRL) algorithm, applied to a Quanser Aero 2 system.
PPO excels in rise-time and adaptability, making it a promising approach for applications requiring rapid response and adaptability.
arXiv Detail & Related papers (2024-08-28T08:35:34Z)
- Parameter-Adaptive Approximate MPC: Tuning Neural-Network Controllers without Retraining [50.00291020618743]
This work introduces a novel, parameter-adaptive AMPC architecture capable of online tuning without recomputing large datasets and retraining.
We showcase the effectiveness of parameter-adaptive AMPC by controlling the swing-ups of two different real cartpole systems with a severely resource-constrained microcontroller (MCU).
Taken together, these contributions represent a marked step toward the practical application of AMPC in real-world systems.
arXiv Detail & Related papers (2024-04-08T20:02:19Z)
- Efficient Learning of Voltage Control Strategies via Model-based Deep Reinforcement Learning [9.936452412191326]
This article proposes a model-based deep reinforcement learning (DRL) method to design emergency control strategies for short-term voltage stability problems in power systems.
Recent advances show promising results in model-free DRL-based methods for power systems, but model-free methods suffer from poor sample efficiency and training time.
We propose a novel model-based DRL framework where a deep neural network (DNN)-based dynamic surrogate model is utilized with the policy learning framework.
arXiv Detail & Related papers (2022-12-06T02:50:53Z)
- Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality [141.89413461337324]
Deployment efficiency is an important criterion for many real-world applications of reinforcement learning (RL).
We propose a theoretical formulation for deployment-efficient RL (DE-RL) from an "optimization with constraints" perspective.
arXiv Detail & Related papers (2022-02-14T01:31:46Z)
- Evaluating model-based planning and planner amortization for continuous control [79.49319308600228]
We take a hybrid approach, combining model predictive control (MPC) with a learned model and model-free policy learning.
We find that well-tuned model-free agents are strong baselines even for high DoF control problems.
We show that it is possible to distil a model-based planner into a policy that amortizes the planning without any loss of performance.
arXiv Detail & Related papers (2021-10-07T12:00:40Z)
- Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
- Imitation Learning from MPC for Quadrupedal Multi-Gait Control [63.617157490920505]
We present a learning algorithm for training a single policy that imitates multiple gaits of a walking robot.
We use and extend MPC-Net, which is an Imitation Learning approach guided by Model Predictive Control.
We validate our approach on hardware and show that a single learned policy can replace its teacher to control multiple gaits.
arXiv Detail & Related papers (2021-03-26T08:48:53Z)
- Blending MPC & Value Function Approximation for Efficient Reinforcement Learning [42.429730406277315]
Model-Predictive Control (MPC) is a powerful tool for controlling complex, real-world systems.
We present a framework for improving on MPC with model-free reinforcement learning (RL).
We show that our approach can obtain performance comparable with MPC with access to true dynamics.
arXiv Detail & Related papers (2020-12-10T11:32:01Z)
- ABC-LMPC: Safe Sample-Based Learning MPC for Stochastic Nonlinear Dynamical Systems with Adjustable Boundary Conditions [34.44010424789202]
We present a novel LMPC algorithm, Adjustable Boundary LMPC (ABC-LMPC), which enables rapid adaptation to novel start and goal configurations.
We experimentally demonstrate that the resulting controller adapts to a variety of initial and terminal conditions on 3 continuous control tasks.
arXiv Detail & Related papers (2020-03-03T09:48:22Z)
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)