Integrating Reinforcement Learning and Model Predictive Control with Applications to Microgrids
- URL: http://arxiv.org/abs/2409.11267v1
- Date: Tue, 17 Sep 2024 15:17:16 GMT
- Title: Integrating Reinforcement Learning and Model Predictive Control with Applications to Microgrids
- Authors: Caio Fabio Oliveira da Silva, Azita Dabiri, Bart De Schutter,
- Abstract summary: This work proposes an approach that integrates reinforcement learning and model predictive control (MPC) to solve optimal control problems in mixed-logical dynamical systems.
The proposed method significantly reduces the online computation time of the MPC approach and that it generates policies with small optimality gaps and high feasibility rates.
- Score: 14.389086937116582
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work proposes an approach that integrates reinforcement learning and model predictive control (MPC) to efficiently solve finite-horizon optimal control problems in mixed-logical dynamical systems. Optimization-based control of such systems with discrete and continuous decision variables entails the online solution of mixed-integer quadratic or linear programs, which suffer from the curse of dimensionality. Our approach aims at mitigating this issue by effectively decoupling the decision on the discrete variables and the decision on the continuous variables. Moreover, to mitigate the combinatorial growth in the number of possible actions due to the prediction horizon, we conceive the definition of decoupled Q-functions to make the learning problem more tractable. The use of reinforcement learning reduces the online optimization problem of the MPC controller from a mixed-integer linear (quadratic) program to a linear (quadratic) program, greatly reducing the computational time. Simulation experiments for a microgrid, based on real-world data, demonstrate that the proposed method significantly reduces the online computation time of the MPC approach and that it generates policies with small optimality gaps and high feasibility rates.
Related papers
- Predictive Lagrangian Optimization for Constrained Reinforcement Learning [15.082498910832529]
Constrained optimization is popularly seen in reinforcement learning for addressing complex control tasks.
In this paper, we propose a more generic equivalence framework to build the connection between constrained optimization and feedback control system.
arXiv Detail & Related papers (2025-01-25T13:39:45Z) - Towards An Unsupervised Learning Scheme for Efficiently Solving Parameterized Mixed-Integer Programs [6.1860817947800655]
We train an autoencoder for binary variables in an unsupervised learning fashion.
We present a strategy to construct a class of cutting plane constraints from the decoder parameters of an offline-trained AE.
Their integration into the primal MIP problem leads to a tightened MIP with the reduced feasible region.
arXiv Detail & Related papers (2024-12-23T14:48:32Z) - Efficiently Training Deep-Learning Parametric Policies using Lagrangian Duality [55.06411438416805]
Constrained Markov Decision Processes (CMDPs) are critical in many high-stakes applications.
This paper introduces a novel approach, Two-Stage Deep Decision Rules (TS- DDR) to efficiently train parametric actor policies.
It is shown to enhance solution quality and to reduce computation times by several orders of magnitude when compared to current state-of-the-art methods.
arXiv Detail & Related papers (2024-05-23T18:19:47Z) - Efficient model predictive control for nonlinear systems modelled by deep neural networks [6.5268245109828005]
This paper presents a model predictive control (MPC) for dynamic systems whose nonlinearity and uncertainty are modelled by deep neural networks (NNs)
Since the NN output contains a high-order complex nonlinearity of the system state and control input, the MPC problem is nonlinear and challenging to solve for real-time control.
arXiv Detail & Related papers (2024-05-16T18:05:18Z) - A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical
Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs)
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z) - Reinforcement Learning Methods for Wordle: A POMDP/Adaptive Control
Approach [0.3093890460224435]
We address the solution of the popular Wordle puzzle, using new reinforcement learning methods.
For the Wordle puzzle, they yield on-line solution strategies that are very close to optimal at relatively modest computational cost.
arXiv Detail & Related papers (2022-11-15T03:46:41Z) - Multi-Objective Policy Gradients with Topological Constraints [108.10241442630289]
We present a new algorithm for a policy gradient in TMDPs by a simple extension of the proximal policy optimization (PPO) algorithm.
We demonstrate this on a real-world multiple-objective navigation problem with an arbitrary ordering of objectives both in simulation and on a real robot.
arXiv Detail & Related papers (2022-09-15T07:22:58Z) - Neural Predictive Control for the Optimization of Smart Grid Flexibility
Schedules [0.0]
Model predictive control (MPC) is a method to formulate the optimal scheduling problem for grid flexibilities in a mathematical manner.
MPC methods promise accurate results for time-constrained grid optimization but they are inherently limited by the calculation time needed for large and complex power system models.
A Neural Predictive Control scheme is proposed to learn optimal control policies for linear and nonlinear power systems through imitation.
arXiv Detail & Related papers (2021-08-19T15:12:35Z) - Reinforcement Learning for Adaptive Mesh Refinement [63.7867809197671]
We propose a novel formulation of AMR as a Markov decision process and apply deep reinforcement learning to train refinement policies directly from simulation.
The model sizes of these policy architectures are independent of the mesh size and hence scale to arbitrarily large and complex simulations.
arXiv Detail & Related papers (2021-03-01T22:55:48Z) - Non-stationary Online Learning with Memory and Non-stochastic Control [71.14503310914799]
We study the problem of Online Convex Optimization (OCO) with memory, which allows loss functions to depend on past decisions.
In this paper, we introduce dynamic policy regret as the performance measure to design algorithms robust to non-stationary environments.
We propose a novel algorithm for OCO with memory that provably enjoys an optimal dynamic policy regret in terms of time horizon, non-stationarity measure, and memory length.
arXiv Detail & Related papers (2021-02-07T09:45:15Z) - Combining Deep Learning and Optimization for Security-Constrained
Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.