Integrating Reinforcement Learning and Model Predictive Control with Applications to Microgrids
- URL: http://arxiv.org/abs/2409.11267v1
- Date: Tue, 17 Sep 2024 15:17:16 GMT
- Title: Integrating Reinforcement Learning and Model Predictive Control with Applications to Microgrids
- Authors: Caio Fabio Oliveira da Silva, Azita Dabiri, Bart De Schutter
- Abstract summary: This work proposes an approach that integrates reinforcement learning and model predictive control (MPC) to solve optimal control problems in mixed-logical dynamical systems.
The proposed method significantly reduces the online computation time of the MPC approach and generates policies with small optimality gaps and high feasibility rates.
- Score: 14.389086937116582
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work proposes an approach that integrates reinforcement learning and model predictive control (MPC) to efficiently solve finite-horizon optimal control problems in mixed-logical dynamical systems. Optimization-based control of such systems with discrete and continuous decision variables entails the online solution of mixed-integer quadratic or linear programs, which suffer from the curse of dimensionality. Our approach aims at mitigating this issue by effectively decoupling the decision on the discrete variables and the decision on the continuous variables. Moreover, to mitigate the combinatorial growth in the number of possible actions due to the prediction horizon, we introduce decoupled Q-functions to make the learning problem more tractable. The use of reinforcement learning reduces the online optimization problem of the MPC controller from a mixed-integer linear (quadratic) program to a linear (quadratic) program, greatly reducing the computational time. Simulation experiments for a microgrid, based on real-world data, demonstrate that the proposed method significantly reduces the online computation time of the MPC approach and that it generates policies with small optimality gaps and high feasibility rates.
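To make the decoupling concrete, here is a minimal, illustrative Python sketch (not the authors' implementation): a stand-in for the learned policy fixes the binary decisions over the horizon, after which the remaining online MPC problem is a plain convex QP. The toy prices and demands, the battery/generator model, and the helper names `choose_discrete_sequence` and `solve_continuous_qp` are assumptions for illustration only.

```python
# Illustrative sketch of the RL+MPC decoupling idea (assumed toy model,
# not the paper's code): fixing the binaries collapses the online
# mixed-integer QP to a convex QP.
import numpy as np
import cvxpy as cp

T = 6                                               # prediction horizon (assumed)
price = np.array([3.0, 3.0, 5.0, 8.0, 8.0, 4.0])    # toy energy prices
demand = np.array([2.0, 2.0, 3.0, 4.0, 3.0, 2.0])   # toy load profile

def choose_discrete_sequence(state):
    """Stand-in for the learned policy. In the paper, decoupled
    Q-functions (one per horizon step) score only that step's discrete
    actions, so the learner never enumerates the 2^T joint sequences."""
    # Hypothetical heuristic in place of the trained Q-functions:
    # run the generator whenever the energy price is high.
    return (price >= 5.0).astype(float)             # delta_k in {0, 1}

def solve_continuous_qp(s0, delta):
    """With the binaries fixed, the online problem is a convex QP
    instead of a mixed-integer QP."""
    s = cp.Variable(T + 1)                          # battery state of charge
    u = cp.Variable(T)                              # battery charge/discharge power
    g = cp.Variable(T)                              # grid import power
    cons = [s[0] == s0, s >= 0, s <= 10]
    for k in range(T):
        cons += [s[k + 1] == s[k] + u[k],                    # battery dynamics
                 g[k] + 3.0 * delta[k] == demand[k] + u[k],  # power balance
                 g[k] >= 0]
    cost = price @ g + 0.1 * cp.sum_squares(u)      # energy cost + wear term
    cp.Problem(cp.Minimize(cost), cons).solve()
    return u.value, g.value, cost.value

delta = choose_discrete_sequence(state=5.0)
u, g, cost = solve_continuous_qp(5.0, delta)
print(f"binaries: {delta}, optimal cost: {cost:.2f}")
```

With a linear stage cost, the same reduction turns the online mixed-integer linear program into a linear program; in the paper the learned decoupled Q-functions would replace the price heuristic above.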
Related papers
- Two-Stage ML-Guided Decision Rules for Sequential Decision Making under Uncertainty [55.06411438416805]
Sequential Decision Making under Uncertainty (SDMU) is ubiquitous in many domains such as energy, finance, and supply chains.
Some SDMU problems are naturally modeled as Multistage Problems (MSPs), but the resulting optimizations are notoriously challenging from a computational standpoint.
This paper introduces a novel approach Two-Stage General Decision Rules (TS-GDR) to generalize the policy space beyond linear functions.
The effectiveness of TS-GDR is demonstrated through an instantiation using deep recurrent neural networks, named Two-Stage Deep Decision Rules (TS-DDR).
arXiv Detail & Related papers (2024-05-23T18:19:47Z)
- Efficient model predictive control for nonlinear systems modelled by deep neural networks [6.5268245109828005]
This paper presents a model predictive control (MPC) scheme for dynamic systems whose nonlinearity and uncertainty are modelled by deep neural networks (NNs).
Since the NN output contains a high-order complex nonlinearity of the system state and control input, the MPC problem is nonlinear and challenging to solve for real-time control.
arXiv Detail & Related papers (2024-05-16T18:05:18Z)
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
- Reinforcement Learning Methods for Wordle: A POMDP/Adaptive Control Approach [0.3093890460224435]
We address the solution of the popular Wordle puzzle using new reinforcement learning methods.
These methods yield online solution strategies for Wordle that are very close to optimal at relatively modest computational cost.
arXiv Detail & Related papers (2022-11-15T03:46:41Z)
- Multi-Objective Policy Gradients with Topological Constraints [108.10241442630289]
We present a new policy gradient algorithm for TMDPs, obtained as a simple extension of the proximal policy optimization (PPO) algorithm.
We demonstrate this on a real-world multiple-objective navigation problem with an arbitrary ordering of objectives both in simulation and on a real robot.
arXiv Detail & Related papers (2022-09-15T07:22:58Z)
- Accelerating Federated Edge Learning via Topology Optimization [41.830942005165625]
Federated edge learning (FEEL) is envisioned as a promising paradigm to achieve privacy-preserving distributed learning.
However, it suffers from excessive learning time due to straggler devices.
A novel topology-optimized federated edge learning (TOFEL) scheme is proposed to tackle the heterogeneity issue in federated learning.
arXiv Detail & Related papers (2022-04-01T14:49:55Z)
- Neural Predictive Control for the Optimization of Smart Grid Flexibility Schedules [0.0]
Model predictive control (MPC) is a method to formulate the optimal scheduling problem for grid flexibilities mathematically.
MPC methods promise accurate results for time-constrained grid optimization but they are inherently limited by the calculation time needed for large and complex power system models.
A Neural Predictive Control scheme is proposed to learn optimal control policies for linear and nonlinear power systems through imitation.
arXiv Detail & Related papers (2021-08-19T15:12:35Z)
- Reinforcement Learning for Adaptive Mesh Refinement [63.7867809197671]
We propose a novel formulation of AMR as a Markov decision process and apply deep reinforcement learning to train refinement policies directly from simulation.
The model sizes of these policy architectures are independent of the mesh size and hence scale to arbitrarily large and complex simulations.
arXiv Detail & Related papers (2021-03-01T22:55:48Z)
- Non-stationary Online Learning with Memory and Non-stochastic Control [71.14503310914799]
We study the problem of Online Convex Optimization (OCO) with memory, which allows loss functions to depend on past decisions.
In this paper, we introduce dynamic policy regret as the performance measure to design algorithms robust to non-stationary environments.
We propose a novel algorithm for OCO with memory that provably enjoys an optimal dynamic policy regret in terms of time horizon, non-stationarity measure, and memory length.
arXiv Detail & Related papers (2021-02-07T09:45:15Z)
- Combining Deep Learning and Optimization for Security-Constrained Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of automatic primary response (APR) within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.