A Reinforcement Learning-based Economic Model Predictive Control
Framework for Autonomous Operation of Chemical Reactors
- URL: http://arxiv.org/abs/2105.02656v1
- Date: Thu, 6 May 2021 13:34:30 GMT
- Title: A Reinforcement Learning-based Economic Model Predictive Control
Framework for Autonomous Operation of Chemical Reactors
- Authors: Khalid Alhazmi, Fahad Albalawi, and S. Mani Sarathy
- Abstract summary: This work presents a novel framework for integrating EMPC and RL for online model parameter estimation of a class of nonlinear systems.
The major advantage of this framework is its simplicity; state-of-the-art RL algorithms and EMPC schemes can be employed with minimal modifications.
- Score: 0.5735035463793008
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Economic model predictive control (EMPC) is a promising methodology for
optimal operation of dynamical processes that has been shown to improve process
economics considerably. However, EMPC performance relies heavily on the
accuracy of the process model used. As an alternative to model-based control
strategies, reinforcement learning (RL) has been investigated as a model-free
control methodology, but issues regarding its safety and stability remain an
open research challenge. This work presents a novel framework for integrating
EMPC and RL for online model parameter estimation of a class of nonlinear
systems. In this framework, EMPC optimally operates the closed-loop system
while maintaining closed-loop stability and recursive feasibility. At the same
time, to optimize the process, the RL agent continuously compares the measured
state of the process with the model's predictions (nominal states) and
modifies the model parameters accordingly. The major advantage of this framework is
its simplicity; state-of-the-art RL algorithms and EMPC schemes can be employed
with minimal modifications. The performance of the proposed framework is
illustrated on a network of reactions with challenging dynamics and practical
significance. This framework allows control, optimization, and model correction
to be performed online and continuously, making autonomous reactor operation
more attainable.
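The abstract describes this interaction in prose only, so the following is a minimal sketch of the loop under stated assumptions: a toy first-order reactor, a one-step-lookahead controller standing in for a receding-horizon EMPC with stability constraints, and a gradient-style mismatch update standing in for a state-of-the-art RL algorithm. None of these specific choices come from the paper itself.
```python
# Minimal sketch of the loop described in the abstract: EMPC operates the
# process with its current model while an RL-style agent compares measured
# and nominal states and adapts a model parameter online. The toy reactor,
# the one-step-lookahead controller, and the gradient-style mismatch update
# are illustrative assumptions, not the authors' implementation.
import numpy as np

DT = 0.1          # sampling period
K_TRUE = 0.9      # "true" plant rate constant, unknown to the controller

def plant_step(x, u):
    """Measured process: first-order reaction with the true rate constant."""
    return x + DT * (-K_TRUE * x + u)

def model_step(x, u, k_hat):
    """Nominal model used by the EMPC; k_hat is the adapted parameter."""
    return x + DT * (-k_hat * x + u)

def empc_action(x, k_hat, u_grid=np.linspace(0.0, 1.0, 41), x_opt=0.5):
    """One-step economic MPC surrogate: choose the input whose predicted
    state maximizes a toy economic objective (profit near x_opt, cheap u)."""
    preds = model_step(x, u_grid, k_hat)                 # vectorized predictions
    economics = -(preds - x_opt) ** 2 - 0.01 * u_grid ** 2
    return u_grid[np.argmax(economics)]

x, k_hat, eta = 1.0, 0.5, 5.0                            # state, initial guess, step size
for t in range(300):
    u = empc_action(x, k_hat)                            # EMPC runs the closed loop
    x_nominal = model_step(x, u, k_hat)                  # model prediction (nominal state)
    x_prev, x = x, plant_step(x, u)                      # measured state from the plant
    err = x - x_nominal                                  # mismatch observed by the agent
    # Gradient step on the squared mismatch: d(x_nominal)/d(k_hat) = -DT * x_prev,
    # so d(err^2)/d(k_hat) = 2 * err * DT * x_prev; descend on it.
    k_hat -= eta * 2.0 * err * DT * x_prev

print(f"adapted k_hat = {k_hat:.3f} (true value {K_TRUE})")
```
Even in this simplified form, the structural point survives: the controller only ever sees the nominal model, and the agent only ever sees the gap between measurement and prediction, so either component can be swapped for a full EMPC scheme or RL algorithm, which is the simplicity the abstract emphasizes.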
Related papers
- Task-optimal data-driven surrogate models for eNMPC via differentiable simulation and optimization [42.72938925647165]
We present a method for end-to-end learning of Koopman surrogate models for optimal performance in a specific control task.
The training algorithm exploits the potential differentiability of environments based on mechanistic simulation models to aid policy optimization.
arXiv Detail & Related papers (2024-03-21T14:28:43Z)
- MOTO: Offline Pre-training to Online Fine-tuning for Model-based Robot Learning [52.101643259906915]
We study the problem of offline pre-training and online fine-tuning for reinforcement learning from high-dimensional observations.
Existing model-based offline RL methods are not suitable for offline-to-online fine-tuning in high-dimensional domains.
We propose an on-policy model-based method that can efficiently reuse prior data through model-based value expansion and policy regularization.
arXiv Detail & Related papers (2024-01-06T21:04:31Z)
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme that provides a non-decreasing performance guarantee for model-based RL (MBRL).
The derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
- Online Policy Optimization for Robust MDP [17.995448897675068]
Reinforcement learning (RL) has exceeded human performance in many synthetic settings such as video games and Go.
In this work, we consider the online robust Markov decision process (MDP) setting, in which the learner interacts with an unknown nominal system.
We propose a robust optimistic policy optimization algorithm that is provably efficient.
arXiv Detail & Related papers (2022-09-28T05:18:20Z)
- On Effective Scheduling of Model-based Reinforcement Learning [53.027698625496015]
In this paper, we first theoretically analyze the role of real data in policy training, which suggests that gradually increasing the ratio of real data yields better performance.
We propose a framework named AutoMBPO to automatically schedule the real data ratio (a toy sketch of this scheduling idea appears after this list).
arXiv Detail & Related papers (2021-11-16T15:24:59Z)
- Uncertainty-Aware Model-Based Reinforcement Learning with Application to Autonomous Driving [2.3303341607459687]
We propose a novel uncertainty-aware model-based reinforcement learning framework, and then implement and validate it in autonomous driving.
The framework builds on an adaptive truncation approach, providing virtual interactions between the agent and the environment model.
The developed algorithms are then implemented in end-to-end autonomous vehicle control tasks, validated and compared with state-of-the-art methods under various driving scenarios.
arXiv Detail & Related papers (2021-06-23T06:55:14Z)
- COMBO: Conservative Offline Model-Based Policy Optimization [120.55713363569845]
Uncertainty estimation with complex models, such as deep neural networks, can be difficult and unreliable.
We develop a new model-based offline RL algorithm, COMBO, that regularizes the value function on out-of-support state-actions.
We find that COMBO consistently performs as well as or better than prior offline model-free and model-based methods.
arXiv Detail & Related papers (2021-02-16T18:50:32Z)
- Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI (control as hybrid inference) which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
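As referenced in the AutoMBPO entry above, the one mechanism that entry pins down is a schedule over the fraction of real (versus model-generated) transitions in each policy-training batch, ramped up over training. Below is a toy sketch of that idea; AutoMBPO learns the schedule automatically, so the linear ramp, the list-based buffers, and every name here are assumed stand-ins rather than the paper's method.
```python
# Toy sketch of real-data ratio scheduling (cf. the AutoMBPO entry above):
# each training batch mixes real and model-generated transitions, and the
# real-data fraction is ramped up as training progresses. The linear ramp is
# an assumed stand-in; AutoMBPO learns this schedule automatically.
import random

real_buffer = [("real", i) for i in range(1_000)]      # environment transitions
model_buffer = [("model", i) for i in range(10_000)]   # learned-model rollouts

def real_data_ratio(step, total_steps, start=0.1, end=0.9):
    """Linearly ramp the real-data fraction from `start` to `end`."""
    return start + min(step / total_steps, 1.0) * (end - start)

def sample_batch(step, total_steps, batch_size=256):
    """Draw a mixed batch whose composition follows the current ratio."""
    n_real = int(batch_size * real_data_ratio(step, total_steps))
    batch = random.sample(real_buffer, n_real)
    batch += random.sample(model_buffer, batch_size - n_real)
    random.shuffle(batch)
    return batch

# Early vs. late in training, the batch composition shifts toward real data:
for step in (1_000, 90_000):
    batch = sample_batch(step, total_steps=100_000)
    n_real = sum(1 for src, _ in batch if src == "real")
    print(f"step {step}: {n_real}/{len(batch)} real transitions")
```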