Distributional Reinforcement Learning for Scheduling of (Bio)chemical
Production Processes
- URL: http://arxiv.org/abs/2203.00636v1
- Date: Tue, 1 Mar 2022 17:25:40 GMT
- Title: Distributional Reinforcement Learning for Scheduling of (Bio)chemical
Production Processes
- Authors: Max Mowbray, Dongda Zhang, Ehecatl Antonio Del Rio Chanona
- Abstract summary: Reinforcement Learning (RL) has recently received significant attention from the process systems engineering and control communities.
We present an RL methodology to address precedence and disjunctive constraints commonly imposed on production scheduling problems.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reinforcement Learning (RL) has recently received significant attention from
the process systems engineering and control communities. Recent works have
investigated the application of RL to identify optimal scheduling decisions in
the presence of uncertainty. In this work, we present an RL methodology to
address precedence and disjunctive constraints commonly imposed on
production scheduling problems. This work naturally enables the optimization of
risk-sensitive formulations such as the conditional value-at-risk (CVaR), which
are essential in realistic scheduling processes. The proposed strategy is
investigated thoroughly in a single-stage, parallel batch production
environment, and benchmarked against mixed integer linear programming (MILP)
strategies. We show that the policy identified by our approach is able to
account for plant uncertainties in online decision-making, with expected
performance comparable to existing MILP methods. Additionally, the framework
gains the benefits of optimizing for risk-sensitive measures, and identifies
decisions orders of magnitude faster than the most efficient optimization
approaches. This promises to mitigate practical issues and to ease the handling
of realizations of process uncertainty in online production scheduling.
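As a rough illustration of the risk-sensitive criterion discussed in the abstract (not the paper's training procedure), CVaR at level alpha can be estimated from Monte Carlo samples of the schedule return as the mean of the worst alpha-fraction of outcomes. The sampled returns below are synthetic:

```python
# Minimal sketch: estimating CVaR_alpha from sampled schedule returns.
# The returns are illustrative; in the paper's setting they would come from
# rolling out the scheduling policy under sampled process uncertainty.
import numpy as np

def cvar(returns, alpha=0.1):
    """Mean of the worst alpha-fraction of sampled returns (higher is better)."""
    returns = np.sort(np.asarray(returns))               # ascending: worst outcomes first
    n_tail = max(1, int(np.ceil(alpha * len(returns))))
    return returns[:n_tail].mean()

rng = np.random.default_rng(0)
sampled_returns = rng.normal(loc=100.0, scale=15.0, size=10_000)  # hypothetical policy returns
print(f"expected return: {sampled_returns.mean():.1f}")
print(f"CVaR_0.1:        {cvar(sampled_returns, alpha=0.1):.1f}")  # risk-sensitive criterion
```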
Related papers
- Theoretically Guaranteed Policy Improvement Distilled from Model-Based
Planning [64.10794426777493]
Model-based reinforcement learning (RL) has demonstrated remarkable successes on a range of continuous control tasks.
Recent practices tend to distill optimized action sequences into an RL policy during the training phase.
We develop an approach to distill from model-based planning to the policy.
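A minimal sketch of the distillation idea, assuming a linear policy and a stand-in planner (neither is from the cited paper): the policy is regressed onto the planner's actions with a squared-error update.

```python
# Sketch: distill planner-proposed actions into a simple policy by supervised regression.
import numpy as np

rng = np.random.default_rng(1)
state_dim, action_dim = 4, 2
W = np.zeros((action_dim, state_dim))        # linear policy: a = W @ s

def planner_action(state):
    # Placeholder for model-based planning (e.g., trajectory optimization over a learned model).
    return np.tanh(state[:action_dim])       # arbitrary illustrative target

for _ in range(1000):
    s = rng.normal(size=state_dim)
    a_plan = planner_action(s)               # action proposed by the planner
    a_pol = W @ s                            # action proposed by the current policy
    grad = np.outer(a_pol - a_plan, s)       # gradient of 0.5 * ||a_pol - a_plan||^2 w.r.t. W
    W -= 0.05 * grad                         # distillation (behaviour-cloning) update
```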
arXiv Detail & Related papers (2023-07-24T16:52:31Z) - Stepsize Learning for Policy Gradient Methods in Contextual Markov
Decision Processes [35.889129338603446]
Policy-based algorithms are among the most widely adopted techniques in model-free RL.
They tend to struggle when asked to accomplish a series of heterogeneous tasks.
We introduce a new formulation, known as meta-MDP, that can be used to solve any hyperparameter selection problem in RL.
arXiv Detail & Related papers (2023-06-13T12:58:12Z) - Timing Process Interventions with Causal Inference and Reinforcement
Learning [2.919859121836811]
This paper presents experiments on timed process interventions, using synthetic data that makes genuine online RL and a direct comparison to causal inference (CI) possible.
Our experiments reveal that RL's policies outperform those from CI and are more robust at the same time.
Unlike CI, the unaltered online RL approach can be applied to other, more generic PresPM problems such as next best activity recommendations.
arXiv Detail & Related papers (2023-06-07T10:02:16Z) - Making Linear MDPs Practical via Contrastive Representation Learning [101.75885788118131]
It is common to address the curse of dimensionality in Markov decision processes (MDPs) by exploiting low-rank representations.
We consider an alternative definition of linear MDPs that automatically ensures normalization while allowing efficient representation learning.
We demonstrate superior performance over existing state-of-the-art model-based and model-free algorithms on several benchmarks.
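A small sketch of the normalization idea, with random stand-in features rather than learned ones: scoring candidate next states by an inner product and applying a softmax yields a transition model that is a proper distribution by construction, and it can be trained with a contrastive (cross-entropy) loss.

```python
# Sketch: softmax-normalized low-rank transition scores with a contrastive loss.
import numpy as np

rng = np.random.default_rng(2)
d, n_candidates = 8, 5
phi_sa = rng.normal(size=d)              # feature of the current state-action pair (stand-in)
mu = rng.normal(size=(n_candidates, d))  # features of candidate next states (observed + negatives)

logits = mu @ phi_sa
p_next = np.exp(logits - logits.max())
p_next /= p_next.sum()                   # softmax: transitions are normalized by construction

observed = 0                             # index of the next state actually observed in the data
loss = -np.log(p_next[observed])         # contrastive / cross-entropy loss against the negatives
print(p_next.round(3), float(loss))
```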
arXiv Detail & Related papers (2022-07-14T18:18:02Z) - Towards Standardizing Reinforcement Learning Approaches for Stochastic
Production Scheduling [77.34726150561087]
Reinforcement learning (RL) can be used to solve scheduling problems.
Existing studies rely on (sometimes) complex simulations for which the code is unavailable.
There is a vast array of RL designs to choose from.
Standardization of model descriptions - both production setup and RL design - and of validation schemes is a prerequisite.
arXiv Detail & Related papers (2021-04-16T16:07:10Z) - A Two-stage Framework and Reinforcement Learning-based Optimization
Algorithms for Complex Scheduling Problems [54.61091936472494]
We develop a two-stage framework in which reinforcement learning (RL) and traditional operations research (OR) algorithms are combined.
The scheduling problem is solved in two stages: a finite Markov decision process (MDP) followed by a mixed-integer programming step.
Results show that the proposed algorithms could stably and efficiently obtain satisfactory scheduling schemes for agile Earth observation satellite scheduling problems.
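A toy illustration of the two-stage split (not the paper's satellite-scheduling formulation): a placeholder for the learned first-stage policy selects tasks, and a small assignment MIP, written here with PuLP as an assumed solver interface, plays the role of the second stage.

```python
# Toy two-stage sketch: stand-in Stage-1 selection, tiny Stage-2 assignment MIP.
# Task data, the selection rule, and the MIP model are all illustrative assumptions.
import pulp
import random

random.seed(0)
durations = {"t1": 3.0, "t2": 2.0, "t3": 4.0, "t4": 1.5}
machines = ["m1", "m2"]

# Stage 1: placeholder for the learned RL policy deciding which tasks to schedule.
selected = [t for t in durations if random.random() > 0.2]

# Stage 2: assign selected tasks to machines, minimizing makespan.
prob = pulp.LpProblem("stage2_assignment", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", [(t, m) for t in selected for m in machines], cat="Binary")
makespan = pulp.LpVariable("makespan", lowBound=0)
prob += makespan
for t in selected:
    prob += pulp.lpSum(x[(t, m)] for m in machines) == 1            # each task on exactly one machine
for m in machines:
    prob += pulp.lpSum(durations[t] * x[(t, m)] for t in selected) <= makespan
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("selected:", selected, "makespan:", pulp.value(makespan))
```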
arXiv Detail & Related papers (2021-03-10T03:16:12Z) - Constrained Model-Free Reinforcement Learning for Process Optimization [0.0]
Reinforcement learning (RL) is a control approach that can handle nonlinear optimal control problems.
Despite the promise exhibited, RL has yet to see marked translation to industrial practice.
We propose an 'oracle'-assisted constrained Q-learning algorithm that guarantees the satisfaction of joint chance constraints with a high probability.
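A small Monte Carlo sketch of what a joint chance constraint asks for, using made-up dynamics and a made-up constraint rather than the paper's oracle or Q-learning machinery: estimate the probability that every constraint holds along the whole trajectory and compare it to the required level.

```python
# Sketch: empirical estimate of joint (path-wise) chance-constraint satisfaction.
import numpy as np

rng = np.random.default_rng(3)

def rollout_satisfies_constraints(horizon=20):
    """Simulate toy closed-loop dynamics; report whether the state constraint held at every step."""
    x = 0.0
    for _ in range(horizon):
        x = 0.9 * x + 1.0 + rng.normal(scale=0.3)   # made-up stochastic dynamics under a fixed policy
        if x > 12.0:                                 # state constraint checked at every step
            return False
    return True

n_rollouts = 5000
p_hat = np.mean([rollout_satisfies_constraints() for _ in range(n_rollouts)])
print(f"estimated joint satisfaction probability: {p_hat:.3f} (target, e.g., >= 0.95)")
```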
arXiv Detail & Related papers (2020-11-16T13:16:22Z) - Chance Constrained Policy Optimization for Process Control and
Optimization [1.4908563154226955]
Chemical process optimization and control are affected by 1) plant-model mismatch, 2) process disturbances, and 3) constraints for safe operation.
We propose a chance constrained policy optimization algorithm which guarantees the satisfaction of joint chance constraints with a high probability.
arXiv Detail & Related papers (2020-07-30T14:20:35Z) - Combining Deep Learning and Optimization for Security-Constrained
Optimal Power Flow [94.24763814458686]
Security-constrained optimal power flow (SCOPF) is fundamental in power systems.
Modeling of APR within the SCOPF problem results in complex large-scale mixed-integer programs.
This paper proposes a novel approach that combines deep learning and robust optimization techniques.
arXiv Detail & Related papers (2020-07-14T12:38:21Z) - Constrained Reinforcement Learning for Dynamic Optimization under
Uncertainty [1.5797349391370117]
Dynamic real-time optimization (DRTO) is a challenging task because optimal operating conditions must be computed in real time.
The main bottleneck in the industrial application of DRTO is the presence of uncertainty.
We present a constrained reinforcement learning (RL) based approach to accommodate these difficulties.
arXiv Detail & Related papers (2020-06-04T10:17:35Z) - Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot
Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained policy optimization (CPPO).
We show that guided constrained RL offers faster convergence close to the desired optimum resulting in an optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.