Green Simulation Assisted Reinforcement Learning with Model Risk for
Biomanufacturing Learning and Control
- URL: http://arxiv.org/abs/2006.09919v1
- Date: Wed, 17 Jun 2020 14:59:13 GMT
- Title: Green Simulation Assisted Reinforcement Learning with Model Risk for
Biomanufacturing Learning and Control
- Authors: Hua Zheng, Wei Xie and Mingbin Ben Feng
- Abstract summary: Biopharmaceutical manufacturing faces critical challenges, including complexity, high variability, lengthy lead time, and limited historical data and knowledge of the underlying system process.
To address these challenges, we propose a green simulation assisted model-based reinforcement learning framework to support online process learning and guide dynamic decision making.
- Score: 3.0657293044976894
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Biopharmaceutical manufacturing faces critical challenges, including
complexity, high variability, lengthy lead time, and limited historical data
and knowledge of the underlying system stochastic process. To address these
challenges, we propose a green simulation assisted model-based reinforcement
learning framework to support online process learning and guide dynamic decision
making. The process model risk is quantified by the posterior distribution.
For any given policy, we predict the expected system response, with the prediction
risk accounting for both inherent stochastic uncertainty and model risk. We then
propose green simulation assisted reinforcement learning and derive a mixture
proposal distribution over the decision process and a likelihood-ratio-based
metamodel for the policy gradient, which selectively reuses process trajectory
outputs collected from previous experiments to increase simulation data-efficiency,
improve the accuracy of policy gradient estimation, and speed up the search for the
optimal policy. Our numerical study indicates that the proposed approach achieves
promising performance.
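To make the trajectory-reuse idea concrete, here is a minimal sketch, not the paper's exact estimator: trajectories simulated under earlier policies are reweighted by the likelihood ratio of the current policy against a mixture of all past behavior policies before entering a REINFORCE-style gradient. The toy decision process, tabular softmax policy, and step sizes are illustrative assumptions.

```python
# Sketch of likelihood-ratio reuse of past simulation runs for policy gradients.
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_ACTIONS, HORIZON = 3, 2, 4

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def action_probs(theta, s):
    # Tabular softmax policy; theta has shape (N_STATES, N_ACTIONS).
    return softmax(theta[s])

def simulate(theta):
    # Toy stochastic process standing in for a bioprocess simulator.
    s, traj, ret = 0, [], 0.0
    for _ in range(HORIZON):
        p = action_probs(theta, s)
        a = int(rng.choice(N_ACTIONS, p=p))
        r = float(a == s % N_ACTIONS) + 0.1 * rng.standard_normal()
        traj.append((s, a))
        ret += r
        s = (s + a + 1) % N_STATES
    return traj, ret

def traj_prob(theta, traj):
    # Probability of the action sequence only: dynamics terms cancel in the ratio.
    return float(np.prod([action_probs(theta, s)[a] for s, a in traj]))

def reuse_gradient(theta, history):
    # history holds (behavior_theta, trajectory, return) from every past experiment.
    behaviors = [b for b, _, _ in history]
    grad = np.zeros_like(theta)
    for _, traj, ret in history:
        q_mix = np.mean([traj_prob(b, traj) for b in behaviors])  # mixture proposal
        w = traj_prob(theta, traj) / q_mix                        # likelihood ratio
        for s, a in traj:
            p = action_probs(theta, s)
            score = -p
            score[a] += 1.0            # gradient of log softmax w.r.t. theta[s]
            grad[s] += w * ret * score
    return grad / len(history)

theta = np.zeros((N_STATES, N_ACTIONS))
history = []
for _ in range(25):
    for _ in range(4):                 # a few fresh simulations per iteration
        history.append((theta.copy(), *simulate(theta)))
    theta = theta + 0.05 * reuse_gradient(theta, history)   # reuse all stored runs
print("average return:", np.mean([simulate(theta)[1] for _ in range(200)]))
```

Because the transition dynamics cancel in the likelihood ratio, only the stored action probabilities are needed, which is what makes reusing old simulation outputs cheap.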
Related papers
- Risk-Sensitive Stochastic Optimal Control as Rao-Blackwellized Markovian
Score Climbing [3.9410617513331863]
Optimal control of dynamical systems is a crucial challenge in sequential decision-making.
Control-as-inference approaches have had considerable success, providing a viable risk-sensitive framework to address the exploration-exploitation dilemma.
This paper introduces a novel perspective by framing risk-sensitive control as Markovian score climbing under samples drawn from a conditional particle filter.
arXiv Detail & Related papers (2023-12-21T16:34:03Z)
- Mind the Uncertainty: Risk-Aware and Actively Exploring Model-Based
Reinforcement Learning [26.497229327357935]
We introduce a simple but effective method for managing risk in model-based reinforcement learning with trajectory sampling.
Experiments indicate that the separation of uncertainties is essential to performing well with data-driven approaches in uncertain and safety-critical control environments.
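As a toy illustration of that separation (my own construction, not the cited method), a bootstrap ensemble of simple probabilistic models yields an epistemic term from ensemble disagreement and an aleatoric term from the predicted noise, and only the epistemic term is penalized during trajectory sampling:

```python
# Separating epistemic (model) from aleatoric (process) uncertainty with an ensemble.
import numpy as np

rng = np.random.default_rng(1)

def fit_member(X, y):
    # One ensemble member: linear-Gaussian model fit on a bootstrap resample.
    idx = rng.integers(0, len(X), len(X))
    Xb, yb = X[idx], y[idx]
    A = np.column_stack([Xb, np.ones(len(Xb))])
    w, *_ = np.linalg.lstsq(A, yb, rcond=None)
    noise = float(np.var(yb - A @ w)) + 1e-8      # aleatoric noise estimate
    return w, noise

def predict(members, x):
    feats = np.append(x, 1.0)
    means = np.array([feats @ w for w, _ in members])
    aleatoric = float(np.mean([v for _, v in members]))  # average predicted noise
    epistemic = float(means.var())                       # disagreement across members
    return means.mean(), aleatoric, epistemic

# Synthetic 1-d dynamics data: x' = 0.9 x + heteroscedastic noise.
X = rng.uniform(-1, 1, size=(200, 1))
y = 0.9 * X[:, 0] + 0.1 * np.abs(X[:, 0]) * rng.standard_normal(200)
members = [fit_member(X, y) for _ in range(5)]

def rollout_score(x0, horizon=5, kappa=1.0):
    # Hypothetical reward: stay near zero; penalize only epistemic uncertainty.
    x, score = x0, 0.0
    for _ in range(horizon):
        mu, aleatoric, epistemic = predict(members, np.array([x]))
        score += -abs(mu) - kappa * np.sqrt(epistemic)
        x = mu + np.sqrt(aleatoric) * rng.standard_normal()  # sample aleatoric part
    return score

print(rollout_score(0.5), rollout_score(2.0))   # the out-of-data start scores worse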
arXiv Detail & Related papers (2023-09-11T16:10:58Z)
- Distributionally Robust Model-based Reinforcement Learning with Large
State Spaces [55.14361269378122]
Three major challenges in reinforcement learning are the complex dynamical systems with large state spaces, the costly data acquisition processes, and the deviation of real-world dynamics from the training environment at deployment.
We study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets.
We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics.
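For intuition on the KL uncertainty set, the worst-case expected value has a standard dual form that can be approximated on a grid; the sketch below is a generic illustration of distributional robustness, not the cited paper's GP-based algorithm, and the sampled values are made up.

```python
# Worst-case expectation over a KL ball around the nominal (empirical) distribution.
import numpy as np

def kl_robust_expectation(values, delta, betas=np.logspace(-2, 2, 200)):
    # Approximates  inf over {Q : KL(Q || P) <= delta} of E_Q[V]  via its dual:
    # sup_{beta > 0} { -beta * log E_P[exp(-V / beta)] - beta * delta }.
    values = np.asarray(values, dtype=float)
    best = -np.inf
    for b in betas:
        z = -values / b
        log_mean_exp = z.max() + np.log(np.mean(np.exp(z - z.max())))   # stable
        best = max(best, -b * log_mean_exp - b * delta)
    return best

next_state_values = np.array([1.0, 0.8, 1.2, 0.2])   # hypothetical V(s') samples
for delta in (0.0, 0.05, 0.5):
    print(delta, round(kl_robust_expectation(next_state_values, delta), 3))
```

A larger radius delta yields a more pessimistic backup value, and the nominal expectation is recovered as delta shrinks to zero.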
arXiv Detail & Related papers (2023-09-05T13:42:11Z)
- Predicting Hurricane Evacuation Decisions with Interpretable Machine
Learning Models [0.0]
This study proposes a new methodology for predicting households' evacuation decisions constructed from easily accessible demographic and resource-related predictors.
The proposed methodology could provide a new tool and framework for the emergency management authorities to improve the estimation of evacuation traffic demands.
arXiv Detail & Related papers (2023-03-12T03:45:44Z)
- Risk-Sensitive Reinforcement Learning with Exponential Criteria [0.0]
We provide a definition of robust reinforcement learning policies and formulate a risk-sensitive reinforcement learning problem to approximate them.
We introduce a novel online Actor-Critic algorithm based on solving a multiplicative Bellman equation using approximation updates.
The implementation, performance, and robustness properties of the proposed methods are evaluated in simulated experiments.
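The exponential criterion itself is easy to illustrate: the entropic objective (1/beta) * log E[exp(beta * R)] trades mean return against return variability, so a policy with the same mean but higher variance scores worse when beta < 0. The two return distributions below are hypothetical stand-ins, not results from the cited paper.

```python
# Exponential (entropic) risk-sensitive evaluation of simulated returns.
import numpy as np

rng = np.random.default_rng(2)

def entropic_objective(returns, beta):
    r = np.asarray(returns, dtype=float)
    z = beta * r
    # Numerically stable log-mean-exp, then divide by beta.
    return (z.max() + np.log(np.mean(np.exp(z - z.max())))) / beta

# Two hypothetical policies with equal mean return but different variability.
safe_returns = rng.normal(1.0, 0.1, 10_000)
risky_returns = rng.normal(1.0, 1.0, 10_000)

for beta in (-2.0, -0.5):
    print(beta,
          round(entropic_objective(safe_returns, beta), 3),
          round(entropic_objective(risky_returns, beta), 3))
print("risk-neutral means:", safe_returns.mean().round(3), risky_returns.mean().round(3))
```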
arXiv Detail & Related papers (2022-12-18T04:44:38Z)
- Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of the latent variable models for state-action value functions, which allows both a tractable variational learning algorithm and an effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z)
- Model-Based Offline Reinforcement Learning with Pessimism-Modulated
Dynamics Belief [3.0036519884678894]
Model-based offline reinforcement learning (RL) aims to find a highly rewarding policy by leveraging a previously collected static dataset and a dynamics model.
In this work, we maintain a belief distribution over dynamics, and evaluate/optimize policy through biased sampling from the belief.
We show that the biased sampling naturally induces an updated dynamics belief with policy-dependent reweighting factor, termed Pessimism-Modulated Dynamics Belief.
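A toy reading of that reweighting, simplified well beyond the paper's formulation: draw dynamics samples from the belief, evaluate the policy under each, and up-weight the samples that make the policy look worse; the temperature controls how pessimistic the resulting value estimate is. The dynamics family and policy are invented for illustration.

```python
# Pessimism-style reweighting of a belief over dynamics when evaluating a policy.
import numpy as np

rng = np.random.default_rng(3)

def rollout_return(theta, horizon=20):
    # Hypothetical 1-d dynamics sample: x' = theta * x + u, reward = -x^2, u = -0.5 x.
    x, ret = 1.0, 0.0
    for _ in range(horizon):
        ret += -x * x
        x = (theta - 0.5) * x
    return ret

dynamics_samples = rng.normal(0.9, 0.15, size=50)   # belief over the unknown theta
returns = np.array([rollout_return(th) for th in dynamics_samples])

def pessimistic_value(returns, tau):
    # Softmin reweighting: tau = 0 recovers the plain belief average,
    # large tau approaches the worst sampled model.
    w = np.exp(-tau * (returns - returns.min()))
    w /= w.sum()
    return float(w @ returns)

for tau in (0.0, 0.5, 5.0):
    print(tau, round(pessimistic_value(returns, tau), 3))
```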
arXiv Detail & Related papers (2022-10-13T03:14:36Z)
- Probabilistic robust linear quadratic regulators with Gaussian processes [73.0364959221845]
Probabilistic models such as Gaussian processes (GPs) are powerful tools to learn unknown dynamical systems from data for subsequent use in control design.
We present a novel controller synthesis for linearized GP dynamics that yields robust controllers with respect to a probabilistic stability margin.
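As a rough sketch of the idea, with a Gaussian parameter posterior standing in for linearized GP dynamics: design the LQR gain on the nominal model and estimate a probabilistic stability margin by checking the closed loop on posterior samples. The system matrices and noise scale are made up for illustration.

```python
# Nominal LQR design plus a Monte Carlo probabilistic stability check.
import numpy as np
from scipy.linalg import solve_discrete_are

rng = np.random.default_rng(4)

A_nom = np.array([[1.0, 0.1], [0.0, 1.0]])
B_nom = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.array([[0.1]])

# Discrete-time LQR gain from the nominal model.
P = solve_discrete_are(A_nom, B_nom, Q, R)
K = np.linalg.solve(R + B_nom.T @ P @ B_nom, B_nom.T @ P @ A_nom)

def sample_dynamics():
    # Posterior samples of (A, B): Gaussian perturbations of the nominal model,
    # mimicking the uncertainty of a learned (e.g., GP-linearized) model.
    return (A_nom + 0.02 * rng.standard_normal((2, 2)),
            B_nom + 0.02 * rng.standard_normal((2, 1)))

n_samples, stable = 2000, 0
for _ in range(n_samples):
    A_s, B_s = sample_dynamics()
    rho = np.abs(np.linalg.eigvals(A_s - B_s @ K)).max()   # closed-loop spectral radius
    stable += rho < 1.0
print("estimated probability of closed-loop stability:", stable / n_samples)
```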
arXiv Detail & Related papers (2021-05-17T08:36:18Z)
- Reliable Off-policy Evaluation for Reinforcement Learning [53.486680020852724]
In a sequential decision-making problem, off-policy evaluation estimates the expected cumulative reward of a target policy.
We propose a novel framework that provides robust and optimistic cumulative reward estimates using one or multiple logged datasets.
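For context, the simplest version of such an estimate is per-trajectory importance sampling over logged data; the sketch below is generic (behavior and target policies, rewards, and horizon are made up), not the cited paper's robust interval estimator.

```python
# Ordinary and weighted importance-sampling off-policy evaluation.
import numpy as np

rng = np.random.default_rng(5)
N_ACTIONS, HORIZON = 2, 3
behavior = np.array([0.7, 0.3])     # logging policy over actions
target = np.array([0.2, 0.8])       # policy we want to evaluate

def log_trajectory():
    ratio, ret = 1.0, 0.0
    for _ in range(HORIZON):
        a = int(rng.choice(N_ACTIONS, p=behavior))
        ratio *= target[a] / behavior[a]    # cumulative importance weight
        ret += float(a)                     # reward 1 for action 1, else 0
    return ratio, ret

samples = [log_trajectory() for _ in range(20_000)]
weights, returns = map(np.array, zip(*samples))
is_estimate = np.mean(weights * returns)                      # ordinary IS
wis_estimate = np.sum(weights * returns) / np.sum(weights)    # self-normalized IS
print("true value:", HORIZON * target[1])
print("IS:", round(is_estimate, 3), "WIS:", round(wis_estimate, 3))
```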
arXiv Detail & Related papers (2020-11-08T23:16:19Z)
- On the model-based stochastic value gradient for continuous
reinforcement learning [50.085645237597056]
We show that simple model-based agents can outperform state-of-the-art model-free agents in terms of both sample-efficiency and final reward.
Our findings suggest that model-based policy evaluation deserves closer attention.
arXiv Detail & Related papers (2020-08-28T17:58:29Z)
- SAMBA: Safe Model-Based & Active Reinforcement Learning [59.01424351231993]
SAMBA is a framework for safe reinforcement learning that combines aspects from probabilistic modelling, information theory, and statistics.
We evaluate our algorithm on a variety of safe dynamical system benchmarks involving both low and high-dimensional state representations.
We provide intuition as to the effectiveness of the framework by a detailed analysis of our active metrics and safety constraints.
arXiv Detail & Related papers (2020-06-12T10:40:46Z)