RAPTOR-GEN: RApid PosTeriOR GENerator for Bayesian Learning in Biomanufacturing
- URL: http://arxiv.org/abs/2509.20753v2
- Date: Fri, 24 Oct 2025 04:03:49 GMT
- Authors: Wandi Xu, Wei Xie
- Abstract summary: We introduce RApid PosTeriOR GENerator (RAPTOR-GEN), a mechanism-informed Bayesian learning framework. RAPTOR-GEN is designed to accelerate intelligent digital twin development from sparse and heterogeneous experimental data. We develop a fast and robust RAPTOR-GEN algorithm with controllable error.
- Score: 2.918639959397167
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Biopharmaceutical manufacturing is vital to public health but lacks the agility for rapid, on-demand production of biotherapeutics due to the complexity and variability of bioprocesses. To overcome this, we introduce RApid PosTeriOR GENerator (RAPTOR-GEN), a mechanism-informed Bayesian learning framework designed to accelerate intelligent digital twin development from sparse and heterogeneous experimental data. This framework is built on a multi-scale probabilistic knowledge graph (pKG), formulated as a stochastic differential equation (SDE)-based foundational model that captures the nonlinear dynamics of bioprocesses. RAPTOR-GEN consists of two ingredients: (i) an interpretable metamodel integrating linear noise approximation (LNA) that exploits the structural information of bioprocessing mechanisms and a sequential learning strategy to fuse heterogeneous and sparse data, enabling inference of latent state variables and explicit approximation of the intractable likelihood function; and (ii) an efficient Bayesian posterior sampling method that utilizes Langevin diffusion (LD) to accelerate posterior exploration by exploiting the gradients of the derived likelihood. It generalizes the LNA approach to circumvent the challenge of step size selection, facilitating robust learning of mechanistic parameters with provable finite-sample performance guarantees. We develop a fast and robust RAPTOR-GEN algorithm with controllable error. Numerical experiments demonstrate its effectiveness in uncovering the underlying regulatory mechanisms of biomanufacturing processes.
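The abstract's second ingredient, posterior sampling driven by gradients of the log-likelihood, can be illustrated with a generic unadjusted Langevin algorithm. This is a minimal sketch and not the RAPTOR-GEN algorithm itself: the abstract states that RAPTOR-GEN avoids manual step size selection, whereas this sketch uses a fixed, hand-picked step. The toy Gaussian model, the function names `langevin_sample` and `grad_log_post`, and all numeric settings are assumptions made for illustration.

```python
import numpy as np

def langevin_sample(grad_log_post, theta0, step, n_steps, rng):
    """Unadjusted Langevin algorithm (generic, not the paper's method).

    Euler-Maruyama discretization of the Langevin diffusion
    d(theta) = grad log p(theta | data) dt + sqrt(2) dW.
    """
    theta = np.asarray(theta0, dtype=float)
    samples = np.empty((n_steps, theta.size))
    for k in range(n_steps):
        noise = rng.standard_normal(theta.size)
        # Drift along the posterior gradient plus injected Gaussian noise.
        theta = theta + step * grad_log_post(theta) + np.sqrt(2.0 * step) * noise
        samples[k] = theta
    return samples

# Hypothetical toy model: y_i ~ N(theta, 1) with prior theta ~ N(0, 10^2).
rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=1.0, size=20)

def grad_log_post(theta):
    # Gradient of the log-likelihood plus gradient of the log-prior.
    return np.sum(y - theta) - theta / 100.0

samples = langevin_sample(grad_log_post, theta0=[0.0], step=0.01,
                          n_steps=20000, rng=rng)
burned = samples[5000:]  # discard burn-in draws

# Closed-form posterior mean of this conjugate model, for comparison.
exact_mean = y.sum() / (len(y) + 0.01)
print("Langevin estimate:", burned.mean(), "exact:", exact_mean)
```

With a fixed step the chain's stationary variance is slightly inflated relative to the true posterior, which is one reason step size choice matters for Langevin-type samplers.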
Related papers
- Physiologically Informed Deep Learning: A Multi-Scale Framework for Next-Generation PBPK Modeling [5.007023403094322]
We propose a unified Scientific Machine Learning (SciML) framework that bridges mechanistic rigor and data-driven flexibility. We introduce three contributions: (1) Foundation PBPK Transformers, which treat pharmacokinetic forecasting as a sequence modeling task; (2) Physiologically Constrained Diffusion Models (PCDM), a generative approach that uses a physics-informed loss to synthesize biologically compliant virtual patient populations; and (3) Neural Allometry, a hybrid architecture combining Graph Neural Networks (GNNs) with Neural ODEs to learn continuous cross-species scaling laws.
arXiv Detail & Related papers (2026-02-09T00:26:01Z) - A Symbolic and Statistical Learning Framework to Discover Bioprocessing Regulatory Mechanism: Cell Culture Example [2.325005809983534]
This paper introduces a symbolic and statistical learning framework to identify key regulatory mechanisms and model uncertainty. A Metropolis-adjusted Langevin algorithm with adjoint sensitivity analysis is developed for posterior exploration. An empirical study demonstrates its ability to recover missing regulatory mechanisms and improve model fidelity under data-limited conditions.
arXiv Detail & Related papers (2025-05-06T04:39:34Z) - Error Broadcast and Decorrelation as a Potential Artificial and Natural Learning Mechanism [34.75158394131716]
We introduce Error Broadcast and Decorrelation (EBD), a novel learning framework for neural networks. EBD addresses credit assignment by directly broadcasting output errors to individual layers. Our findings establish EBD as an efficient, biologically plausible, and principled alternative for neural network training.
arXiv Detail & Related papers (2025-04-15T19:00:53Z) - GENERator: A Long-Context Generative Genomic Foundation Model [66.46537421135996]
We present GENERator, a generative genomic foundation model featuring a context length of 98k base pairs (bp) and 1.2B parameters. Trained on an expansive dataset comprising 386B bp of DNA, the GENERator demonstrates state-of-the-art performance across both established and newly proposed benchmarks. It also shows significant promise in sequence optimization, particularly through the prompt-responsive generation of enhancer sequences with specific activity profiles.
arXiv Detail & Related papers (2025-02-11T05:39:49Z) - Causal Representation Learning from Multimodal Biomedical Observations [57.00712157758845]
We develop flexible identification conditions for multimodal data and principled methods to facilitate the understanding of biomedical datasets. A key theoretical contribution is the structural sparsity of causal connections between modalities. Results on a real-world human phenotype dataset are consistent with established biomedical research.
arXiv Detail & Related papers (2024-11-10T16:40:27Z) - Weakly Supervised Set-Consistency Learning Improves Morphological Profiling of Single-Cell Images [0.6491172192043603]
We propose a set-level consistency learning algorithm, Set-DINO, to improve learned representations of perturbation effects in single-cell images.
We conduct experiments on a large-scale Optical Pooled Screening dataset with more than 5000 genetic perturbations.
arXiv Detail & Related papers (2024-06-08T00:53:30Z) - BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments [112.25067497985447]
We introduce BioDiscoveryAgent, an agent that designs new experiments, reasons about their outcomes, and efficiently navigates the hypothesis space to reach desired solutions. BioDiscoveryAgent can uniquely design new experiments without the need to train a machine learning model. It achieves an average of 21% improvement in predicting relevant genetic perturbations across six datasets.
arXiv Detail & Related papers (2024-05-27T19:57:17Z) - DiscoBAX: Discovery of Optimal Intervention Sets in Genomic Experiment
Design [61.48963555382729]
We propose DiscoBAX as a sample-efficient method for maximizing the rate of significant discoveries per experiment.
We provide theoretical guarantees of approximate optimality under standard assumptions, and conduct a comprehensive experimental evaluation.
arXiv Detail & Related papers (2023-12-07T06:05:39Z) - Machine learning enabled experimental design and parameter estimation
for ultrafast spin dynamics [54.172707311728885]
We introduce a methodology that combines machine learning with Bayesian optimal experimental design (BOED).
Our method employs a neural network model for large-scale spin dynamics simulations for precise distribution and utility calculations in BOED.
Our numerical benchmarks demonstrate the superior performance of our method in guiding XPFS experiments, predicting model parameters, and yielding more informative measurements within limited experimental time.
arXiv Detail & Related papers (2023-06-03T06:19:20Z) - Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can quickly simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z) - Opportunities of Hybrid Model-based Reinforcement Learning for Cell Therapy Manufacturing Process Development and Control [6.580930850408461]
Key challenges of cell therapy manufacturing include high complexity, high uncertainty, and very limited process data.
We propose a framework named "hybrid-RL" to efficiently guide process development and control.
In the empirical study, cell therapy manufacturing examples are used to demonstrate that the proposed hybrid-RL framework can outperform the classical deterministic mechanistic model assisted process optimization.
arXiv Detail & Related papers (2022-01-10T00:01:19Z) - Policy Optimization in Bayesian Network Hybrid Models of Biomanufacturing Processes [3.124775036986647]
Biomanufacturing processes require close monitoring and control.
We develop a novel model-based reinforcement learning framework that can achieve human-level control in low-data environments.
arXiv Detail & Related papers (2021-05-13T20:39:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.