Related papers: Lessons Learned from Deploying Adaptive Machine Learning Agents with Limited Data for Real-time Cell Culture Process Monitoring

URL: http://arxiv.org/abs/2509.02606v1
Date: Fri, 29 Aug 2025 22:26:13 GMT
Title: Lessons Learned from Deploying Adaptive Machine Learning Agents with Limited Data for Real-time Cell Culture Process Monitoring
Authors: Thanh Tung Khuat, Johnny Peng, Robert Bassett, Ellen Otte, Bogdan Gabrys
Abstract summary: This study explores the deployment of three machine learning (ML) approaches for real-time prediction of glucose, lactate, and ammonium concentrations in cell culture processes. The research addresses challenges associated with limited data availability and process variability. Two industrial case studies are presented to evaluate the impact of varying bioprocess conditions on model performance.
Score: 4.920530441985874
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: This study explores the deployment of three machine learning (ML) approaches for real-time prediction of glucose, lactate, and ammonium concentrations in cell culture processes, using Raman spectroscopy as input features. The research addresses challenges associated with limited data availability and process variability, providing a comparative analysis of pretrained models, just-in-time learning (JITL), and online learning algorithms. Two industrial case studies are presented to evaluate the impact of varying bioprocess conditions on model performance. The findings highlight the specific conditions under which pretrained models demonstrate superior predictive accuracy and identify scenarios where JITL or online learning approaches are more effective for adaptive process monitoring. This study also highlights the critical importance of updating the deployed models/agents with the latest offline analytical measurements during bioreactor operations to maintain the model performance against the changes in cell growth behaviours and operating conditions throughout the bioreactor run. Additionally, the study confirms the usefulness of a simple mixture-of-experts framework in achieving enhanced accuracy and robustness for real-time predictions of metabolite concentrations based on Raman spectral data. These insights contribute to the development of robust strategies for the efficient deployment of ML models in dynamic and changing biomanufacturing environments.
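
As an illustration of the just-in-time learning (JITL) idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes Euclidean distance for neighbour selection and a closed-form ridge regression as the local model; the function names (`jitl_predict`, `update_database`) and the parameters `k` and `ridge` are hypothetical. The second function reflects the abstract's point about feeding the latest offline analytical measurements back into the deployed model.

```python
import numpy as np


def jitl_predict(X_hist, y_hist, x_query, k=20, ridge=1e-3):
    """Predict one target value by fitting a local ridge model on the
    k historical samples closest to the query (assumed JITL variant)."""
    # Euclidean distance from the query to every stored spectrum
    dists = np.linalg.norm(X_hist - x_query, axis=1)
    idx = np.argsort(dists)[:k]
    X_loc, y_loc = X_hist[idx], y_hist[idx]

    # Centre the local data so the intercept is handled implicitly
    x_mean, y_mean = X_loc.mean(axis=0), y_loc.mean()
    Xc, yc = X_loc - x_mean, y_loc - y_mean

    # Closed-form ridge solution: (Xc^T Xc + ridge * I) w = Xc^T yc
    n_features = Xc.shape[1]
    w = np.linalg.solve(Xc.T @ Xc + ridge * np.eye(n_features), Xc.T @ yc)
    return float(y_mean + (x_query - x_mean) @ w)


def update_database(X_hist, y_hist, x_new, y_new):
    """Append a newly measured (spectrum, concentration) pair so that
    later local models can draw on it."""
    return np.vstack([X_hist, x_new]), np.append(y_hist, y_new)
```

In this sketch a new local model is built for every query, so each appended measurement influences subsequent predictions without any global retraining.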

Related papers

Learning From Limited Data and Feedback for Cell Culture Process Monitoring: A Comparative Study [7.573810945509749]
In cell culture bioprocessing, real-time batch process monitoring (BPM) refers to the continuous tracking and analysis of key process variables. This study presents a benchmarking analysis of machine learning (ML) methods designed to address these challenges. We evaluate multiple ML approaches including feature dimensionality reduction, online learning, and just-in-time learning across three datasets.
arXiv Detail & Related papers (2025-12-03T05:28:33Z)
Strategies to Minimize Out-of-Distribution Effects in Data-Driven MRS Quantification [16.060904490566383]
This study systematically compared data-driven and model-based strategies for metabolite quantification in magnetic resonance spectroscopy (MRS). Supervised learning achieved high accuracy for spectra similar to those in the training distribution, but showed marked degradation when extrapolated beyond the training distribution. Test-time adaptation proved more resilient to OoD effects, while self-supervised learning achieved intermediate performance.
arXiv Detail & Related papers (2025-11-28T12:33:05Z)
From Physics to Machine Learning and Back: Part II - Learning and Observational Bias in PHM [52.64097278841485]
Review examines how incorporating learning and observational biases through physics-informed modeling and data strategies can guide models toward physically consistent and reliable predictions. Fast adaptation methods including meta-learning and few-shot learning are reviewed alongside domain generalization techniques.
arXiv Detail & Related papers (2025-09-25T14:15:43Z)
Meta-Learning Linear Models for Molecular Property Prediction [3.9685594339912633]
We introduce LAMeL - a Linear Algorithm for Meta-Learning that preserves interpretability while improving the prediction accuracy across multiple properties. Our method delivers performance improvements ranging from 1.1- to 25-fold over standard ridge regression, depending on the domain of the dataset.
arXiv Detail & Related papers (2025-09-16T20:41:45Z)
A Symbolic and Statistical Learning Framework to Discover Bioprocessing Regulatory Mechanism: Cell Culture Example [2.325005809983534]
This paper introduces a symbolic and statistical learning framework to identify key regulatory mechanisms and model uncertainty. A Metropolis-adjusted Langevin algorithm with adjoint sensitivity analysis is developed for posterior exploration. An empirical study demonstrates its ability to recover missing regulatory mechanisms and improve model fidelity under data-limited conditions.
arXiv Detail & Related papers (2025-05-06T04:39:34Z)
Overview and practical recommendations on using Shapley Values for identifying predictive biomarkers via CATE modeling [0.6990493129893112]
Shapley Additive Explanations (SHAP) has become mainstream in data science for analyzing supervised learning models. We introduce a surrogate estimation approach that is agnostic to the choice of CATE strategy. We conduct simulation benchmarking to evaluate the ability to accurately identify biomarkers using SHAP values derived from various CATE meta-learners and Causal Forest.
arXiv Detail & Related papers (2025-05-02T09:44:04Z)
Hybrid machine learning data assimilation for marine biogeochemistry [0.2383122657918106]
Marine biogeochemistry models are critical for forecasting, as well as estimating ecosystem responses to climate change and human activities. Existing DA methods struggle to update unobserved variables effectively, while ensemble-based methods are computationally too expensive for high-complexity models. This study demonstrates how machine learning can improve marine biogeochemistry DA by learning statistical relationships between observed and unobserved variables.
arXiv Detail & Related papers (2025-04-07T16:04:10Z)
What Do Learning Dynamics Reveal About Generalization in LLM Reasoning? [83.83230167222852]
We find that a model's generalization behavior can be effectively characterized by a training metric we call pre-memorization train accuracy. By connecting a model's learning behavior to its generalization, pre-memorization train accuracy can guide targeted improvements to training strategies.
arXiv Detail & Related papers (2024-11-12T09:52:40Z)
Uncertainty Quantification Using Ensemble Learning and Monte Carlo Sampling for Performance Prediction and Monitoring in Cell Culture Processes [7.762212551172391]
Monoclonal antibodies (mAbs) have gained prominence in the pharmaceutical market due to their high specificity and efficacy. The application of machine learning models in mAb development and manufacturing is gaining momentum. This paper addresses the critical need for uncertainty quantification in machine learning predictions.
arXiv Detail & Related papers (2024-09-03T09:38:32Z)
Predictive Analytics of Varieties of Potatoes [2.336821989135698]
We explore the application of machine learning algorithms specifically to enhance the selection process of Russet potato clones in breeding trials. This study addresses the challenge of efficiently identifying high-yield, disease-resistant, and climate-resilient potato varieties.
arXiv Detail & Related papers (2024-04-04T00:49:05Z)
Learning Objective-Specific Active Learning Strategies with Attentive Neural Processes [72.75421975804132]
Learning Active Learning (LAL) proposes learning the active learning strategy itself, allowing it to adapt to the given setting. We propose a novel LAL method for classification that exploits symmetry and independence properties of the active learning problem. Our approach is based on learning from a myopic oracle, which gives our model the ability to adapt to non-standard objectives.
arXiv Detail & Related papers (2023-09-11T14:16:37Z)
Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel Machine Learning architecture, which allows us to infuse a neural deep network with human-powered abstraction on the level of data. Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z)
Deep Low-Shot Learning for Biological Image Classification and Visualization from Limited Training Samples [52.549928980694695]
In situ hybridization (ISH) gene expression pattern images from the same developmental stage are compared. Labeling training data with precise stages is very time-consuming even for biologists. We propose a deep two-step low-shot learning framework to accurately classify ISH images using limited training images.
arXiv Detail & Related papers (2020-10-20T06:06:06Z)
Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources [78.72922528736011]
We propose a novel approach, black-box adversarial reprogramming (BAR), that repurposes a well-trained black-box machine learning model. Using zeroth order optimization and multi-label mapping techniques, BAR can reprogram a black-box ML model solely based on its input-output responses. BAR outperforms state-of-the-art methods and yields comparable performance to the vanilla adversarial reprogramming method.
arXiv Detail & Related papers (2020-07-17T01:52:34Z)
