Learning From Limited Data and Feedback for Cell Culture Process Monitoring: A Comparative Study
- URL: http://arxiv.org/abs/2512.03460v1
- Date: Wed, 03 Dec 2025 05:28:33 GMT
- Title: Learning From Limited Data and Feedback for Cell Culture Process Monitoring: A Comparative Study
- Authors: Johnny Peng, Thanh Tung Khuat, Ellen Otte, Katarzyna Musial, Bogdan Gabrys
- Abstract summary: In cell culture bioprocessing, real-time batch process monitoring (BPM) refers to the continuous tracking and analysis of key process variables. This study presents a benchmarking analysis of machine learning (ML) methods designed to address these challenges. We evaluate multiple ML approaches including feature dimensionality reduction, online learning, and just-in-time learning across three datasets.
- Score: 7.573810945509749
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In cell culture bioprocessing, real-time batch process monitoring (BPM) refers to the continuous tracking and analysis of key process variables such as viable cell density, nutrient levels, metabolite concentrations, and product titer throughout the duration of a batch run. This enables early detection of deviations and supports timely control actions to ensure optimal cell growth and product quality. BPM plays a critical role in ensuring the quality and regulatory compliance of biopharmaceutical manufacturing processes. However, the development of accurate soft sensors for BPM is hindered by key challenges, including limited historical data, infrequent feedback, heterogeneous process conditions, and high-dimensional sensory inputs. This study presents a comprehensive benchmarking analysis of machine learning (ML) methods designed to address these challenges, with a focus on learning from historical data with limited volume and relevance in the context of bioprocess monitoring. We evaluate multiple ML approaches including feature dimensionality reduction, online learning, and just-in-time learning across three datasets, one in silico dataset and two real-world experimental datasets. Our findings highlight the importance of training strategies in handling limited data and feedback, with batch learning proving effective in homogeneous settings, while just-in-time learning and online learning demonstrate superior adaptability in cold-start scenarios. Additionally, we identify key meta-features, such as feed media composition and process control strategies, that significantly impact model transferability. The results also suggest that integrating Raman-based predictions with lagged offline measurements enhances monitoring accuracy, offering a promising direction for future bioprocess soft sensor development.
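To make the just-in-time learning idea named in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes one common form of just-in-time learning: for each query, select the k nearest historical samples and fit a small local ridge regression on them. The function name `jit_predict`, the parameters `k` and `ridge`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def jit_predict(X_hist, y_hist, x_query, k=10, ridge=1e-3):
    """Predict one target value with a local model built at query time.

    Assumption: neighbours are chosen by Euclidean distance and the local
    model is a ridge regression; the paper may use other choices.
    """
    X_hist = np.asarray(X_hist, dtype=float)
    y_hist = np.asarray(y_hist, dtype=float)
    x_query = np.asarray(x_query, dtype=float)
    k = min(k, len(X_hist))

    # Distance from the query to every historical sample.
    dist = np.linalg.norm(X_hist - x_query, axis=1)
    idx = np.argsort(dist)[:k]

    # Local design matrix with an intercept column.
    A = np.hstack([np.ones((k, 1)), X_hist[idx]])

    # Ridge penalty on the slopes only; the intercept is left unpenalised.
    reg = ridge * np.eye(A.shape[1])
    reg[0, 0] = 0.0
    coef = np.linalg.solve(A.T @ A + reg, A.T @ y_hist[idx])

    return float(np.concatenate(([1.0], x_query)) @ coef)


# Synthetic stand-in for historical batch data (not from the paper).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
y_demo = 2.0 * X_demo[:, 0] - X_demo[:, 1] + 3.0
print(jit_predict(X_demo, y_demo, [0.1, -0.2]))
```

In a monitoring setting, each row of `X_hist` could hold reduced spectral features together with the most recent offline measurement, which is one way to read the abstract's suggestion of combining Raman-based predictions with lagged offline measurements. That feature layout is an assumption here, not something the abstract specifies.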
Related papers
- Quantum Synthetic Data Generation for Industrial Bioprocess Monitoring [0.0]
Data scarcity and sparsity in bio-manufacturing pose challenges for accurate model development, process monitoring, and optimization. We propose the use of a Quantum Wasserstein Generative Adversarial Network with Gradient Penalty (QWGAN-GP) to generate synthetic time series data for industrially relevant processes.
arXiv Detail & Related papers (2025-10-20T16:04:39Z)
- From Physics to Machine Learning and Back: Part II - Learning and Observational Bias in PHM [52.64097278841485]
Review examines how incorporating learning and observational biases through physics-informed modeling and data strategies can guide models toward physically consistent and reliable predictions. Fast adaptation methods including meta-learning and few-shot learning are reviewed alongside domain generalization techniques.
arXiv Detail & Related papers (2025-09-25T14:15:43Z)
- Lessons Learned from Deploying Adaptive Machine Learning Agents with Limited Data for Real-time Cell Culture Process Monitoring [4.920530441985874]
This study explores the deployment of three machine learning (ML) approaches for real-time prediction of glucose, lactate, and ammonium concentrations in cell culture processes. The research addresses challenges associated with limited data availability and process variability. Two industrial case studies are presented to evaluate the impact of varying bioprocess conditions on model performance.
arXiv Detail & Related papers (2025-08-29T22:26:13Z)
- Machine Learning Methods for Small Data and Upstream Bioprocessing Applications: A Comprehensive Review [13.205760966688619]
Data is crucial for machine learning (ML) applications, yet acquiring large datasets can be costly and time-consuming. This review explores ML methods designed to address the challenges posed by small data and classifies them into a taxonomy to guide practical applications. By analysing how these methods tackle small data challenges from different perspectives, this review provides actionable insights.
arXiv Detail & Related papers (2025-06-14T03:13:05Z)
- Uncertainty Quantification Using Ensemble Learning and Monte Carlo Sampling for Performance Prediction and Monitoring in Cell Culture Processes [7.762212551172391]
Monoclonal antibodies (mAbs) have gained prominence in the pharmaceutical market due to their high specificity and efficacy.
The application of machine learning models in mAb development and manufacturing is gaining momentum.
This paper addresses the critical need for uncertainty quantification in machine learning predictions.
arXiv Detail & Related papers (2024-09-03T09:38:32Z)
- Clairvoyance: A Pipeline Toolkit for Medical Time Series [95.22483029602921]
Time-series learning is the bread and butter of data-driven *clinical decision support*.
Clairvoyance proposes a unified, end-to-end, autoML-friendly pipeline that serves as a software toolkit.
Clairvoyance is the first to demonstrate viability of a comprehensive and automatable pipeline for clinical time-series ML.
arXiv Detail & Related papers (2023-10-28T12:08:03Z)
- Machine learning in bioprocess development: From promise to practice [58.720142291102135]
Data-driven methods like machine learning (ML) approaches have a high potential to rationally explore large design spaces.
The aim of this review is to demonstrate how ML methods have been applied so far in bioprocess development.
arXiv Detail & Related papers (2022-10-04T13:48:59Z)
- SSM-DTA: Breaking the Barriers of Data Scarcity in Drug-Target Affinity Prediction [127.43571146741984]
Drug-Target Affinity (DTA) is of vital importance in early-stage drug discovery.
Wet experiments remain the most reliable method, but they are time-consuming and resource-intensive.
Existing methods have primarily focused on developing techniques based on the available DTA data, without adequately addressing the data scarcity issue.
We present the SSM-DTA framework, which incorporates three simple yet highly effective strategies.
arXiv Detail & Related papers (2022-06-20T14:53:25Z)
- Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel Machine Learning architecture, which allows us to infuse a neural deep network with human-powered abstraction on the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z)
- Deep Learning for Virtual Screening: Five Reasons to Use ROC Cost Functions [80.12620331438052]
Deep learning has become an important tool for rapid screening of billions of molecules in silico for potential hits containing desired chemical features.
Despite its importance, substantial challenges persist in training these models, such as severe class imbalance, high decision thresholds, and lack of ground truth labels in some datasets.
We argue in favor of directly optimizing the receiver operating characteristic (ROC) in such cases, due to its robustness to class imbalance.
arXiv Detail & Related papers (2020-06-25T08:46:37Z)