PREFER: An Ontology for the PREcision FERmentation Community
- URL: http://arxiv.org/abs/2602.16755v1
- Date: Wed, 18 Feb 2026 09:29:30 GMT
- Title: PREFER: An Ontology for the PREcision FERmentation Community
- Authors: Txell Amigó, Shawn Zheng Kai Tan, Angel Luu Phanthanourak, Sebastian Schulz, Pasquale D. Colaianni, Dominik M. Maszczyk, Ester Milesi, Ivan Schlembach, Mykhaylo Semenov Petrov, Marta Reventós Montané, Lars K. Nielsen, Jochen Förster, Bernhard Ø. Palsson, Suresh Sudarsan, Alberto Santos
- Abstract summary: PREFER is an open-source ontology designed to establish a unified standard for bioprocess data. PREFER is built in alignment with the widely adopted Basic Formal Ontology (BFO).
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Precision fermentation relies on microbial cell factories to produce sustainable food, pharmaceuticals, chemicals, and biofuels. Specialized laboratories such as biofoundries are advancing these processes using high-throughput bioreactor platforms, which generate vast datasets. However, the lack of community standards limits data accessibility and interoperability, preventing integration across platforms. To address this, we introduce PREFER, an open-source ontology designed to establish a unified standard for bioprocess data. Built in alignment with the widely adopted Basic Formal Ontology (BFO) and connected to several other community ontologies, PREFER ensures consistency and cross-domain compatibility and covers the whole precision fermentation process. Integrating PREFER into high-throughput bioprocess development workflows enables structured metadata that supports automated cross-platform execution and high-fidelity data capture. Furthermore, PREFER's standardization has the potential to bridge disparate data silos, generating the machine-actionable datasets needed to train robust predictive machine learning models in synthetic biology. This work provides the foundation for scalable, interoperable bioprocess systems and supports the transition toward more data-driven bioproduction.
Related papers
- TokaMark: A Comprehensive Benchmark for MAST Tokamak Plasma Models [56.94569090844015]
TokaMark is a structured benchmark to evaluate AI models on real experimental data collected from the Mega Ampere Spherical Tokamak (MAST). TokaMark aims to accelerate progress in data-driven AI-based plasma modeling, contributing to the broader goal of achieving sustainable and stable fusion energy.
arXiv Detail & Related papers (2026-02-05T16:49:44Z)
- Improving the Generation and Evaluation of Synthetic Data for Downstream Medical Causal Inference [89.5628648718851]
Causal inference is essential for developing and evaluating medical interventions. Real-world medical datasets are often difficult to access due to regulatory barriers. We present STEAM: a novel method for generating Synthetic data for Treatment Effect Analysis in Medicine.
arXiv Detail & Related papers (2025-10-21T16:16:00Z)
- Quantum Synthetic Data Generation for Industrial Bioprocess Monitoring [0.0]
Data scarcity and sparsity in bio-manufacturing pose challenges for accurate model development, process monitoring, and optimization. We propose the use of a Quantum Wasserstein Generative Adversarial Network with Gradient Penalty (QWGAN-GP) to generate synthetic time series data for industrially relevant processes.
arXiv Detail & Related papers (2025-10-20T16:04:39Z)
- CellPainTR: Generalizable Representation Learning for Cross-Dataset Cell Painting Analysis [51.56484100374058]
We introduce CellPainTR, a Transformer-based architecture designed to learn foundational representations of cellular morphology. Our work represents a significant step towards creating truly foundational models for image-based profiling, enabling more reliable and scalable cross-study biological analysis.
arXiv Detail & Related papers (2025-09-02T03:30:07Z)
- Causal Representation Learning from Multimodal Biomedical Observations [57.00712157758845]
We develop flexible identification conditions for multimodal data and principled methods to facilitate the understanding of biomedical datasets. A key theoretical contribution is the structural sparsity of causal connections between modalities. Results on a real-world human phenotype dataset are consistent with established biomedical research.
arXiv Detail & Related papers (2024-11-10T16:40:27Z)
- Applications of Machine Learning in Biopharmaceutical Process Development and Manufacturing: Current Trends, Challenges, and Opportunities [7.762212551172391]
Machine learning (ML) has made significant contributions to the biopharmaceutical field.
Its applications are still in the early stages in terms of providing direct support for quality-by-design based development and manufacturing of biopharmaceuticals.
This paper aims to provide a comprehensive review of the current applications of ML solutions in bioproduct design, monitoring, control, and optimisation of upstream, downstream, and product formulation processes.
arXiv Detail & Related papers (2023-10-16T00:35:24Z)
- Multi-fidelity Gaussian Process for Biomanufacturing Process Modeling with Small Data [1.4687789417816917]
We propose to use a statistical machine learning approach, multi-fidelity Gaussian process, for process modelling in biomanufacturing.
We apply the multi-fidelity Gaussian process to solve two significant problems in biomanufacturing, bioreactor scale-up and knowledge transfer across cell lines, and demonstrate its efficacy on real-world datasets.
arXiv Detail & Related papers (2022-11-26T06:38:34Z)
- Machine learning in bioprocess development: From promise to practice [58.720142291102135]
Data-driven methods like machine learning (ML) approaches have a high potential to rationally explore large design spaces.
The aim of this review is to demonstrate how ML methods have been applied so far in bioprocess development.
arXiv Detail & Related papers (2022-10-04T13:48:59Z)
- When Bioprocess Engineering Meets Machine Learning: A Survey from the Perspective of Automated Bioprocess Development [3.687740185234604]
Machine learning (ML) has significantly contributed to the development of bioprocess engineering, but its application is still limited.
This review provides a comprehensive overview of ML-based automation in bioprocess development.
arXiv Detail & Related papers (2022-09-02T14:30:49Z)
- Policy Optimization in Bayesian Network Hybrid Models of Biomanufacturing Processes [3.124775036986647]
Biomanufacturing processes require close monitoring and control.
We develop a novel model-based reinforcement learning framework that can achieve human-level control in low-data environments.
arXiv Detail & Related papers (2021-05-13T20:39:02Z)
- Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel machine learning architecture that allows us to infuse a deep neural network with human-powered abstraction at the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.