Trend Filtered Mixture of Experts for Automated Gating of High-Frequency Flow Cytometry Data
- URL: http://arxiv.org/abs/2504.12287v1
- Date: Wed, 16 Apr 2025 17:51:59 GMT
- Authors: Sangwon Hyun, Tim Coleman, Francois Ribalet, Jacob Bien
- Abstract summary: Ocean microbes are critical to both ocean ecosystems and the global climate. Despite decades of accumulated data, identifying key microbial populations remains a significant analytical challenge. We propose a novel mixture-of-experts model in which both the gating function and the experts are given by trend filtering.
- Score: 3.541118865937421
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Ocean microbes are critical to both ocean ecosystems and the global climate. Flow cytometry, which measures cell optical properties in fluid samples, is routinely used in oceanographic research. Despite decades of accumulated data, identifying key microbial populations (a process known as "gating") remains a significant analytical challenge. To address this, we focus on gating multidimensional, high-frequency flow cytometry data collected continuously on board oceanographic research vessels, capturing time- and space-wise variations in the dynamic ocean. Our paper proposes a novel mixture-of-experts model in which both the gating function and the experts are given by trend filtering. The model leverages two key assumptions: (1) Each snapshot of flow cytometry data is a mixture of multivariate Gaussians and (2) the parameters of these Gaussians vary smoothly over time. Our method uses regularization and a constraint to ensure smoothness and that cluster means match biologically distinct microbe types. We demonstrate, using flow cytometry data from the North Pacific Ocean, that our proposed model accurately matches human-annotated gating and corrects significant errors.
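The smoothness assumption in the abstract can be illustrated with a generic trend filtering penalty, which is the L1 norm of the discrete differences of a parameter sequence over time. The sketch below is not the authors' implementation: the function names `difference_matrix` and `trend_filter_penalty`, the choice of second-order differences, and the toy sequences are all assumptions made for illustration, and conventions for the difference order vary between references.

```python
import numpy as np


def difference_matrix(T, order):
    """Return the (T - order) x T discrete difference operator (hypothetical helper)."""
    D = np.eye(T)
    for _ in range(order):
        # Each pass subtracts consecutive rows, raising the difference order by one.
        D = np.diff(D, axis=0)
    return D


def trend_filter_penalty(theta, order=2):
    """L1 norm of the order-th discrete differences of theta (shape T or T x d)."""
    theta = np.asarray(theta, dtype=float)
    D = difference_matrix(len(theta), order)
    return float(np.abs(D @ theta).sum())


# Toy "cluster mean over time" sequences (illustrative data, not from the paper).
linear = np.arange(10.0)                                          # perfectly linear trend
kinked = np.concatenate([np.arange(5.0), 4.0 - np.arange(5.0)])   # rises, then falls

penalty_linear = trend_filter_penalty(linear)
penalty_kinked = trend_filter_penalty(kinked)
print(penalty_linear, penalty_kinked)
```

With second-order differences, a sequence that is linear in time incurs no penalty, while each change in slope adds to it. A penalty of this kind therefore favors parameter paths that are piecewise linear with few kinks, which is one way to encode "varies smoothly over time".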
Related papers
- Analyzing Spatio-Temporal Dynamics of Dissolved Oxygen for the River Thames using Superstatistical Methods and Machine Learning [0.0]
We use superstatistical methods and machine learning to predict dissolved oxygen levels in the River Thames. For long-term forecasting, the Informer model consistently delivers superior performance.
arXiv Detail & Related papers (2025-01-10T16:54:52Z) - Meta Flow Matching: Integrating Vector Fields on the Wasserstein Manifold [83.18058549195855]
We argue that multiple processes in natural sciences have to be represented as vector fields on the Wasserstein manifold of probability densities. In particular, this is crucial for personalized medicine where the development of diseases and their respective treatment response depend on the microenvironment of cells specific to each patient. We propose Meta Flow Matching (MFM), a practical approach to integrate along these vector fields on the Wasserstein manifold by amortizing the flow model over the initial populations.
arXiv Detail & Related papers (2024-08-26T20:05:31Z) - Learning rheological parameters of non-Newtonian fluids from velocimetry data [46.2482873419289]
We devise an algorithm that learns the most likely Carreau parameters of a shear-thinning fluid. We show that the algorithm can successfully reconstruct the flow field by learning the most likely Carreau parameters.
arXiv Detail & Related papers (2024-08-05T16:27:38Z) - OXYGENERATOR: Reconstructing Global Ocean Deoxygenation Over a Century with Deep Learning [50.365198230613956]
Existing expert-dominated numerical simulations fail to catch up with the dynamic variation caused by global warming and human activities.
We propose OxyGenerator, the first deep learning based model, to reconstruct the global ocean deoxygenation from 1920 to 2023.
arXiv Detail & Related papers (2024-05-12T09:32:40Z) - Seeing Unseen: Discover Novel Biomedical Concepts via Geometry-Constrained Probabilistic Modeling [53.7117640028211]
We present a geometry-constrained probabilistic modeling treatment to resolve the identified issues.
We incorporate a suite of critical geometric properties to impose proper constraints on the layout of constructed embedding space.
A spectral graph-theoretic method is devised to estimate the number of potential novel classes.
arXiv Detail & Related papers (2024-03-02T00:56:05Z) - Machine Learning for Flow Cytometry Data Analysis [0.0]
Flow cytometers can rapidly analyse tens of thousands of cells at the same time while also measuring multiple parameters from a single cell.
Researchers need to be able to distinguish interesting-looking cell populations manually in multi-dimensional data collected from millions of cells.
Three representative automated clustering algorithms are selected to be applied, compared and evaluated by completely and partially automated gating.
arXiv Detail & Related papers (2023-03-16T00:43:46Z) - Multi-Target Tobit Models for Completing Water Quality Data [0.0]
The Tobit model is a well-known linear regression model for analyzing censored data.
In this study, we devised a novel extension of the classical Tobit model, called the multi-target Tobit model, to handle multiple censored variables simultaneously.
Experiments conducted using several real-world water quality datasets provided evidence that estimating multiple columns jointly gains a great advantage over estimating them separately.
arXiv Detail & Related papers (2023-02-21T13:06:19Z) - Vision meets algae: A novel way for microalgae recognization and health monitor [6.731844884087066]
This dataset includes images of different genera of algae and of the same genus in different states.
We trained, validated and tested the TOOD, YOLOv5, YOLOv8 and variants of RCNN algorithms on this dataset.
The results showed both one-stage and two-stage object detection models can achieve high mean average precision.
arXiv Detail & Related papers (2022-11-14T17:11:15Z) - Data-Efficient Learning via Minimizing Hyperspherical Energy [48.47217827782576]
This paper considers the problem of data-efficient learning from scratch using a small amount of representative data.
We propose a MHE-based active learning (MHEAL) algorithm, and provide comprehensive theoretical guarantees for MHEAL.
arXiv Detail & Related papers (2022-06-30T11:39:12Z) - Leveraging Global Parameters for Flow-based Neural Posterior Estimation [90.21090932619695]
Inferring the parameters of a model based on experimental observations is central to the scientific method.
A particularly challenging setting is when the model is strongly indeterminate, i.e., when distinct sets of parameters yield identical observations.
We present a method for cracking such indeterminacy by exploiting additional information conveyed by an auxiliary set of observations sharing global parameters.
arXiv Detail & Related papers (2021-02-12T12:23:13Z) - Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel Machine Learning architecture, which allows us to infuse a neural deep network with human-powered abstraction on the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z) - Modeling Cell Populations Measured By Flow Cytometry With Covariates Using Sparse Mixture of Regressions [2.5463557459240955]
The ocean is filled with microscopic microalgae called phytoplankton, which together are responsible for as much photosynthesis as all plants on land combined.
Our ability to predict their response to the warming ocean relies on understanding how the dynamics of phytoplankton populations is influenced by changes in environmental conditions.
Today, oceanographers are able to collect flow data in real-time onboard a moving ship, providing them with fine-scale resolution of the distribution of phytoplankton across thousands of kilometers.
arXiv Detail & Related papers (2020-08-25T20:03:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.