MMIL: A novel algorithm for disease associated cell type discovery
- URL: http://arxiv.org/abs/2406.08322v1
- Date: Wed, 12 Jun 2024 15:22:56 GMT
- Title: MMIL: A novel algorithm for disease associated cell type discovery
- Authors: Erin Craig, Timothy Keyes, Jolanda Sarno, Maxim Zaslavsky, Garry Nolan, Kara Davis, Trevor Hastie, Robert Tibshirani
- Abstract summary: Single-cell datasets often lack individual cell labels, making it challenging to identify cells associated with disease.
We introduce Mixture Modeling for Multiple Instance Learning (MMIL), an expectation maximization method that enables the training and calibration of cell-level classifiers using patient-level labels.
- Score: 58.044870442206914
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Single-cell datasets often lack individual cell labels, making it challenging to identify cells associated with disease. To address this, we introduce Mixture Modeling for Multiple Instance Learning (MMIL), an expectation maximization method that enables the training and calibration of cell-level classifiers using patient-level labels. Our approach can be used to train e.g. lasso logistic regression models, gradient boosted trees, and neural networks. When applied to clinically-annotated, primary patient samples in Acute Myeloid Leukemia (AML) and Acute Lymphoblastic Leukemia (ALL), our method accurately identifies cancer cells, generalizes across tissues and treatment timepoints, and selects biologically relevant features. In addition, MMIL is capable of incorporating cell labels into model training when they are known, providing a powerful framework for leveraging both labeled and unlabeled data simultaneously. Mixture Modeling for MIL offers a novel approach for cell classification, with significant potential to advance disease understanding and management, especially in scenarios with unknown gold-standard labels and high dimensionality.
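The abstract does not spell out the update rules, so the following is only a minimal sketch of the general idea of expectation maximization with patient-level labels, not the authors' implementation. It assumes that every cell from a healthy patient is healthy and that cells from sick patients are a mixture of healthy and disease-associated cells. As a stand-in for the classifiers named in the abstract, it uses a one-dimensional Gaussian mixture with a shared variance; the function names `simulate_cells` and `fit_mixture_em` and all parameter values are hypothetical.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of a one-dimensional Gaussian."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def simulate_cells(seed=0, n_healthy=500, n_sick_healthy=350, n_sick_disease=150):
    """Toy data: one marker per cell; disease cells are shifted by +3 (assumption)."""
    rng = random.Random(seed)
    healthy_cells = [rng.gauss(0.0, 1.0) for _ in range(n_healthy)]
    # Cells from sick patients: healthy cells first, then disease cells.
    sick_cells = [rng.gauss(0.0, 1.0) for _ in range(n_sick_healthy)]
    sick_cells += [rng.gauss(3.0, 1.0) for _ in range(n_sick_disease)]
    return healthy_cells, sick_cells


def fit_mixture_em(healthy_cells, sick_cells, n_iter=100):
    """EM over cells from sick patients, modeled as a two-component mixture.

    Assumption: the healthy component is estimated once from healthy-patient
    cells and then held fixed; both components share its standard deviation.
    """
    n0 = len(healthy_cells)
    mu0 = sum(healthy_cells) / n0
    sigma = math.sqrt(sum((x - mu0) ** 2 for x in healthy_cells) / n0)

    # Crude starting point: place the disease component at the largest value.
    mu1 = max(sick_cells)
    pi = 0.5
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability that each cell is disease-associated.
        resp = []
        for x in sick_cells:
            p1 = pi * normal_pdf(x, mu1, sigma)
            p0 = (1.0 - pi) * normal_pdf(x, mu0, sigma)
            resp.append(p1 / (p1 + p0))
        # M-step: update the mixing proportion and the disease-component mean.
        total = sum(resp)
        pi = total / len(sick_cells)
        mu1 = sum(r * x for r, x in zip(resp, sick_cells)) / total
    return {"mu0": mu0, "mu1": mu1, "sigma": sigma, "pi": pi, "resp": resp}


healthy, sick = simulate_cells(seed=0)
fit = fit_mixture_em(healthy, sick)
print("estimated disease fraction in sick patients:", round(fit["pi"], 3))
print("estimated disease-component mean:", round(fit["mu1"], 3))
```

The per-cell values in `fit["resp"]` play the role of soft cell labels: no cell was ever labeled individually, yet cells with high marker values in sick patients receive a high posterior probability of being disease-associated. A discriminative variant would replace the Gaussian densities with a classifier refit on these soft labels at each iteration.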
Related papers
- Self-Supervised Multiple Instance Learning for Acute Myeloid Leukemia Classification [1.1874560263468232]
Diseases like Acute Myeloid Leukemia (AML) pose challenges because annotations at the single-cell level are scarce and costly.
Multiple Instance Learning (MIL) addresses weakly labeled scenarios but necessitates powerful encoders typically trained with labeled data.
In this study, we explore Self-Supervised Learning (SSL) as a pre-training approach for MIL-based AML subtype classification from blood smears.
arXiv Detail & Related papers (2024-03-08T15:16:15Z)
- FlowCyt: A Comparative Study of Deep Learning Approaches for Multi-Class Classification in Flow Cytometry Benchmarking [1.6712896227173808]
FlowCyt is the first comprehensive benchmark for multi-class single-cell classification in flow cytometry data.
The dataset comprises bone marrow samples from 30 patients, with each cell characterized by twelve markers.
arXiv Detail & Related papers (2024-02-28T15:01:59Z)
- UniCell: Universal Cell Nucleus Classification via Prompt Learning [76.11864242047074]
We propose a universal cell nucleus classification framework (UniCell).
It employs a novel prompt learning mechanism to uniformly predict the corresponding categories of pathological images from different dataset domains.
In particular, our framework adopts an end-to-end architecture for nuclei detection and classification, and utilizes flexible prediction heads for adapting various datasets.
arXiv Detail & Related papers (2024-02-20T11:50:27Z)
- Mixed Models with Multiple Instance Learning [51.440557223100164]
We introduce MixMIL, a framework integrating Generalized Linear Mixed Models (GLMM) and Multiple Instance Learning (MIL).
Our empirical results reveal that MixMIL outperforms existing MIL models in single-cell datasets.
arXiv Detail & Related papers (2023-11-04T16:42:42Z)
- MIML: Multiplex Image Machine Learning for High Precision Cell Classification via Mechanical Traits within Microfluidic Systems [1.1675184588181313]
We develop a novel machine learning framework, Multiplex Image Machine Learning (MIML).
MIML combines label-free cell images with biomechanical property data, harnessing the vast, often underutilized morphological information intrinsic to each cell.
This approach has led to a remarkable 98.3% accuracy in cell classification, a substantial improvement over models that only consider a single data type.
arXiv Detail & Related papers (2023-09-15T14:23:51Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
- Towards an Automatic Analysis of CHO-K1 Suspension Growth in Microfluidic Single-cell Cultivation [63.94623495501023]
We propose a novel Machine Learning architecture, which allows us to infuse a deep neural network with human-powered abstraction on the level of data.
Specifically, we train a generative model simultaneously on natural and synthetic data, so that it learns a shared representation, from which a target variable, such as the cell count, can be reliably estimated.
arXiv Detail & Related papers (2020-10-20T08:36:51Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, which is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- Cell Mechanics Based Computational Classification of Red Blood Cells Via Machine Intelligence Applied to Morpho-Rheological Markers [0.0]
Unsupervised machine learning methodology is applied exclusively to morpho-rheological markers obtained by real-time deformability and fluorescence cytometry (RT-FDC).
Our approach reports promising label-free results in the classification of reticulocytes from mature red blood cells.
arXiv Detail & Related papers (2020-03-02T15:11:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.