Deep Learning Approaches for Blood Disease Diagnosis Across Hematopoietic Lineages
- URL: http://arxiv.org/abs/2503.20049v1
- Date: Tue, 25 Mar 2025 20:11:10 GMT
- Title: Deep Learning Approaches for Blood Disease Diagnosis Across Hematopoietic Lineages
- Authors: Gabriel Bo, Justin Gu, Christopher Sun
- Abstract summary: We present a foundation modeling framework that leverages deep learning to uncover latent genetic signatures across the hematopoietic hierarchy. Our approach trains a fully connected autoencoder on multipotent progenitor cells, reducing over 20,000 gene features to a 256-dimensional latent space. We validate the quality of these embeddings by training feed-forward, transformer, and graph convolutional architectures for blood disease diagnosis tasks. Our models achieve greater than 95% accuracy for multi-class classification, and in the zero-shot setting, we achieve greater than 0.7 F1-score on the binary classification task.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a foundation modeling framework that leverages deep learning to uncover latent genetic signatures across the hematopoietic hierarchy. Our approach trains a fully connected autoencoder on multipotent progenitor cells, reducing over 20,000 gene features to a 256-dimensional latent space that captures predictive information for both progenitor and downstream differentiated cells such as monocytes and lymphocytes. We validate the quality of these embeddings by training feed-forward, transformer, and graph convolutional architectures for blood disease diagnosis tasks. We also explore zero-shot prediction using a progenitor disease state classification model to classify downstream cell conditions. Our models achieve greater than 95% accuracy for multi-class classification, and in the zero-shot setting, we achieve greater than 0.7 F1-score on the binary classification task. Future work should improve embeddings further to increase robustness on lymphocyte classification specifically.
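The dimensionality reduction described in the abstract can be illustrated with a minimal sketch of a fully connected autoencoder forward pass. Only the input size (about 20,000 gene features) and the 256-dimensional latent space come from the abstract; the single-layer encoder and decoder, the ReLU activation, the weight initialization, the mean squared error loss, and all function names are assumptions made for illustration, and the training loop is omitted.

```python
import numpy as np


def init_autoencoder(n_genes, latent_dim, seed=0):
    """Create random weights for a one-layer encoder and one-layer decoder.

    The layer layout and scaling are illustrative assumptions, not the
    architecture used in the paper.
    """
    rng = np.random.default_rng(seed)
    enc_scale = np.sqrt(2.0 / n_genes)
    dec_scale = np.sqrt(2.0 / latent_dim)
    return {
        "W_enc": (rng.standard_normal((n_genes, latent_dim)) * enc_scale).astype(np.float32),
        "b_enc": np.zeros(latent_dim, dtype=np.float32),
        "W_dec": (rng.standard_normal((latent_dim, n_genes)) * dec_scale).astype(np.float32),
        "b_dec": np.zeros(n_genes, dtype=np.float32),
    }


def encode(params, x):
    """Map expression vectors (cells x genes) to latent embeddings with ReLU."""
    return np.maximum(x @ params["W_enc"] + params["b_enc"], 0.0)


def decode(params, z):
    """Map latent embeddings back to gene-expression space (linear output)."""
    return z @ params["W_dec"] + params["b_dec"]


def reconstruction_loss(params, x):
    """Mean squared error between the input and its reconstruction."""
    x_hat = decode(params, encode(params, x))
    return float(np.mean((x - x_hat) ** 2))


# Synthetic stand-in for expression data: 8 cells x 20,000 genes.
n_genes, latent_dim = 20000, 256
params = init_autoencoder(n_genes, latent_dim)
cells = np.random.default_rng(1).random((8, n_genes), dtype=np.float32)

embeddings = encode(params, cells)
print(embeddings.shape)  # one 256-dimensional embedding per cell
print(reconstruction_loss(params, cells))
```

In a setup like the one the abstract describes, the weights would be fitted by minimizing the reconstruction loss on progenitor cells, and the output of `encode` would then serve as the input features for the downstream classifiers.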
Related papers
- TEDDY: A Family Of Foundation Models For Understanding Single Cell Biology [6.289686541194788]
Existing foundation models either do not improve or only modestly improve over task-specific models in downstream applications. We scaled the pre-training dataset to 116 million cells, which is larger than those used by previous models. We trained the TEDDY family of models comprising six transformer-based state-of-the-art single-cell foundation models with 70 million, 160 million, and 400 million parameters.
arXiv Detail & Related papers (2025-03-05T13:24:57Z)
- Detection and Classification of Acute Lymphoblastic Leukemia Utilizing Deep Transfer Learning [0.0]
This study proposes a novel approach for diagnosing leukemia across four stages. We employed two Convolutional Neural Network (CNN) models: MobileNetV2 with an altered head, and a custom model. The custom model achieved an accuracy of 98.6%, while MobileNetV2 attained a superior accuracy of 99.69%.
arXiv Detail & Related papers (2025-01-24T04:16:03Z)
- MMIL: A novel algorithm for disease associated cell type discovery [58.044870442206914]
Single-cell datasets often lack individual cell labels, making it challenging to identify cells associated with disease.
We introduce Mixture Modeling for Multiple Instance Learning (MMIL), an expectation maximization method that enables the training and calibration of cell-level classifiers.
arXiv Detail & Related papers (2024-06-12T15:22:56Z)
- Using Pre-training and Interaction Modeling for ancestry-specific disease prediction in UK Biobank [69.90493129893112]
Recent genome-wide association studies (GWAS) have uncovered the genetic basis of complex traits, but individuals of non-European descent are under-represented in them.
Here, we assess whether we can improve disease prediction across diverse ancestries using multiomic data.
arXiv Detail & Related papers (2024-04-26T16:39:50Z)
- DinoBloom: A Foundation Model for Generalizable Cell Embeddings in Hematology [1.3551232282678036]
We introduce DinoBloom, the first foundation model for single cell images in hematology.
Our model is built upon an extensive collection of 13 diverse, publicly available datasets of peripheral blood and bone marrow smears.
A family of four DinoBloom models can be adapted for a wide range of downstream applications.
arXiv Detail & Related papers (2024-04-07T17:25:52Z)
- Tertiary Lymphoid Structures Generation through Graph-based Diffusion [54.37503714313661]
In this work, we leverage state-of-the-art graph-based diffusion models to generate biologically meaningful cell-graphs.
We show that the adopted graph diffusion model is able to accurately learn the distribution of cells in terms of their tertiary lymphoid structures (TLS) content.
arXiv Detail & Related papers (2023-10-10T14:37:17Z)
- A Continual Learning Approach for Cross-Domain White Blood Cell Classification [36.482007703764154]
We propose a rehearsal-based continual learning approach for class incremental and domain incremental scenarios in white blood cell classification.
To choose representative samples from previous tasks, we employ set selection based on the model's predictions.
We thoroughly evaluated our proposed approach on three white blood cell classification datasets that differ in color, resolution, and class composition.
arXiv Detail & Related papers (2023-08-24T09:38:54Z)
- Learning to diagnose cirrhosis from radiological and histological labels with joint self and weakly-supervised pretraining strategies [62.840338941861134]
We propose to leverage transfer learning from large datasets annotated by radiologists, to predict the histological score available on a small annex dataset.
We compare different pretraining methods, namely weakly-supervised and self-supervised ones, to improve the prediction of cirrhosis.
This method outperforms the baseline classification of the METAVIR score, reaching an AUC of 0.84 and a balanced accuracy of 0.75.
arXiv Detail & Related papers (2023-02-16T17:06:23Z)
- Deep CNNs for Peripheral Blood Cell Classification [0.0]
We benchmark 27 popular deep convolutional neural network architectures on the microscopic peripheral blood cell images dataset.
We fine-tune the state-of-the-art image classification models pre-trained on the ImageNet dataset for blood cell classification.
arXiv Detail & Related papers (2021-10-18T17:56:07Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- 1-D Convolutional Neural Networks for the Analysis of Pupil Size Variations in Scotopic Conditions [79.71065005161566]
1-D convolutional neural network models are trained for classification of short-range sequences.
The model provides predictions with high average accuracy on a held-out test set.
arXiv Detail & Related papers (2020-02-06T17:25:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.