CytoDINO: Risk-Aware and Biologically-Informed Adaptation of DINOv3 for Bone Marrow Cytomorphology
- URL: http://arxiv.org/abs/2512.17930v1
- Date: Tue, 09 Dec 2025 23:09:22 GMT
- Title: CytoDINO: Risk-Aware and Biologically-Informed Adaptation of DINOv3 for Bone Marrow Cytomorphology
- Authors: Aziz Muminov, Anne Pham,
- Abstract summary: We introduce CytoDINO, a framework that achieves state-of-the-art performance on the Munich Leukemia Laboratory dataset.<n>Our primary contribution is a novel Hierarchical Focal Loss with Critical Penalties, which encodes biological relationships between cell lineages and explicitly penalizes clinically dangerous misclassifications.<n>CytoDINO achieves an 88.2% weighted F1 score and 76.5% macro F1 on a held-out test set of 21 cell classes.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bone marrow cell cytomorphology analysis is critical for the diagnosis of hematological malignancies but remains a labor-intensive process subject to significant inter-observer variability. While recent foundation models have shown promise in computational pathology, they often require extensive computational resources and fail to account for the asymmetric risks associated with clinical misdiagnosis. We introduce CytoDINO, a framework that achieves state-of-the-art performance on the Munich Leukemia Laboratory (MLL) dataset by fine-tuning DINOv3 using Low-Rank Adaptation (LoRA). Our primary contribution is a novel Hierarchical Focal Loss with Critical Penalties, which encodes biological relationships between cell lineages and explicitly penalizes clinically dangerous misclassifications (e.g., classifying blasts as normal cells). CytoDINO achieves an 88.2% weighted F1 score and 76.5% macro F1 on a held-out test set of 21 cell classes. By utilizing parameter-efficient fine-tuning with only 8% trainable parameters on a single NVIDIA RTX 5080, we demonstrate that consumer-grade hardware can match specialized infrastructure. Furthermore, confidence-based selective prediction yields 99.5% accuracy on 67% of samples, suggesting a viable pathway for clinical deployment where high-uncertainty cases are flagged for expert review
Related papers
- Enhanced Leukemic Cell Classification Using Attention-Based CNN and Data Augmentation [0.5371337604556311]
Acute lymphoblastic leukemia (ALL) is the most common childhood cancer, requiring expert microscopic diagnosis.<n>The proposed system integrates an attention-based convolutional neural network combining EfficientNetV2-B3 with Squeeze-and-Excitation mechanisms for automated ALL cell classification.<n>Our approach employs comprehensive data augmentation, focal loss for class imbalance, and patient-wise data splitting to ensure robust and reproducible evaluation.
arXiv Detail & Related papers (2026-01-03T01:24:11Z) - One-shot synthesis of rare gastrointestinal lesions improves diagnostic accuracy and clinical training [45.49415063761575]
EndoRare is a one-shot, retraining-free generative framework that synthesizes diverse, high-fidelity lesion exemplars from a single reference image.<n>We validated the framework across four rare pathologies.<n>These results establish a practical, data-efficient pathway to bridge the rare-disease gap in both computer-aided diagnostics and clinical education.
arXiv Detail & Related papers (2025-12-30T15:07:09Z) - Transparent Early ICU Mortality Prediction with Clinical Transformer and Per-Case Modality Attribution [42.85462513661566]
We present a lightweight, transparent multimodal ensemble that fuses physiological time-series measurements with unstructured clinical notes from the first 48 hours of an ICU stay.<n>A logistic regression model combines predictions from two modality-specific models: a bidirectional LSTM for vitals and a finetuned ClinicalModernBERT transformer for notes.<n>On the MIMIC-III benchmark, our late-fusion ensemble improves discrimination over the best single model while maintaining well-calibrated predictions.
arXiv Detail & Related papers (2025-11-19T20:11:49Z) - An Explainable and Fair AI Tool for PCOS Risk Assessment: Calibration, Subgroup Equity, and Interactive Clinical Deployment [0.10026496861838446]
This paper presents a fairness-audited and interpretable machine learning framework for predicting polycystic ovary syndrome (PCOS)<n>The framework integrated SHAP-based feature attributions with demographic audits to connect predictive explanations with observed disparities for actionable insights.<n>A Streamlit-based web interface enables real-time PCOS risk assessment, Rotterdam criteria evaluation, and interactive 'what-if' analysis.
arXiv Detail & Related papers (2025-11-08T16:14:56Z) - An Explainable Hybrid AI Framework for Enhanced Tuberculosis and Symptom Detection [55.35661671061754]
Tuberculosis remains a critical global health issue, particularly in resource-limited and remote areas.<n>We propose a framework which enhances disease and symptom detection on chest X-rays by integrating two supervised heads and a self-supervised head.<n>Our model achieves an accuracy of 98.85% for distinguishing between COVID-19, tuberculosis, and normal cases, and a macro-F1 score of 90.09% for multilabel symptom detection.
arXiv Detail & Related papers (2025-10-21T17:18:55Z) - LGE-Guided Cross-Modality Contrastive Learning for Gadolinium-Free Cardiomyopathy Screening in Cine CMR [51.11296719862485]
We propose a Contrastive Learning and Cross-Modal alignment framework for gadolinium-free cardiomyopathy screening using cine CMR sequences.<n>By aligning the latent spaces of cine CMR and Late Gadolinium Enhancement (LGE) sequences, our model encodes fibrosis-specific pathology into cine CMR embeddings.
arXiv Detail & Related papers (2025-08-23T07:21:23Z) - An Explainable AI-Enhanced Machine Learning Approach for Cardiovascular Disease Detection and Risk Assessment [0.0]
Heart disease remains a major global health concern.<n>Traditional diagnostic methods fail to accurately identify and manage heart disease risks.<n>Machine learning has the potential to significantly enhance the accuracy, efficiency, and speed of heart disease diagnosis.
arXiv Detail & Related papers (2025-07-15T10:38:38Z) - CNN-LSTM Hybrid Model for AI-Driven Prediction of COVID-19 Severity from Spike Sequences and Clinical Data [0.0]
We developed a CNN-LSTM hybrid model to predict COVID-19 severity using spike protein sequences and clinical data.<n>The model achieved an F1 score of 82.92%, ROC-AUC of 0.9084, precision of 83.56%, and recall of 82.85%.
arXiv Detail & Related papers (2025-05-29T16:20:54Z) - CRTRE: Causal Rule Generation with Target Trial Emulation Framework [47.2836994469923]
We introduce a novel method called causal rule generation with target trial emulation framework (CRTRE)
CRTRE applies randomize trial design principles to estimate the causal effect of association rules.
We then incorporate such association rules for the downstream applications such as prediction of disease onsets.
arXiv Detail & Related papers (2024-11-10T02:40:06Z) - Optimizing Mortality Prediction for ICU Heart Failure Patients: Leveraging XGBoost and Advanced Machine Learning with the MIMIC-III Database [1.5186937600119894]
Heart failure affects millions of people worldwide, significantly reducing quality of life and leading to high mortality rates.
Despite extensive research, the relationship between heart failure and mortality rates among ICU patients is not fully understood.
This study analyzed data from 1,177 patients over 18 years old from the MIMIC-III database, identified using ICD-9 codes.
arXiv Detail & Related papers (2024-09-03T07:57:08Z) - Deep Generative Classification of Blood Cell Morphology [7.494975467007647]
We introduce CytoDiffusion, a diffusion-based classifier that effectively models blood cell morphology.
Our approach outperforms state-of-the-art discriminative models in anomaly detection.
We enhance model explainability through the generation of directly interpretable counterfactual heatmaps.
arXiv Detail & Related papers (2024-08-16T19:17:02Z) - Enhancing Diagnostic Reliability of Foundation Model with Uncertainty Estimation in OCT Images [41.002573031087856]
We developed a foundation model with uncertainty estimation (FMUE) to detect 11 retinal conditions on optical coherence tomography ( OCT)
FMUE achieved a higher F1 score of 96.76% than two state-of-the-art algorithms, RETFound and UIOS, and got further improvement with thresholding strategy to 98.44%.
Our model is superior to two ophthalmologists with a higher F1 score (95.17% vs. 61.93% &71.72%)
arXiv Detail & Related papers (2024-06-18T03:04:52Z) - Deep-Learning Tool for Early Identifying Non-Traumatic Intracranial
Hemorrhage Etiology based on CT Scan [40.51754649947294]
The deep learning model was developed with 1868 eligible NCCT scans with non-traumatic ICH collected between January 2011 and April 2018.
The model's diagnostic performance was compared with clinicians's performance.
The clinicians achieve significant improvements in the sensitivity, specificity, and accuracy of diagnoses of certain hemorrhage etiologies with proposed system augmentation.
arXiv Detail & Related papers (2023-02-02T08:45:17Z) - Comparison of Machine Learning Classifiers to Predict Patient Survival
and Genetics of GBM: Towards a Standardized Model for Clinical Implementation [44.02622933605018]
Radiomic models have been shown to outperform clinical data for outcome prediction in glioblastoma (GBM)
We aimed to compare nine machine learning classifiers to predict overall survival (OS), isocitrate dehydrogenase (IDH) mutation, O-6-methylguanine-DNA-methyltransferase (MGMT) promoter methylation, epidermal growth factor receptor (EGFR) VII amplification and Ki-67 expression in GBM patients.
xGB obtained maximum accuracy for OS (74.5%), AB for IDH mutation (88%), MGMT methylation (71,7%), Ki-67 expression (86,6%), and EGFR amplification (81,
arXiv Detail & Related papers (2021-02-10T15:10:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.