ActiveSSF: An Active-Learning-Guided Self-Supervised Framework for Long-Tailed Megakaryocyte Classification
- URL: http://arxiv.org/abs/2502.08200v1
- Date: Wed, 12 Feb 2025 08:24:36 GMT
- Title: ActiveSSF: An Active-Learning-Guided Self-Supervised Framework for Long-Tailed Megakaryocyte Classification
- Authors: Linghao Zhuang, Ying Zhang, Gege Yuan, Xingyue Zhao, Zhiping Jiang,
- Abstract summary: We propose the ActiveSSF framework, which integrates active learning with self-supervised pretraining. Experimental results on clinical megakaryocyte datasets demonstrate that ActiveSSF achieves state-of-the-art performance. To foster further research, the code and datasets will be publicly released in the future.
- Score: 3.6535793744942318
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Precise classification of megakaryocytes is crucial for diagnosing myelodysplastic syndromes. Although self-supervised learning has shown promise in medical image analysis, its application to classifying megakaryocytes in stained slides faces three main challenges: (1) pervasive background noise that obscures cellular details, (2) a long-tailed distribution that limits data for rare subtypes, and (3) complex morphological variations leading to high intra-class variability. To address these issues, we propose the ActiveSSF framework, which integrates active learning with self-supervised pretraining. Specifically, our approach employs Gaussian filtering combined with K-means clustering and HSV analysis (augmented by clinical prior knowledge) for accurate region-of-interest extraction; an adaptive sample selection mechanism that dynamically adjusts similarity thresholds to mitigate class imbalance; and prototype clustering on labeled samples to overcome morphological complexity. Experimental results on clinical megakaryocyte datasets demonstrate that ActiveSSF not only achieves state-of-the-art performance but also significantly improves recognition accuracy for rare subtypes. Moreover, the integration of these advanced techniques further underscores the practical potential of ActiveSSF in clinical settings. To foster further research, the code and datasets will be publicly released in the future.
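The abstract names an adaptive sample selection mechanism with dynamically adjusted similarity thresholds and prototype clustering on labeled samples, but does not give the rule itself. The following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes one mean-vector prototype per class, cosine similarity, and a hypothetical linear rule in which rarer classes receive a lower acceptance threshold. The function names and the `base` and `floor` parameters are all assumptions introduced here.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def class_prototypes(features, labels):
    """One prototype per class: the mean of its labeled feature vectors.
    (Assumption: a single mean vector stands in for prototype clustering.)"""
    counts = Counter(labels)
    sums = {}
    for f, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(f))
        for i, v in enumerate(f):
            acc[i] += v
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def adaptive_thresholds(labels, base=0.9, floor=0.6):
    """Hypothetical rule: the largest class keeps `base`; smaller classes
    move linearly toward `floor` in proportion to their relative size."""
    counts = Counter(labels)
    largest = max(counts.values())
    return {y: floor + (base - floor) * (n / largest) for y, n in counts.items()}


def select_samples(unlabeled, prototypes, thresholds):
    """Keep an unlabeled vector if its similarity to the nearest prototype
    clears that class's threshold. Returns (index, class, similarity)."""
    selected = []
    for idx, f in enumerate(unlabeled):
        best = max(prototypes, key=lambda y: cosine(f, prototypes[y]))
        sim = cosine(f, prototypes[best])
        if sim >= thresholds[best]:
            selected.append((idx, best, sim))
    return selected


# Toy long-tailed example: four "common" vectors, one "rare" vector.
features = [[1, 0], [0.9, 0.1], [1, 0.1], [0.8, 0], [0, 1]]
labels = ["common", "common", "common", "common", "rare"]
protos = class_prototypes(features, labels)
thr = adaptive_thresholds(labels)
picked = select_samples([[0.6, 0.8], [1, 0.05], [0.7, 0.7]], protos, thr)
```

In this toy setup the rare class ends up with a looser threshold than the common class, so a borderline rare-looking sample can be accepted where a single global threshold of `base` would reject it.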
Related papers
- Adaptable Cardiovascular Disease Risk Prediction from Heterogeneous Data using Large Language Models [70.64969663547703]
AdaCVD is an adaptable CVD risk prediction framework built on large language models extensively fine-tuned on over half a million participants from the UK Biobank. It addresses key clinical challenges across three dimensions: it flexibly incorporates comprehensive yet variable patient information; it seamlessly integrates both structured data and unstructured text; and it rapidly adapts to new patient populations using minimal additional data.
arXiv Detail & Related papers (2025-05-30T14:42:02Z) - Transformer-Based Self-Supervised Learning for Histopathological Classification of Ischemic Stroke Clot Origin [0.0]
Identifying the thromboembolism source in ischemic stroke is crucial for treatment and secondary prevention.
This study describes a self-supervised deep learning approach in digital pathology of emboli for classifying ischemic stroke clot origin.
arXiv Detail & Related papers (2024-05-01T23:40:12Z) - A Nasal Cytology Dataset for Object Detection and Deep Learning [0.0]
We present the first dataset of rhino-cytological field images: the NCD (Nasal Cytology dataset).
The real distribution of the cytotypes populating the nasal mucosa has been replicated by sampling images from slides of clinical patients and manually annotating each cell found on them.
This work contributes to some of the open challenges by presenting a novel machine learning-based approach to aid the automated detection and classification of nasal mucosa cells.
arXiv Detail & Related papers (2024-04-21T19:02:38Z) - UniCell: Universal Cell Nucleus Classification via Prompt Learning [76.11864242047074]
We propose a universal cell nucleus classification framework (UniCell).
It employs a novel prompt learning mechanism to uniformly predict the corresponding categories of pathological images from different dataset domains.
In particular, our framework adopts an end-to-end architecture for nuclei detection and classification, and utilizes flexible prediction heads for adapting various datasets.
arXiv Detail & Related papers (2024-02-20T11:50:27Z) - Optimizing Skin Lesion Classification via Multimodal Data and Auxiliary
Task Integration [54.76511683427566]
This research introduces a novel multimodal method for classifying skin lesions, integrating smartphone-captured images with essential clinical and demographic information.
A distinctive aspect of this method is the integration of an auxiliary task focused on super-resolution image prediction.
The experimental evaluations have been conducted using the PAD-UFES20 dataset, applying various deep-learning architectures.
arXiv Detail & Related papers (2024-02-16T05:16:20Z) - Accurate Leukocyte Detection Based on Deformable-DETR and Multi-Level
Feature Fusion for Aiding Diagnosis of Blood Diseases [5.788342067882157]
This paper proposes an innovative method of leukocyte detection: the Multi-level Feature Fusion and Deformable Self-attention DETR (MFDS-DETR).
This model uses high-level features as weights to filter low-level feature information via a channel attention module.
We address the issue of leukocyte feature scarcity by incorporating a multi-scale deformable self-attention module in the encoder.
arXiv Detail & Related papers (2024-01-01T16:28:30Z) - Single-Cell Deep Clustering Method Assisted by Exogenous Gene
Information: A Novel Approach to Identifying Cell Types [50.55583697209676]
We develop an attention-enhanced graph autoencoder, which is designed to efficiently capture the topological features between cells.
During the clustering process, we integrated both sets of information and reconstructed the features of both cells and genes to generate a discriminative representation.
This research offers enhanced insights into the characteristics and distribution of cells, thereby laying the groundwork for early diagnosis and treatment of diseases.
arXiv Detail & Related papers (2023-11-28T09:14:55Z) - Improving Multiple Sclerosis Lesion Segmentation Across Clinical Sites:
A Federated Learning Approach with Noise-Resilient Training [75.40980802817349]
Deep learning models have shown promise for automatically segmenting MS lesions, but the scarcity of accurately annotated data hinders progress in this area.
We introduce a Decoupled Hard Label Correction (DHLC) strategy that considers the imbalanced distribution and fuzzy boundaries of MS lesions.
We also introduce a Centrally Enhanced Label Correction (CELC) strategy, which leverages the aggregated central model as a correction teacher for all sites.
arXiv Detail & Related papers (2023-08-31T00:36:10Z) - Class Attention to Regions of Lesion for Imbalanced Medical Image
Recognition [59.28732531600606]
We propose a framework named Class Attention to REgions of the lesion (CARE) to handle data imbalance issues.
The CARE framework needs bounding boxes to represent the lesion regions of rare diseases.
Results show that the CARE variants with automated bounding box generation are comparable to the original CARE framework.
arXiv Detail & Related papers (2023-07-19T15:19:02Z) - Successive Subspace Learning for Cardiac Disease Classification with
Two-phase Deformation Fields from Cine MRI [36.044984400761535]
This work proposes a lightweight successive subspace learning framework for CVD classification.
It is based on an interpretable feedforward design, in conjunction with a cardiac atlas.
Compared with 3D CNN-based approaches, our framework achieves superior classification performance with 140$\times$ fewer parameters.
arXiv Detail & Related papers (2023-01-21T15:00:59Z) - LifeLonger: A Benchmark for Continual Disease Classification [59.13735398630546]
We introduce LifeLonger, a benchmark for continual disease classification on the MedMNIST collection.
Task and class incremental learning of diseases address the issue of classifying new samples without re-training the models from scratch.
Cross-domain incremental learning addresses the issue of dealing with datasets originating from different institutions while retaining the previously obtained knowledge.
arXiv Detail & Related papers (2022-04-12T12:25:05Z) - Cross-Site Severity Assessment of COVID-19 from CT Images via Domain
Adaptation [64.59521853145368]
Early and accurate severity assessment of Coronavirus disease 2019 (COVID-19) based on computed tomography (CT) images greatly helps the estimation of intensive care unit events.
To augment the labeled data and improve the generalization ability of the classification model, it is necessary to aggregate data from multiple sites.
This task faces several challenges including class imbalance between mild and severe infections, domain distribution discrepancy between sites, and presence of heterogeneous features.
arXiv Detail & Related papers (2021-09-08T07:56:51Z) - Ensemble of CNN classifiers using Sugeno Fuzzy Integral Technique for
Cervical Cytology Image Classification [1.6986898305640261]
We propose a fully automated computer-aided diagnosis tool for classifying single-cell and slide images of cervical cancer.
We use the Sugeno Fuzzy Integral to ensemble the decision scores from three popular deep learning models, namely, Inception v3, DenseNet-161 and ResNet-34.
arXiv Detail & Related papers (2021-08-21T08:41:41Z) - Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype
Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z) - Data Efficient and Weakly Supervised Computational Pathology on Whole
Slide Images [4.001273534300757]
Computational pathology has the potential to enable objective diagnosis, therapeutic response prediction and identification of new morphological features of clinical relevance.
Deep learning-based computational pathology approaches either require manual annotation of gigapixel whole slide images (WSIs) in fully-supervised settings or thousands of WSIs with slide-level labels in a weakly-supervised setting.
Here we present CLAM - Clustering-constrained attention multiple instance learning.
arXiv Detail & Related papers (2020-04-20T23:00:13Z) - An Efficient Framework for Automated Screening of Clinically Significant
Macular Edema [0.41998444721319206]
The present study proposes a new approach to automated screening of Clinically Significant Macular Edema (CSME).
The proposed approach combines a pre-trained deep neural network with meta-heuristic feature selection.
A feature space over-sampling technique is used to overcome the effects of skewed datasets.
arXiv Detail & Related papers (2020-01-20T07:34:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.