Foundation Models for Slide-level Cancer Subtyping in Digital Pathology
- URL: http://arxiv.org/abs/2410.15886v1
- Date: Mon, 21 Oct 2024 11:04:58 GMT
- Title: Foundation Models for Slide-level Cancer Subtyping in Digital Pathology
- Authors: Pablo Meseguer, Rocío del Amor, Adrian Colomer, Valery Naranjo
- Abstract summary: This work aims to compare the performance of various feature extractors developed under different pretraining strategies for cancer subtyping on WSI under a MIL framework.
Results demonstrate the ability of foundation models to surpass ImageNet-pretrained models for the prediction of six skin cancer subtypes.
- Abstract: Since the emergence of the ImageNet dataset, the pretraining and fine-tuning approach has become widely adopted in computer vision due to the ability of ImageNet-pretrained models to learn a wide variety of visual features. However, a significant challenge arises when adapting these models to domain-specific fields, such as digital pathology, due to substantial gaps between domains. To address this limitation, foundation models (FM) have been trained on large-scale in-domain datasets to learn the intricate features of histopathology images. In cancer diagnosis, whole-slide image (WSI) prediction is essential for patient prognosis, and multiple instance learning (MIL) has been implemented to handle the giga-pixel size of WSIs. As MIL frameworks rely on patch-level feature aggregation, this work aims to compare the performance of various feature extractors developed under different pretraining strategies for cancer subtyping on WSIs under a MIL framework. Results demonstrate the ability of foundation models to surpass ImageNet-pretrained models for the prediction of six skin cancer subtypes.
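The patch-level feature aggregation that MIL frameworks rely on can be illustrated with a minimal attention-based pooling sketch. This is not the paper's implementation; the function name, weight matrices, and dimensions below are illustrative, following the common attention-MIL formulation (a softmax over per-patch scores weighting a sum of patch embeddings):

```python
import numpy as np

def attention_mil_pool(patch_feats, V, w):
    """Aggregate patch-level features into a slide-level embedding
    via attention-based MIL pooling.

    patch_feats: (n_patches, feat_dim) embeddings from a frozen extractor
    V:           (feat_dim, attn_dim) hypothetical attention projection
    w:           (attn_dim,) hypothetical attention vector
    """
    scores = np.tanh(patch_feats @ V) @ w        # one scalar score per patch
    weights = np.exp(scores - scores.max())       # numerically stable softmax
    weights /= weights.sum()                      # attention weights sum to 1
    slide_emb = weights @ patch_feats             # weighted sum over patches
    return slide_emb, weights

rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 8))   # e.g. 6 patches, 8-dim embeddings
V = rng.normal(size=(8, 4))
w = rng.normal(size=4)
slide_emb, attn = attention_mil_pool(feats, V, w)
```

Swapping the feature extractor that produces `patch_feats` (ImageNet-pretrained CNN vs. a histopathology foundation model) while keeping the aggregator fixed is what allows the comparison the abstract describes.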
Related papers
- ShapeMamba-EM: Fine-Tuning Foundation Model with Local Shape Descriptors and Mamba Blocks for 3D EM Image Segmentation [49.42525661521625]
This paper presents ShapeMamba-EM, a specialized fine-tuning method for 3D EM segmentation.
It is tested over a wide range of EM images, covering five segmentation tasks and 10 datasets.
arXiv Detail & Related papers (2024-08-26T08:59:22Z) - Benchmarking Embedding Aggregation Methods in Computational Pathology: A Clinical Data Perspective [32.93871326428446]
Recent advances in artificial intelligence (AI) are revolutionizing medical imaging and computational pathology.
A constant challenge in the analysis of digital Whole Slide Images (WSIs) is the problem of aggregating tens of thousands of tile-level image embeddings to a slide-level representation.
This study conducts a benchmarking analysis of ten slide-level aggregation techniques across nine clinically relevant tasks.
arXiv Detail & Related papers (2024-07-10T17:00:57Z) - DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception [66.88792390480343]
We propose DEEM, a simple but effective approach that utilizes the generative feedback of diffusion models to align the semantic distributions of the image encoder.
DEEM exhibits enhanced robustness and a superior capacity to alleviate model hallucinations while utilizing fewer trainable parameters, less pre-training data, and a smaller base model size.
arXiv Detail & Related papers (2024-05-24T05:46:04Z) - Generative Medical Segmentation [5.4613210257624605]
Generative Medical Segmentation (GMS) is a novel approach leveraging a generative model to perform image segmentation.
GMS employs a robust pre-trained vision foundation model to extract latent representations for images and corresponding ground truth masks.
The design of GMS leads to fewer trainable parameters in the model which reduces the risk of overfitting and enhances its capability.
arXiv Detail & Related papers (2024-03-27T02:16:04Z) - Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z) - Prompt-Guided Adaptive Model Transformation for Whole Slide Image Classification [27.21493446754789]
Multiple instance learning (MIL) has emerged as a popular method for classifying histopathology whole slide images (WSIs)
We propose Prompt-guided Adaptive Model Transformation framework that seamlessly adapts pre-trained models to the specific characteristics of histopathology data.
We rigorously evaluate our approach on two datasets, Camelyon16 and TCGA-NSCLC, showcasing substantial improvements across various MIL models.
arXiv Detail & Related papers (2024-03-19T08:23:12Z) - A self-supervised framework for learning whole slide representations [52.774822784847565]
We present Slide Pre-trained Transformers (SPT) for gigapixel-scale self-supervision of whole slide images.
We benchmark SPT visual representations on five diagnostic tasks across three biomedical microscopy datasets.
arXiv Detail & Related papers (2024-02-09T05:05:28Z) - GRASP: GRAph-Structured Pyramidal Whole Slide Image Representation [4.5869791542071]
We present GRASP, a graph-structured multi-magnification framework for processing whole slide images (WSIs) in digital pathology.
Our approach is designed to emulate the pathologist's behavior in handling WSIs and benefits from the hierarchical structure of WSIs.
GRASP, which introduces a convergence-based node aggregation instead of traditional pooling mechanisms, outperforms state-of-the-art methods over two distinct cancer datasets.
arXiv Detail & Related papers (2024-02-06T00:03:44Z) - Domain-Specific Pre-training Improves Confidence in Whole Slide Image Classification [15.354256205808273]
Whole Slide Images (WSIs) or histopathology images are used in digital pathology.
WSIs pose great challenges to deep learning models for clinical diagnosis.
arXiv Detail & Related papers (2023-02-20T08:42:06Z) - Hierarchical Transformer for Survival Prediction Using Multimodality Whole Slide Images and Genomics [63.76637479503006]
Learning good representation of giga-pixel level whole slide pathology images (WSI) for downstream tasks is critical.
This paper proposes a hierarchical-based multimodal transformer framework that learns a hierarchical mapping between pathology images and corresponding genes.
Our architecture requires fewer GPU resources compared with benchmark methods while maintaining better WSI representation ability.
arXiv Detail & Related papers (2022-11-29T23:47:56Z) - Domain Shift in Computer Vision models for MRI data analysis: An Overview [64.69150970967524]
Machine learning and computer vision methods are showing good performance in medical imagery analysis.
Yet only a few applications are now in clinical use.
Poor transferability of the models to data from different sources or acquisition domains is one of the reasons for that.
arXiv Detail & Related papers (2020-10-14T16:34:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.