Related papers: Adapting a Segmentation Foundation Model for Medical Image Classification

URL: http://arxiv.org/abs/2505.06217v1
Date: Fri, 09 May 2025 17:51:51 GMT
Title: Adapting a Segmentation Foundation Model for Medical Image Classification
Authors: Pengfei Gu, Haoteng Tang, Islam A. Ebeid, Jose A. Nunez, Fabian Vazquez, Diego Adame, Marcus Zhan, Huimin Li, Bin Fu, Danny Z. Chen
Abstract summary: We introduce a new framework to adapt the Segment Anything Model (SAM) for medical image classification. First, we utilize the SAM image encoder as a feature extractor to capture segmentation-based features. Next, we propose a novel Spatially Localized Channel Attention (SLCA) mechanism to compute spatially localized attention weights for the feature maps.
Score: 13.711279542090043
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: Recent advancements in foundation models, such as the Segment Anything Model (SAM), have shown strong performance in various vision tasks, particularly image segmentation, due to their impressive zero-shot segmentation capabilities. However, effectively adapting such models for medical image classification is still a less explored topic. In this paper, we introduce a new framework to adapt SAM for medical image classification. First, we utilize the SAM image encoder as a feature extractor to capture segmentation-based features that convey important spatial and contextual details of the image, while freezing its weights to avoid unnecessary overhead during training. Next, we propose a novel Spatially Localized Channel Attention (SLCA) mechanism to compute spatially localized attention weights for the feature maps. The features extracted from SAM's image encoder are processed through SLCA to compute attention weights, which are then integrated into deep learning classification models to enhance their focus on spatially relevant or meaningful regions of the image, thus improving classification performance. Experimental results on three public medical image classification datasets demonstrate the effectiveness and data-efficiency of our approach.
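
The pipeline described in the abstract (frozen encoder features, per-location channel attention, reweighting of a classifier's feature maps) can be illustrated with a minimal NumPy sketch. This is not the authors' SLCA implementation: the abstract does not give its details, so the bottleneck form, the sigmoid gating, the matching feature-map shapes, and every name below (`localized_channel_attention`, `apply_attention`, `w1`, `w2`, the reduction ratio `r`) are assumptions made for illustration. Random arrays stand in for the frozen SAM encoder output and for the classifier's intermediate features.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def localized_channel_attention(feats, w1, w2):
    """Hypothetical per-location channel attention (an assumed form, not the paper's SLCA).

    feats: (C, H, W) features from a frozen encoder.
    w1:    (C // r, C) bottleneck weights; w2: (C, C // r) expansion weights.
    Returns attention weights of shape (C, H, W), each in (0, 1).
    """
    # Apply the same small MLP to the channel vector at every spatial location
    # (equivalent to two 1x1 convolutions), with no global pooling, so the
    # channel weights can differ from one location to the next.
    hidden = np.maximum(np.einsum("rc,chw->rhw", w1, feats), 0.0)  # ReLU
    return sigmoid(np.einsum("cr,rhw->chw", w2, hidden))


def apply_attention(classifier_feats, attn):
    """Reweight a classifier's feature map element-wise with the attention weights."""
    return classifier_feats * attn


rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2

# Stand-ins (assumptions): in practice these would come from the frozen SAM
# image encoder and from an intermediate layer of the classification model.
sam_feats = rng.standard_normal((C, H, W))
cls_feats = rng.standard_normal((C, H, W))

# Only these small weights would be trained; the encoder stays frozen.
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1

attn = localized_channel_attention(sam_feats, w1, w2)
reweighted = apply_attention(cls_feats, attn)
```

Because the encoder is frozen, a design along these lines would only need gradients for the small attention weights and the classifier, which is consistent with the abstract's point about avoiding training overhead.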

Related papers

SAMIR, an efficient registration framework via robust feature learning from SAM [40.09295562721889]
This paper introduces SAMIR, an efficient medical image registration framework. SAM is pretrained on large-scale natural image datasets and can learn robust, general-purpose visual representations. We show that SAMIR significantly outperforms state-of-the-art methods on benchmark datasets for both intra-subject cardiac image registration and inter-subject abdomen CT image registration.
arXiv Detail & Related papers (2025-09-17T01:56:35Z)
PathSegDiff: Pathology Segmentation using Diffusion model representations [63.20694440934692]
We propose PathSegDiff, a novel approach for histopathology image segmentation that leverages Latent Diffusion Models (LDMs) as pre-trained feature extractors. Our method utilizes a pathology-specific LDM, guided by a self-supervised encoder, to extract rich semantic information from H&E stained histopathology images. Our experiments demonstrate significant improvements over traditional methods on the BCSS and GlaS datasets.
arXiv Detail & Related papers (2025-04-09T14:58:21Z)
MedFocusCLIP : Improving few shot classification in medical datasets using pixel wise attention [1.2277343096128712]
We propose to leverage the advanced segmentation capabilities of Segment Anything Model 2 (SAM2) as a visual prompting cue to help the visual encoder in CLIP (Contrastive Language-Image Pretraining). This helps the model focus on highly discriminative regions without getting distracted by visually similar background features. We evaluate our method on diverse medical datasets including X-rays, CT scans, and MRI images, and report an accuracy of (71%, 81%, 86%, 58%) from the proposed approach.
arXiv Detail & Related papers (2025-01-07T14:49:12Z)
Image Synthesis with Class-Aware Semantic Diffusion Models for Surgical Scene Segmentation [3.6723640056915436]
We propose the Class-Aware Semantic Diffusion Model (CASDM) to tackle data scarcity and imbalance. Class-aware mean squared error and class-aware self-perceptual loss functions have been defined to prioritize critical, less visible classes. We are the first to generate multi-class segmentation maps using text prompts in a novel fashion to specify their contents.
arXiv Detail & Related papers (2024-10-31T14:14:30Z)
Boosting Medical Image Classification with Segmentation Foundation Model [19.41887842350247]
The Segment Anything Model (SAM) exhibits impressive capabilities in zero-shot segmentation for natural images. No studies have shown how to harness the power of SAM for medical image classification. We introduce SAMAug-C, an innovative augmentation method based on SAM for augmenting classification datasets.
arXiv Detail & Related papers (2024-06-16T17:54:49Z)
Explanations of Classifiers Enhance Medical Image Segmentation via End-to-end Pre-training [37.11542605885003]
Medical image segmentation aims to identify and locate abnormal structures in medical images, such as chest radiographs, using deep neural networks. Our work collects explanations from well-trained classifiers to generate pseudo labels for segmentation tasks. We then use the Integrated Gradients (IG) method to distill and boost the explanations obtained from the classifiers, generating massive diagnosis-oriented localization labels (DoLL). These DoLL-annotated images are used for pre-training the model before fine-tuning it for downstream segmentation tasks, including COVID-19 infectious areas, lungs, heart, and clavicles.
arXiv Detail & Related papers (2024-01-16T16:18:42Z)
nnSAM: Plug-and-play Segment Anything Model Improves nnUNet Performance [12.169801149021566]
The Segment Anything Model (SAM) has emerged as a versatile tool for image segmentation without specific domain training. Traditional models like nnUNet perform automatic segmentation during inference but need extensive domain-specific training. We propose nnSAM, integrating SAM's robust feature extraction with nnUNet's automatic configuration to enhance segmentation accuracy on small datasets.
arXiv Detail & Related papers (2023-09-29T04:26:25Z)
Learnable Weight Initialization for Volumetric Medical Image Segmentation [66.3030435676252]
We propose a learnable weight-based hybrid medical image segmentation approach. Our approach is easy to integrate into any hybrid model and requires no external training data. Experiments on multi-organ and lung cancer segmentation tasks demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-06-15T17:55:05Z)
SAM.MD: Zero-shot medical image segmentation capabilities of the Segment Anything Model [1.1221592576472588]
We evaluate the zero-shot capabilities of the Segment Anything Model for medical image segmentation. We show that SAM generalizes well to CT data, making it a potential catalyst for the advancement of semi-automatic segmentation tools.
arXiv Detail & Related papers (2023-04-10T18:20:29Z)
Self-Supervised Correction Learning for Semi-Supervised Biomedical Image Segmentation [84.58210297703714]
We propose a self-supervised correction learning paradigm for semi-supervised biomedical image segmentation. We design a dual-task network, including a shared encoder and two independent decoders for segmentation and lesion region inpainting. Experiments on three medical image segmentation datasets for different tasks demonstrate the outstanding performance of our method.
arXiv Detail & Related papers (2023-01-12T08:19:46Z)
A Simple Baseline for Zero-shot Semantic Segmentation with Pre-trained Vision-language Model [61.58071099082296]
It is unclear how to make zero-shot recognition work well on broader vision problems, such as object detection and semantic segmentation. In this paper, we target zero-shot semantic segmentation by building on an off-the-shelf pre-trained vision-language model, i.e., CLIP. Our experimental results show that this simple framework surpasses the previous state of the art by a large margin.
arXiv Detail & Related papers (2021-12-29T18:56:18Z)
Few-shot Medical Image Segmentation using a Global Correlation Network with Discriminative Embedding [60.89561661441736]
We propose a novel method for few-shot medical image segmentation. We construct our few-shot image segmentor using a deep convolutional network trained episodically. We enhance discriminability of deep embedding to encourage clustering of the feature domains of the same class.
arXiv Detail & Related papers (2020-12-10T04:01:07Z)
Attention Model Enhanced Network for Classification of Breast Cancer Image [54.83246945407568]
AMEN is formulated in a multi-branch fashion with a pixel-wise attention model and a classification submodule. To focus more on subtle details, the sample image is enhanced by the pixel-wise attention map generated by the former branch. Experiments conducted on three benchmark datasets demonstrate the superiority of the proposed method under various scenarios.
arXiv Detail & Related papers (2020-10-07T08:44:21Z)

This list is automatically generated from the titles and abstracts of the papers on this site.