SAM-UNet:Enhancing Zero-Shot Segmentation of SAM for Universal Medical Images
- URL: http://arxiv.org/abs/2408.09886v1
- Date: Mon, 19 Aug 2024 11:01:00 GMT
- Title: SAM-UNet:Enhancing Zero-Shot Segmentation of SAM for Universal Medical Images
- Authors: Sihan Yang, Haixia Bi, Hai Zhang, Jian Sun,
- Abstract summary: Segment Anything Model (SAM) has demonstrated impressive performance on a wide range of natural image segmentation tasks.
We propose SAMUNet, a new foundation model which incorporates U-Net to the original SAM, to fully leverage the powerful contextual modeling ability of convolutions.
We train SAM-UNet on SA-Med2D-16M, the largest 2-dimensional medical image segmentation dataset to date, yielding a universal pretrained model for medical images.
- Score: 40.4422523499489
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Segment Anything Model (SAM) has demonstrated impressive performance on a wide range of natural image segmentation tasks. However, its performance significantly deteriorates when directly applied to medical domain, due to the remarkable differences between natural images and medical images. Some researchers have attempted to train SAM on large scale medical datasets. However, poor zero-shot performance is observed from the experimental results. In this context, inspired by the superior performance of U-Net-like models in medical image segmentation, we propose SAMUNet, a new foundation model which incorporates U-Net to the original SAM, to fully leverage the powerful contextual modeling ability of convolutions. To be specific, we parallel a convolutional branch in the image encoder, which is trained independently with the vision Transformer branch frozen. Additionally, we employ multi-scale fusion in the mask decoder, to facilitate accurate segmentation of objects with different scales. We train SAM-UNet on SA-Med2D-16M, the largest 2-dimensional medical image segmentation dataset to date, yielding a universal pretrained model for medical images. Extensive experiments are conducted to evaluate the performance of the model, and state-of-the-art result is achieved, with a dice similarity coefficient score of 0.883 on SA-Med2D-16M dataset. Specifically, in zero-shot segmentation experiments, our model not only significantly outperforms previous large medical SAM models across all modalities, but also substantially mitigates the performance degradation seen on unseen modalities. It should be highlighted that SAM-UNet is an efficient and extensible foundation model, which can be further fine-tuned for other downstream tasks in medical community. The code is available at https://github.com/Hhankyangg/sam-unet.
Related papers
- SAM-Guided Robust Representation Learning for One-Shot 3D Medical Image Segmentation [14.786629295011727]
One-shot medical image segmentation (MIS) is crucial for medical analysis due to the burden of medical experts on manual annotation.
Recent emergence of the segment anything model (SAM) has demonstrated remarkable adaptation in MIS but cannot be directly applied to one-shot medical image segmentation (MIS)
We propose a novel SAM-guided robust representation learning framework, named RRL-MedSAM, to adapt SAM to one-shot 3D MIS.
arXiv Detail & Related papers (2025-04-29T07:43:37Z) - DC-SAM: In-Context Segment Anything in Images and Videos via Dual Consistency [91.30252180093333]
We propose the Dual Consistency SAM (DCSAM) method based on prompttuning to adapt SAM and SAM2 for in-context segmentation.
Our key insights are to enhance the features of the SAM's prompt encoder in segmentation by providing high-quality visual prompts.
Although the proposed DC-SAM is primarily designed for images, it can be seamlessly extended to the video domain with the support SAM2.
arXiv Detail & Related papers (2025-04-16T13:41:59Z) - SynthFM: Training Modality-agnostic Foundation Models for Medical Image Segmentation without Real Medical Data [0.5242869847419834]
Foundation models like the Segment Anything Model (SAM) excel in zero-shot segmentation for natural images.
But they struggle with medical image segmentation due to differences in texture, contrast, and noise.
Annotating medical images is costly and requires domain expertise, limiting large-scale annotated data availability.
We propose SynthFM, a synthetic data generation framework that mimics the complexities of medical images.
arXiv Detail & Related papers (2025-04-11T00:14:28Z) - SAM-I2I: Unleash the Power of Segment Anything Model for Medical Image Translation [0.9626666671366836]
We propose SAM-I2I, a novel image-to-image translation framework based on the Segment Anything Model 2 (SAM2).
Our experiments on multi-contrast MRI datasets demonstrate that SAM-I2I outperforms state-of-the-art methods, offering more efficient and accurate medical image translation.
arXiv Detail & Related papers (2024-11-13T03:30:10Z) - DB-SAM: Delving into High Quality Universal Medical Image Segmentation [100.63434169944853]
We propose a dual-branch adapted SAM framework, named DB-SAM, to bridge the gap between natural and 2D/3D medical data.
Our proposed DB-SAM achieves an absolute gain of 8.8%, compared to a recent medical SAM adapter in the literature.
arXiv Detail & Related papers (2024-10-05T14:36:43Z) - Do Vision Foundation Models Enhance Domain Generalization in Medical Image Segmentation? [10.20366295974822]
We introduce a novel decode head architecture, HQHSAM, which simply integrates elements from two state-of-the-art decoder heads, HSAM and HQSAM, to enhance segmentation performance.
Our experiments on multiple datasets, encompassing various anatomies and modalities, reveal that FMs, particularly with the HQHSAM decode head, improve domain generalization for medical image segmentation.
arXiv Detail & Related papers (2024-09-12T11:41:35Z) - PMT: Progressive Mean Teacher via Exploring Temporal Consistency for Semi-Supervised Medical Image Segmentation [51.509573838103854]
We propose a semi-supervised learning framework, termed Progressive Mean Teachers (PMT), for medical image segmentation.
Our PMT generates high-fidelity pseudo labels by learning robust and diverse features in the training process.
Experimental results on two datasets with different modalities, i.e., CT and MRI, demonstrate that our method outperforms the state-of-the-art medical image segmentation approaches.
arXiv Detail & Related papers (2024-09-08T15:02:25Z) - SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation [51.90445260276897]
We prove that the Segment Anything Model 2 (SAM2) can be a strong encoder for U-shaped segmentation models.
We propose a simple but effective framework, termed SAM2-UNet, for versatile image segmentation.
arXiv Detail & Related papers (2024-08-16T17:55:38Z) - Improving Segment Anything on the Fly: Auxiliary Online Learning and Adaptive Fusion for Medical Image Segmentation [52.172885882728174]
In medical imaging contexts, it is not uncommon for human experts to rectify segmentations of specific test samples after SAM generates its segmentation predictions.
We introduce a novel approach that leverages the advantages of online machine learning to enhance Segment Anything (SA) during test time.
We employ rectified annotations to perform online learning, with the aim of improving the segmentation quality of SA on medical images.
arXiv Detail & Related papers (2024-06-03T03:16:25Z) - MAS-SAM: Segment Any Marine Animal with Aggregated Features [55.91291540810978]
We propose a novel feature learning framework named MAS-SAM for marine animal segmentation.
Our method enables to extract richer marine information from global contextual cues to fine-grained local details.
arXiv Detail & Related papers (2024-04-24T07:38:14Z) - Unleashing the Potential of SAM for Medical Adaptation via Hierarchical Decoding [15.401507589312702]
This paper introduces H-SAM, a prompt-free adaptation of the Segment Anything Model (SAM) for efficient fine-tuning of medical images.
In the initial stage, H-SAM employs SAM's original decoder to generate a prior probabilistic mask, guiding a more intricate decoding process.
Our H-SAM demonstrates a 4.78% improvement in average Dice compared to existing prompt-free SAM variants.
arXiv Detail & Related papers (2024-03-27T05:55:16Z) - Comprehensive Multimodal Segmentation in Medical Imaging: Combining
YOLOv8 with SAM and HQ-SAM Models [0.24578723416255752]
The proposed method harnesses the capabilities of the YOLOv8 model for approximate boundary box detection across modalities.
To generate boundary boxes, the YOLOv8 model was trained using a limited set of 100 images and masks from each modality.
A comparative analysis was conducted to assess the individual and combined performance of the YOLOv8, YOLOv8+SAM, and YOLOv8+HQ-SAM models.
arXiv Detail & Related papers (2023-10-04T20:30:49Z) - MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image
Segmentation [58.53672866662472]
We introduce a modality-agnostic SAM adaptation framework, named as MA-SAM.
Our method roots in the parameter-efficient fine-tuning strategy to update only a small portion of weight increments.
By injecting a series of 3D adapters into the transformer blocks of the image encoder, our method enables the pre-trained 2D backbone to extract third-dimensional information from input data.
arXiv Detail & Related papers (2023-09-16T02:41:53Z) - SAM-Med2D [34.82072231983896]
We introduce SAM-Med2D, the most comprehensive studies on applying SAM to medical 2D images.
We first collect and curate approximately 4.6M images and 19.7M masks from public and private datasets.
We fine-tune the encoder and decoder of the original SAM to obtain a well-performed SAM-Med2D.
arXiv Detail & Related papers (2023-08-30T17:59:02Z) - Customized Segment Anything Model for Medical Image Segmentation [10.933449793055313]
We build upon the large-scale image segmentation model, Segment Anything Model (SAM), to explore the new research paradigm of customizing large-scale models for medical image segmentation.
SAMed applies the low-rank-based (LoRA) finetuning strategy to the SAM image encoder and finetunes it together with the prompt encoder and the mask decoder on labeled medical image segmentation datasets.
Our trained SAMed model achieves semantic segmentation on medical images, which is on par with the state-of-the-art methods.
arXiv Detail & Related papers (2023-04-26T19:05:34Z) - Medical SAM Adapter: Adapting Segment Anything Model for Medical Image
Segmentation [51.770805270588625]
The Segment Anything Model (SAM) has recently gained popularity in the field of image segmentation.
Recent studies and individual experiments have shown that SAM underperforms in medical image segmentation.
We propose the Medical SAM Adapter (Med-SA), which incorporates domain-specific medical knowledge into the segmentation model.
arXiv Detail & Related papers (2023-04-25T07:34:22Z) - Input Augmentation with SAM: Boosting Medical Image Segmentation with
Segmentation Foundation Model [36.015065439244495]
The Segment Anything Model (SAM) is a recently developed large model for general-purpose segmentation for computer vision tasks.
SAM was trained using 11 million images with over 1 billion masks and can produce segmentation results for a wide range of objects in natural scene images.
This paper shows that although SAM does not immediately give high-quality segmentation for medical image data, its generated masks, features, and stability scores are useful for building and training better medical image segmentation models.
arXiv Detail & Related papers (2023-04-22T07:11:53Z) - Segment Anything Model for Medical Image Analysis: an Experimental Study [19.95972201734614]
Segment Anything Model (SAM) is a foundation model that is intended to segment user-defined objects of interest in an interactive manner.
We evaluate SAM's ability to segment medical images on a collection of 19 medical imaging datasets from various modalities and anatomies.
arXiv Detail & Related papers (2023-04-20T17:50:18Z) - TransUNet: Transformers Make Strong Encoders for Medical Image
Segmentation [78.01570371790669]
Medical image segmentation is an essential prerequisite for developing healthcare systems.
On various medical image segmentation tasks, the u-shaped architecture, also known as U-Net, has become the de-facto standard.
We propose TransUNet, which merits both Transformers and U-Net, as a strong alternative for medical image segmentation.
arXiv Detail & Related papers (2021-02-08T16:10:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.