Temporal-spatial Adaptation of Promptable SAM Enhance Accuracy and Generalizability of cine CMR Segmentation
- URL: http://arxiv.org/abs/2403.10009v2
- Date: Mon, 15 Jul 2024 18:22:28 GMT
- Title: Temporal-spatial Adaptation of Promptable SAM Enhance Accuracy and Generalizability of cine CMR Segmentation
- Authors: Zhennong Chen, Sekeun Kim, Hui Ren, Quanzheng Li, Xiang Li
- Abstract summary: We propose cineCMR-SAM, which incorporates both temporal and spatial information through a modified model architecture.
Our model achieved segmentation accuracy superior to data-specific models on the STACOM2011 dataset.
We introduce a text prompt feature in cineCMR-SAM to specify the view type of input slices.
- Score: 10.911758824265736
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate myocardium segmentation across all phases of one cardiac cycle in cine cardiac magnetic resonance (CMR) scans is crucial for comprehensive cardiac function analysis. Despite advancements in deep learning (DL) for automatic cine CMR segmentation, generalizability to unseen data remains a significant challenge. Recently, the segment-anything model (SAM) was introduced as a segmentation foundation model, known for its accurate segmentation and, more importantly, its zero-shot generalization. SAM was trained on two-dimensional (2D) natural images; to adapt it for comprehensive cine CMR segmentation, we propose cineCMR-SAM, which incorporates both temporal and spatial information through a modified model architecture. Compared to other state-of-the-art (SOTA) methods, our model achieved segmentation accuracy superior to data-specific models on the STACOM2011 dataset when fine-tuned on it, and demonstrated superior zero-shot generalization on two other large public datasets (ACDC and M&Ms) unseen during fine-tuning. Additionally, we introduced a text prompt feature in cineCMR-SAM to specify the view type of input slices (short-axis or long-axis), enhancing performance across all view types.
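The abstract names two additions to SAM: temporal-spatial modeling across the cine sequence and a text prompt naming the view type. A toy numpy sketch of what such conditioning could look like (function names, embedding sizes, and the placeholder prompt vectors are all hypothetical illustrations, not the paper's architecture):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_attention(frame_emb):
    """Self-attention across the time axis of per-frame embeddings.
    frame_emb: (T, D) array, one embedding per cine frame."""
    scores = frame_emb @ frame_emb.T / np.sqrt(frame_emb.shape[1])  # (T, T)
    return softmax(scores, axis=-1) @ frame_emb                     # (T, D)

# Hypothetical view-type text prompt: map the label to a vector that is
# added to every frame embedding before temporal attention. In a real
# model these would be learned; here they are fixed placeholders.
VIEW_PROMPTS = {
    "short-axis": np.full(64, 0.1),
    "long-axis":  np.full(64, -0.1),
}

def encode_cine(frame_emb, view):
    prompted = frame_emb + VIEW_PROMPTS[view]  # inject view-type prompt
    return temporal_attention(prompted)        # share info across phases

rng = np.random.default_rng(0)
frames = rng.normal(size=(25, 64))             # 25 cardiac phases
out = encode_cine(frames, "short-axis")
print(out.shape)                               # (25, 64)
```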
Related papers
- Atlas-Assisted Segment Anything Model for Fetal Brain MRI (FeTal-SAM) [14.57158980216096]
FeTal-SAM is a novel adaptation of the Segment Anything Model (SAM) tailored for fetal brain MRI segmentation.
By integrating atlas-based prompts and foundation-model principles, FeTal-SAM addresses two key limitations in fetal brain MRI segmentation.
arXiv Detail & Related papers (2026-01-22T08:49:33Z) - Enabling Ultra-Fast Cardiovascular Imaging Across Heterogeneous Clinical Environments with a Generalist Foundation Model and Multimodal Database [64.65360708629485]
MMCMR-427K is the largest and most comprehensive multimodal cardiovascular magnetic resonance k-space database.
CardioMM is a reconstruction foundation model capable of adapting to heterogeneous fast CMR imaging scenarios.
CardioMM unifies semantic contextual understanding with physics-informed data consistency to deliver robust reconstructions.
arXiv Detail & Related papers (2025-12-25T12:47:50Z) - VesSAM: Efficient Multi-Prompting for Segmenting Complex Vessel [68.24765319399286]
We present VesSAM, a powerful and efficient framework tailored for 2D vessel segmentation.
VesSAM integrates (1) a convolutional adapter to enhance local texture features, (2) a multi-prompt encoder that fuses anatomical prompts, and (3) a lightweight mask decoder to reduce jagged artifacts.
VesSAM consistently outperforms state-of-the-art PEFT-based SAM variants by over 10% Dice and 13% IoU.
arXiv Detail & Related papers (2025-11-02T15:47:05Z) - Extreme Cardiac MRI Analysis under Respiratory Motion: Results of the CMRxMotion Challenge [56.28872161153236]
Deep learning models have achieved state-of-the-art performance in automated Cardiac Magnetic Resonance (CMR) analysis.
The efficacy of these models is highly dependent on the availability of high-quality, artifact-free images.
To promote research in this domain, we organized the MICCAI CMRxMotion challenge.
arXiv Detail & Related papers (2025-07-25T11:12:21Z) - U-R-VEDA: Integrating UNET, Residual Links, Edge and Dual Attention, and Vision Transformer for Accurate Semantic Segmentation of CMRs [0.0]
We propose a deep learning based enhanced UNet model, U-R-Veda, which integrates convolution transformations, vision transformers, residual links, channel attention, and spatial attention.
The model significantly improves the semantic segmentation of cardiac magnetic resonance (CMR) images.
Performance results show that U-R-Veda achieves an average accuracy of 95.2%, based on DSC.
arXiv Detail & Related papers (2025-06-25T04:10:09Z) - Accelerating Volumetric Medical Image Annotation via Short-Long Memory SAM 2 [10.279314732888079]
Short-Long Memory SAM 2 (SLM-SAM 2) is a novel architecture that integrates distinct short-term and long-term memory banks to improve segmentation accuracy.
We evaluate SLM-SAM 2 on three public datasets covering organs, bones, and muscles across MRI and CT modalities.
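As a rough illustration of the short-/long-term memory-bank idea described in this entry (the class and parameter names are invented for this sketch, and SLM-SAM 2's actual banks store spatial feature maps, not integers):

```python
from collections import deque

class ShortLongMemory:
    """Toy sketch of separate short- and long-term memory banks:
    a small rolling window of recent slices plus sparse persistent
    keyframes, both consulted when segmenting the current slice."""

    def __init__(self, short_size=3, long_stride=5):
        self.short = deque(maxlen=short_size)  # most recent slices only
        self.long = []                         # sparse, persistent keyframes
        self.long_stride = long_stride
        self._step = 0

    def add(self, feature):
        self.short.append(feature)
        if self._step % self.long_stride == 0:  # keep every k-th slice forever
            self.long.append(feature)
        self._step += 1

    def context(self):
        # Segmentation of the current slice conditions on both banks.
        return list(self.long) + list(self.short)

mem = ShortLongMemory(short_size=3, long_stride=5)
for i in range(12):
    mem.add(i)
print(len(mem.short), len(mem.long))  # 3 recent + 3 keyframes (steps 0, 5, 10)
```

The design point is that the short bank tracks local slice-to-slice change while the long bank anchors the segmentation to stable reference slices, which is the stated motivation for splitting the memory.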
arXiv Detail & Related papers (2025-05-03T16:16:24Z) - SAM-Guided Robust Representation Learning for One-Shot 3D Medical Image Segmentation [14.786629295011727]
One-shot medical image segmentation (MIS) is crucial for medical analysis because it reduces the manual-annotation burden on medical experts.
The recently emerged segment anything model (SAM) has demonstrated remarkable adaptability in MIS but cannot be directly applied to the one-shot setting.
We propose a novel SAM-guided robust representation learning framework, named RRL-MedSAM, to adapt SAM to one-shot 3D MIS.
arXiv Detail & Related papers (2025-04-29T07:43:37Z) - RWKV-UNet: Improving UNet with Long-Range Cooperation for Effective Medical Image Segmentation [70.79072961974141]
We propose RWKV-UNet, a novel model that integrates the RWKV structure into the U-Net architecture.
This integration enhances the model's ability to capture long-range dependencies and improve contextual understanding.
Experiments on 11 benchmark datasets show that RWKV-UNet achieves state-of-the-art performance on various types of medical image segmentation tasks.
arXiv Detail & Related papers (2025-01-14T22:03:00Z) - PolSAM: Polarimetric Scattering Mechanism Informed Segment Anything Model [76.95536611263356]
PolSAR data presents unique challenges due to its rich and complex characteristics.
Existing data representations, such as complex-valued data, polarimetric features, and amplitude images, are widely used.
Most feature extraction networks for PolSAR are small, limiting their ability to capture features effectively.
We propose the Polarimetric Scattering Mechanism-Informed SAM (PolSAM), an enhanced Segment Anything Model (SAM) that integrates domain-specific scattering characteristics and a novel prompt generation strategy.
arXiv Detail & Related papers (2024-12-17T09:59:53Z) - Assessing Foundational Medical 'Segment Anything' (Med-SAM1, Med-SAM2) Deep Learning Models for Left Atrial Segmentation in 3D LGE MRI [0.0]
Atrial fibrillation (AF) is the most common cardiac arrhythmia and is associated with heart failure and stroke.
Deep learning models, such as the Segment Anything Model (SAM), pre-trained on diverse datasets, have demonstrated promise in generic segmentation tasks.
Despite the potential of the MedSAM model, it has not yet been evaluated for the complex task of LA segmentation in 3D LGE-MRI.
arXiv Detail & Related papers (2024-11-08T20:49:54Z) - Continuous Spatio-Temporal Memory Networks for 4D Cardiac Cine MRI Segmentation [5.542949496418442]
We propose a continuous STM (CSTM) network for semi-supervised whole heart and whole sequence cMR segmentation.
Our network takes full advantage of the spatial, scale, temporal, and through-plane continuity priors of the underlying heart structures to achieve accurate and fast 4D segmentation.
arXiv Detail & Related papers (2024-10-30T16:45:59Z) - MedCLIP-SAMv2: Towards Universal Text-Driven Medical Image Segmentation [2.2585213273821716]
We introduce MedCLIP-SAMv2, a novel framework that integrates the CLIP and SAM models to perform segmentation on clinical scans.
Our approach includes fine-tuning the BiomedCLIP model with a new Decoupled Hard Negative Noise Contrastive Estimation (DHN-NCE) loss.
We also investigate using zero-shot segmentation labels within a weakly supervised paradigm to enhance segmentation quality further.
arXiv Detail & Related papers (2024-09-28T23:10:37Z) - Novel adaptation of video segmentation to 3D MRI: efficient zero-shot knee segmentation with SAM2 [1.6237741047782823]
We introduce a method for zero-shot, single-prompt segmentation of 3D knee MRI by adapting Segment Anything Model 2.
By treating slices from 3D medical volumes as individual video frames, we leverage SAM2's advanced capabilities to generate motion- and spatially-aware predictions.
We demonstrate that SAM2 can efficiently perform segmentation tasks in a zero-shot manner with no additional training or fine-tuning.
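The slices-as-frames idea can be illustrated without SAM2 itself; below, a crude intensity-continuity rule stands in for SAM2's learned memory-based propagation (all function names and the threshold are hypothetical assumptions for this sketch):

```python
import numpy as np

def volume_to_frames(volume):
    """Treat each slice of a 3D volume (D, H, W) as one 'video frame'."""
    return [volume[i] for i in range(volume.shape[0])]

def propagate_mask(frames, prompt_idx, prompt_mask, thresh=0.5):
    """Toy stand-in for SAM2's memory propagation: carry a mask outward
    from the single prompted slice, keeping only pixels whose intensity
    stays close to the previous slice's masked mean (a crude
    spatial-continuity prior; the threshold is arbitrary)."""
    masks = {prompt_idx: prompt_mask}
    for direction in (1, -1):            # propagate forward, then backward
        mask, i = prompt_mask, prompt_idx
        while 0 <= i + direction < len(frames):
            i += direction
            mean = frames[i - direction][mask].mean() if mask.any() else 0.0
            mask = mask & (np.abs(frames[i] - mean) < thresh)
            masks[i] = mask
    return masks

# Synthetic volume: a bright 4x4 structure running through all 5 slices.
volume = np.zeros((5, 8, 8))
volume[:, 2:6, 2:6] = 1.0
frames = volume_to_frames(volume)
masks = propagate_mask(frames, prompt_idx=2, prompt_mask=frames[2] > 0.5)
print(sorted(masks))  # [0, 1, 2, 3, 4] -- one mask per slice from one prompt
```

This mirrors the paper's workflow shape only: one prompt on one slice yields a mask for every slice of the volume; SAM2 replaces the intensity rule with learned, motion- and spatially-aware propagation.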
arXiv Detail & Related papers (2024-08-08T21:39:15Z) - ASPS: Augmented Segment Anything Model for Polyp Segmentation [77.25557224490075]
The Segment Anything Model (SAM) has introduced unprecedented potential for polyp segmentation.
SAM's Transformer-based structure prioritizes global and low-frequency information.
CFA integrates a trainable CNN encoder branch with a frozen ViT encoder, enabling the integration of domain-specific knowledge.
arXiv Detail & Related papers (2024-06-30T14:55:32Z) - Segment anything model for head and neck tumor segmentation with CT, PET and MRI multi-modality images [0.04924932828166548]
This study investigates the Segment Anything Model (SAM), recognized for requiring minimal human prompting.
We specifically examine MedSAM, a version of SAM fine-tuned with large-scale public medical images.
Our study demonstrates that fine-tuning SAM significantly enhances its segmentation accuracy, building upon the already effective zero-shot results.
arXiv Detail & Related papers (2024-02-27T12:26:45Z) - Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation [63.15257949821558]
Referring Remote Sensing Image Segmentation (RRSIS) is a new challenge that combines computer vision and natural language processing.
Traditional Referring Image Segmentation (RIS) approaches have been impeded by the complex spatial scales and orientations found in aerial imagery.
We introduce the Rotated Multi-Scale Interaction Network (RMSIN), an innovative approach designed for the unique demands of RRSIS.
arXiv Detail & Related papers (2023-12-19T08:14:14Z) - CMRxRecon: An open cardiac MRI dataset for the competition of accelerated image reconstruction [62.61209705638161]
There has been growing interest in deep learning-based CMR imaging algorithms.
Deep learning methods require large training datasets.
This dataset includes multi-contrast, multi-view, multi-slice and multi-coil CMR imaging data from 300 subjects.
arXiv Detail & Related papers (2023-09-19T15:14:42Z) - MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image Segmentation [58.53672866662472]
We introduce a modality-agnostic SAM adaptation framework, named MA-SAM.
Our method roots in the parameter-efficient fine-tuning strategy to update only a small portion of weight increments.
By injecting a series of 3D adapters into the transformer blocks of the image encoder, our method enables the pre-trained 2D backbone to extract third-dimensional information from input data.
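A minimal numpy sketch of a bottleneck adapter that mixes information across the depth (slice) axis, in the spirit of the 3D adapters described above (the shapes, the 3-tap depth mix with wrap-around, and the weight scaling are illustrative assumptions, not MA-SAM's actual design):

```python
import numpy as np

def adapter_3d(tokens, w_down, w_up):
    """Bottleneck-adapter sketch. tokens: (D, N, C) =
    depth x tokens-per-slice x channels.

    Down-project the channels, mix each slice with its neighbors along
    the depth axis (simple 3-tap average; wrap-around at the ends is a
    simplification), apply ReLU, up-project, and add the residual, so a
    frozen 2D backbone gains through-plane context from a few extra
    trainable weights."""
    h = tokens @ w_down                                            # (D, N, r)
    mixed = (np.roll(h, 1, axis=0) + h + np.roll(h, -1, axis=0)) / 3.0
    h = np.maximum(mixed, 0.0)                                     # ReLU
    return tokens + h @ w_up                                       # residual

rng = np.random.default_rng(0)
D, N, C, r = 4, 16, 32, 8            # 4 slices, 16 tokens each, rank-8 bottleneck
tokens = rng.normal(size=(D, N, C))
w_down = rng.normal(size=(C, r)) * 0.1
w_up = rng.normal(size=(r, C)) * 0.1
out = adapter_3d(tokens, w_down, w_up)
print(out.shape)  # (4, 16, 32)
```

Because only `w_down` and `w_up` would be trained, this matches the parameter-efficient pattern the summary describes: the pre-trained 2D weights stay frozen while small adapters carry the third-dimensional information.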
arXiv Detail & Related papers (2023-09-16T02:41:53Z) - 3DSAM-adapter: Holistic adaptation of SAM from 2D to 3D for promptable tumor segmentation [52.699139151447945]
We propose a novel adaptation method for transferring the segment anything model (SAM) from 2D to 3D for promptable medical image segmentation.
Our model outperforms domain state-of-the-art medical image segmentation models on 3 out of 4 tasks, specifically by 8.25%, 29.87%, and 10.11% for kidney tumor, pancreas tumor, and colon cancer segmentation, and achieves similar performance for liver tumor segmentation.
arXiv Detail & Related papers (2023-06-23T12:09:52Z) - Improving Classification Model Performance on Chest X-Rays through Lung Segmentation [63.45024974079371]
We propose a deep learning approach to enhance abnormal chest x-ray (CXR) identification performance through segmentations.
Our approach is designed in a cascaded manner and incorporates two modules: a deep neural network with criss-cross attention modules (XLSor) for localizing lung region in CXR images and a CXR classification model with a backbone of a self-supervised momentum contrast (MoCo) model pre-trained on large-scale CXR data sets.
arXiv Detail & Related papers (2022-02-22T15:24:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.