Multi-Sequence Parotid Gland Lesion Segmentation via Expert Text-Guided Segment Anything Model
- URL: http://arxiv.org/abs/2508.09645v1
- Date: Wed, 13 Aug 2025 09:25:29 GMT
- Title: Multi-Sequence Parotid Gland Lesion Segmentation via Expert Text-Guided Segment Anything Model
- Authors: Zhongyuan Wu, Chuan-Xian Ren, Yu Wang, Xiaohua Ban, Jianning Xiao, Xiaohui Duan,
- Abstract summary: The parotid gland segment anything model (PG-SAM) incorporates expert domain knowledge for cross-sequence parotid gland lesion segmentation.<n>The PG-SAM achieves state-of-the-art performance in parotid gland lesion segmentation across three independent clinical centers.
- Score: 10.892625199593644
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Parotid gland lesion segmentation is essential for the treatment of parotid gland diseases. However, due to the variable size and complex lesion boundaries, accurate parotid gland lesion segmentation remains challenging. Recently, the Segment Anything Model (SAM) fine-tuning has shown remarkable performance in the field of medical image segmentation. Nevertheless, SAM's interaction segmentation model relies heavily on precise lesion prompts (points, boxes, masks, etc.), which are very difficult to obtain in real-world applications. Besides, current medical image segmentation methods are automatically generated, ignoring the domain knowledge of medical experts when performing segmentation. To address these limitations, we propose the parotid gland segment anything model (PG-SAM), an expert diagnosis text-guided SAM incorporating expert domain knowledge for cross-sequence parotid gland lesion segmentation. Specifically, we first propose an expert diagnosis report guided prompt generation module that can automatically generate prompt information containing the prior domain knowledge to guide the subsequent lesion segmentation process. Then, we introduce a cross-sequence attention module, which integrates the complementary information of different modalities to enhance the segmentation effect. Finally, the multi-sequence image features and generated prompts are feed into the decoder to get segmentation result. Experimental results demonstrate that PG-SAM achieves state-of-the-art performance in parotid gland lesion segmentation across three independent clinical centers, validating its clinical applicability and the effectiveness of diagnostic text for enhancing image segmentation in real-world clinical settings.
Related papers
- TGC-Net: A Structure-Aware and Semantically-Aligned Framework for Text-Guided Medical Image Segmentation [56.09179939570486]
We propose TGC-Net, a CLIP-based framework focusing on parameter-efficient, task-specific adaptations.<n>TGC-Net achieves state-of-the-art performance with substantially fewer trainable parameters, including notable Dice gains on challenging benchmarks.
arXiv Detail & Related papers (2025-12-24T12:06:26Z) - Dynamically evolving segment anything model with continuous learning for medical image segmentation [50.92344083895528]
We introduce EvoSAM, a dynamically evolving medical image segmentation model.<n>EvoSAM continuously accumulates new knowledge from an ever-expanding array of scenarios and tasks.<n>Experiments conducted by surgical clinicians on blood vessel segmentation confirm that EvoSAM enhances segmentation efficiency based on user prompts.
arXiv Detail & Related papers (2025-03-08T14:37:52Z) - MGFI-Net: A Multi-Grained Feature Integration Network for Enhanced Medical Image Segmentation [0.3108011671896571]
A major challenge in medical image segmentation is achieving accurate delineation of regions of interest in the presence of noise, low contrast, or complex anatomical structures.<n>Existing segmentation models often neglect the integration of multi-grained information and fail to preserve edge details.<n>We propose a novel image semantic segmentation model called the Multi-Grained Feature Integration Network (MGFI-Net)<n>Our MGFI-Net is designed with two dedicated modules to tackle these issues.
arXiv Detail & Related papers (2025-02-19T15:24:34Z) - MedCLIP-SAMv2: Towards Universal Text-Driven Medical Image Segmentation [2.2585213273821716]
We introduce MedCLIP-SAMv2, a novel framework that integrates the CLIP and SAM models to perform segmentation on clinical scans.<n>Our approach includes fine-tuning the BiomedCLIP model with a new Decoupled Hard Negative Noise Contrastive Estimation (DHN-NCE) loss.<n>We also investigate using zero-shot segmentation labels within a weakly supervised paradigm to enhance segmentation quality further.
arXiv Detail & Related papers (2024-09-28T23:10:37Z) - MedCLIP-SAM: Bridging Text and Image Towards Universal Medical Image Segmentation [2.2585213273821716]
We propose a novel framework, called MedCLIP-SAM, that combines CLIP and SAM models to generate segmentation of clinical scans.
By extensively testing three diverse segmentation tasks and medical image modalities, our proposed framework has demonstrated excellent accuracy.
arXiv Detail & Related papers (2024-03-29T15:59:11Z) - Mask-Enhanced Segment Anything Model for Tumor Lesion Semantic Segmentation [48.107348956719775]
We introduce Mask-Enhanced SAM (M-SAM), an innovative architecture tailored for 3D tumor lesion segmentation.
We propose a novel Mask-Enhanced Adapter (MEA) within M-SAM that enriches the semantic information of medical images with positional data from coarse segmentation masks.
Our M-SAM achieves high segmentation accuracy and also exhibits robust generalization.
arXiv Detail & Related papers (2024-03-09T13:37:02Z) - CARE: A Large Scale CT Image Dataset and Clinical Applicable Benchmark
Model for Rectal Cancer Segmentation [8.728236864462302]
Rectal cancer segmentation of CT image plays a crucial role in timely clinical diagnosis, radiotherapy treatment, and follow-up.
These obstacles arise from the intricate anatomical structures of the rectum and the difficulties in performing differential diagnosis of rectal cancer.
To address these issues, this work introduces a novel large scale rectal cancer CT image dataset CARE with pixel-level annotations for both normal and cancerous rectum.
We also propose a novel medical cancer lesion segmentation benchmark model named U-SAM.
The model is specifically designed to tackle the challenges posed by the intricate anatomical structures of abdominal organs by incorporating prompt information.
arXiv Detail & Related papers (2023-08-16T10:51:27Z) - Self-Supervised Correction Learning for Semi-Supervised Biomedical Image
Segmentation [84.58210297703714]
We propose a self-supervised correction learning paradigm for semi-supervised biomedical image segmentation.
We design a dual-task network, including a shared encoder and two independent decoders for segmentation and lesion region inpainting.
Experiments on three medical image segmentation datasets for different tasks demonstrate the outstanding performance of our method.
arXiv Detail & Related papers (2023-01-12T08:19:46Z) - Reliable Joint Segmentation of Retinal Edema Lesions in OCT Images [55.83984261827332]
In this paper, we propose a novel reliable multi-scale wavelet-enhanced transformer network.
We develop a novel segmentation backbone that integrates a wavelet-enhanced feature extractor network and a multi-scale transformer module.
Our proposed method achieves better segmentation accuracy with a high degree of reliability as compared to other state-of-the-art segmentation approaches.
arXiv Detail & Related papers (2022-12-01T07:32:56Z) - Anatomy-guided Multimodal Registration by Learning Segmentation without
Ground Truth: Application to Intraprocedural CBCT/MR Liver Segmentation and
Registration [12.861503169117208]
Multimodal image registration has many applications in diagnostic medical imaging and image-guided interventions.
The ability to register peri-procedurally acquired diagnostic images into the intraprocedural environment can potentially improve the intra-procedural tumor targeting.
We propose an anatomy-preserving domain adaptation to segmentation network (APA2Seg-Net) for learning segmentation without target modality ground truth.
arXiv Detail & Related papers (2021-04-14T18:07:03Z) - Robust Medical Instrument Segmentation Challenge 2019 [56.148440125599905]
Intraoperative tracking of laparoscopic instruments is often a prerequisite for computer and robotic-assisted interventions.
Our challenge was based on a surgical data set comprising 10,040 annotated images acquired from a total of 30 surgical procedures.
The results confirm the initial hypothesis, namely that algorithm performance degrades with an increasing domain gap.
arXiv Detail & Related papers (2020-03-23T14:35:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.