SAM-FNet: SAM-Guided Fusion Network for Laryngo-Pharyngeal Tumor Detection
- URL: http://arxiv.org/abs/2408.05426v2
- Date: Thu, 15 Aug 2024 03:36:43 GMT
- Title: SAM-FNet: SAM-Guided Fusion Network for Laryngo-Pharyngeal Tumor Detection
- Authors: Jia Wei, Yun Li, Meiyu Qiu, Hongyu Chen, Xiaomao Fan, Wenbin Lei
- Abstract summary: We propose a novel SAM-guided fusion network (SAM-FNet) for laryngo-pharyngeal tumor detection.
By leveraging the powerful object segmentation capabilities of the Segment Anything Model (SAM), we incorporate SAM into SAM-FNet to accurately segment the lesion region.
Furthermore, we propose a GAN-like feature optimization (GFO) module to capture the discriminative features between the global and local branches.
- Score: 11.90977635214196
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Laryngo-pharyngeal cancer (LPC) is a highly fatal malignant disease affecting the head and neck region. Previous studies on endoscopic tumor detection, particularly those leveraging dual-branch network architectures, have shown significant advancements, highlighting the potential of dual-branch networks to improve diagnostic accuracy by effectively integrating global and local (lesion) feature extraction. However, these methods are still limited in their ability to accurately locate the lesion region and to capture the discriminative feature information between the global and local branches. To address these issues, we propose a novel SAM-guided fusion network (SAM-FNet), a dual-branch network for laryngo-pharyngeal tumor detection. Leveraging the powerful object segmentation capabilities of the Segment Anything Model (SAM), we incorporate SAM into SAM-FNet to accurately segment the lesion region. Furthermore, we propose a GAN-like feature optimization (GFO) module to capture the discriminative features between the global and local branches, enhancing the complementarity of the fused features. Additionally, we collect two LPC datasets from the First Affiliated Hospital (FAHSYSU) and the Sixth Affiliated Hospital (SAHSYSU) of Sun Yat-sen University. The FAHSYSU dataset serves as the internal dataset for training the model, while the SAHSYSU dataset serves as the external dataset for evaluating its performance. Extensive experiments on both FAHSYSU and SAHSYSU demonstrate that SAM-FNet achieves competitive results, outperforming state-of-the-art counterparts. The source code of SAM-FNet is available at https://github.com/VVJia/SAM-FNet.
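The abstract describes the architecture only in prose; the following is a minimal sketch of how such a SAM-guided dual-branch classifier could be wired up. It is not the authors' implementation (that is released at https://github.com/VVJia/SAM-FNet): the ResNet-50 backbones, the discriminator shape, and the use of a precomputed SAM lesion mask are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class SAMGuidedDualBranch(nn.Module):
    """Hedged sketch of a SAM-guided dual-branch fusion classifier."""
    def __init__(self, num_classes: int = 2, feat_dim: int = 2048):
        super().__init__()
        # Global branch encodes the whole endoscopic frame.
        self.global_enc = nn.Sequential(*list(models.resnet50(weights=None).children())[:-1])
        # Local branch encodes the frame masked to the SAM-segmented lesion region.
        self.local_enc = nn.Sequential(*list(models.resnet50(weights=None).children())[:-1])
        # Stand-in for the GFO module: a discriminator that tries to tell
        # global features from local ones (the GAN-like objective).
        self.discriminator = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(inplace=True), nn.Linear(256, 1))
        self.classifier = nn.Linear(feat_dim * 2, num_classes)

    def forward(self, image: torch.Tensor, lesion_mask: torch.Tensor):
        # image: (B, 3, H, W); lesion_mask: (B, 1, H, W) binary mask from SAM.
        g = self.global_enc(image).flatten(1)               # (B, feat_dim)
        l = self.local_enc(image * lesion_mask).flatten(1)  # (B, feat_dim)
        logits = self.classifier(torch.cat([g, l], dim=1))  # fused prediction
        # Branch-origin scores for the adversarial feature-optimization loss.
        return logits, self.discriminator(g), self.discriminator(l)

# Usage with placeholder inputs (a real pipeline would obtain lesion_mask from SAM):
model = SAMGuidedDualBranch()
x = torch.randn(4, 3, 224, 224)
m = (torch.rand(4, 1, 224, 224) > 0.5).float()
logits, d_g, d_l = model(x, m)
```

Under these assumptions, training would alternate two objectives: the discriminator learns to predict which branch a feature came from, while the encoders are optimized for classification and, adversarially, to keep the two branches' features complementary, mirroring the GAN-like role the abstract assigns to the GFO module.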
Related papers
- From Specialist to Generalist: Unlocking SAM's Learning Potential on Unlabeled Medical Images [12.062960289184199]
We introduce SC-SAM, a specialist-generalist framework where U-Net provides point-based prompts and pseudo-labels to guide SAM's adaptation.
This reciprocal guidance forms a bidirectional co-training loop that allows both models to effectively exploit the unlabeled data.
Our method achieves state-of-the-art results, outperforming other existing semi-supervised SAM variants and even medical foundation models like MedSAM.
arXiv Detail & Related papers (2026-01-25T18:13:48Z) - VesSAM: Efficient Multi-Prompting for Segmenting Complex Vessel [68.24765319399286]
We present VesSAM, a powerful and efficient framework tailored for 2D vessel segmentation.
VesSAM integrates (1) a convolutional adapter to enhance local texture features, (2) a multi-prompt encoder that fuses anatomical prompts, and (3) a lightweight mask decoder to reduce jagged artifacts.
VesSAM consistently outperforms state-of-the-art PEFT-based SAM variants by over 10% Dice and 13% IoU.
arXiv Detail & Related papers (2025-11-02T15:47:05Z) - U-DFA: A Unified DINOv2-Unet with Dual Fusion Attention for Multi-Dataset Medical Segmentation [1.1724961392643483]
We propose U-DFA, a unified DINOv2-Unet encoder-decoder architecture that integrates a novel Local-Global Fusion Adapter (LGFA) to enhance segmentation performance.
Our method achieves state-of-the-art performance on the Synapse and ACDC datasets with only 33% of the trainable model parameters.
arXiv Detail & Related papers (2025-10-01T07:06:49Z) - Fully Automated SAM for Single-source Domain Generalization in Medical Image Segmentation [3.7839630649682054]
FA-SAM is a single-source domain generalization framework for medical image segmentation that achieves fully automated SAM.
FA-SAM introduces two key innovations: an Auto-prompted Generation Model (AGM) branch equipped with a Shallow Feature Uncertainty Modeling (SUFM) module, and an Image-Prompt Embedding Fusion (IPEF) module integrated into the SAM mask decoder.
arXiv Detail & Related papers (2025-07-23T07:37:39Z) - Performance Analysis of Deep Learning Models for Femur Segmentation in MRI Scan [5.5193366921929155]
We evaluate and compare the performance of three CNN-based models, i.e., U-Net, Attention U-Net, and U-KAN, and one transformer-based model, SAM 2.
The dataset comprises 11,164 MRI scans with detailed annotations of femoral regions.
Attention U-Net achieves the highest overall scores, while U-KAN demonstrates superior performance in anatomical regions with a smaller region of interest.
arXiv Detail & Related papers (2025-04-05T05:47:56Z) - BiSeg-SAM: Weakly-Supervised Post-Processing Framework for Boosting Binary Segmentation in Segment Anything Models [6.74659948545092]
BiSeg-SAM is a weakly supervised prompting and boundary refinement network for the segmentation of polyps and skin lesions.
Our method demonstrates significant superiority over state-of-the-art (SOTA) methods when tested on five polyp datasets and one skin cancer dataset.
arXiv Detail & Related papers (2025-04-02T08:04:37Z) - Learnable Prompting SAM-induced Knowledge Distillation for Semi-supervised Medical Image Segmentation [47.789013598970925]
We propose a learnable prompting SAM-induced knowledge distillation framework (KnowSAM) for semi-supervised medical image segmentation.
Our model outperforms the state-of-the-art semi-supervised segmentation approaches.
arXiv Detail & Related papers (2024-12-18T11:19:23Z) - PolSAM: Polarimetric Scattering Mechanism Informed Segment Anything Model [76.95536611263356]
PolSAR data presents unique challenges due to its rich and complex characteristics.
Existing data representations, such as complex-valued data, polarimetric features, and amplitude images, are widely used.
Most feature extraction networks for PolSAR are small, limiting their ability to capture features effectively.
We propose the Polarimetric Scattering Mechanism-Informed SAM (PolSAM), an enhanced Segment Anything Model (SAM) that integrates domain-specific scattering characteristics and a novel prompt generation strategy.
arXiv Detail & Related papers (2024-12-17T09:59:53Z) - SAM-Swin: SAM-Driven Dual-Swin Transformers with Adaptive Lesion Enhancement for Laryngo-Pharyngeal Tumor Detection [12.86763797167925]
Laryngo-pharyngeal cancer (LPC) is a highly lethal malignancy in the head and neck region.
Recent advancements in tumor detection have significantly improved diagnostic accuracy by integrating global and local feature extraction.
We propose SAM-Swin, an innovative SAM-driven Dual-Swin Transformer for laryngo-pharyngeal tumor detection.
arXiv Detail & Related papers (2024-10-29T07:32:57Z) - DB-SAM: Delving into High Quality Universal Medical Image Segmentation [100.63434169944853]
We propose a dual-branch adapted SAM framework, named DB-SAM, to bridge the gap between natural and 2D/3D medical data.
Our proposed DB-SAM achieves an absolute gain of 8.8%, compared to a recent medical SAM adapter in the literature.
arXiv Detail & Related papers (2024-10-05T14:36:43Z) - ASPS: Augmented Segment Anything Model for Polyp Segmentation [77.25557224490075]
The Segment Anything Model (SAM) has introduced unprecedented potential for polyp segmentation.
SAM's Transformer-based structure prioritizes global and low-frequency information.
CFA integrates a trainable CNN encoder branch with a frozen ViT encoder, enabling the injection of domain-specific knowledge.
arXiv Detail & Related papers (2024-06-30T14:55:32Z) - Segment Anything Model-guided Collaborative Learning Network for Scribble-supervised Polyp Segmentation [45.15517909664628]
Polyp segmentation plays a vital role in accurately locating polyps at an early stage.
However, pixel-wise annotation of polyp images by physicians during diagnosis is both time-consuming and expensive.
We propose a novel SAM-guided Collaborative Learning Network (SAM-CLNet) for scribble-supervised polyp segmentation.
arXiv Detail & Related papers (2023-12-01T03:07:13Z) - Exploring SAM Ablations for Enhancing Medical Segmentation in Radiology and Pathology [2.5462695047893025]
The Segment Anything Model (SAM) has emerged as a promising framework for addressing segmentation challenges across different domains.
We explore the fine-tuning of SAM and assess its profound impact on the accuracy and reliability of segmentation results.
We aim to bridge the gap between advanced segmentation techniques and the demanding requirements of healthcare.
arXiv Detail & Related papers (2023-09-30T21:58:12Z) - SAMedOCT: Adapting Segment Anything Model (SAM) for Retinal OCT [3.2495192768429924]
The Segment Anything Model (SAM) has gained significant attention in the field of image segmentation.
We conduct a comprehensive evaluation of SAM and its adaptations on a large-scale public dataset of OCT scans from the RETOUCH challenge.
We showcase the adapted SAM's efficacy as a powerful segmentation model for retinal OCT scans, although it still lags behind established methods in some circumstances.
arXiv Detail & Related papers (2023-08-18T06:26:22Z) - SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation [65.52097667738884]
We introduce SurgicalSAM, a novel end-to-end efficient-tuning approach for SAM to integrate surgical-specific information with SAM's pre-trained knowledge for improved generalisation.
Specifically, we propose a lightweight prototype-based class prompt encoder for tuning, which directly generates prompt embeddings from class prototypes.
In addition, to address the low inter-class variance among surgical instrument categories, we propose contrastive prototype learning (see the sketch after this list).
arXiv Detail & Related papers (2023-08-17T02:51:01Z) - Federated Learning Enables Big Data for Rare Cancer Boundary Detection [98.5549882883963]
We present findings from the largest federated ML study to date, involving data from 71 healthcare institutions across 6 continents.
We generate an automatic tumor boundary detector for the rare disease of glioblastoma.
We demonstrate a 33% improvement over a publicly trained model in delineating the surgically targetable tumor, and a 23% improvement for the tumor's entire extent.
arXiv Detail & Related papers (2022-04-22T17:27:00Z) - Cross-Modality Brain Tumor Segmentation via Bidirectional Global-to-Local Unsupervised Domain Adaptation [61.01704175938995]
In this paper, we propose a novel Bidirectional Global-to-Local (BiGL) adaptation framework under a UDA scheme.
Specifically, a bidirectional image synthesis and segmentation module is proposed to segment the brain tumor.
The proposed method outperforms several state-of-the-art unsupervised domain adaptation methods by a large margin.
arXiv Detail & Related papers (2021-05-17T10:11:45Z) - Fader Networks for domain adaptation on fMRI: ABIDE-II study [68.5481471934606]
We use 3D convolutional autoencoders to build a domain-irrelevant latent-space image representation and demonstrate that this method outperforms existing approaches on ABIDE data.
arXiv Detail & Related papers (2020-10-14T16:50:50Z)
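To make the prototype-based class prompting idea in the SurgicalSAM entry above concrete, here is a minimal sketch. It is not the paper's implementation: the two-layer MLP prompt head, the number of prompt tokens, and the contrastive temperature are assumptions chosen for illustration, and the frozen SAM mask decoder that would consume the generated tokens is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrototypePromptEncoder(nn.Module):
    """Maps a learnable per-class prototype to SAM-style sparse prompt tokens."""
    def __init__(self, num_classes: int, proto_dim: int = 256,
                 prompt_dim: int = 256, tokens_per_class: int = 2):
        super().__init__()
        # One learnable prototype vector per instrument class.
        self.prototypes = nn.Parameter(torch.randn(num_classes, proto_dim))
        # Small MLP turns a prototype into a set of prompt embeddings.
        self.to_tokens = nn.Sequential(
            nn.Linear(proto_dim, prompt_dim), nn.ReLU(inplace=True),
            nn.Linear(prompt_dim, prompt_dim * tokens_per_class))
        self.tokens_per_class, self.prompt_dim = tokens_per_class, prompt_dim

    def forward(self, class_ids: torch.Tensor) -> torch.Tensor:
        tokens = self.to_tokens(self.prototypes[class_ids])   # (B, T*D)
        return tokens.view(-1, self.tokens_per_class, self.prompt_dim)

def contrastive_prototype_loss(feats: torch.Tensor, class_ids: torch.Tensor,
                               prototypes: torch.Tensor, tau: float = 0.1):
    # Pull each image feature (same dim as the prototypes) toward its class
    # prototype and away from the others; useful when inter-class variance
    # among instrument categories is low.
    logits = F.normalize(feats, dim=1) @ F.normalize(prototypes, dim=1).t() / tau
    return F.cross_entropy(logits, class_ids)
```

Under these assumptions, the generated tokens would be fed to a frozen SAM mask decoder in place of point or box prompt embeddings, so that only the lightweight prototype encoder needs tuning.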