FuguReport

Learning from Noisy Prompts: Saliency-Guided Prompt Distillation for Robust Segmentation with SAM

Authors Jingxuan Kang, Ziqi Zhang, Shaoming Zheng, Shuang Li, Uday Bharat Patel, Alexander Harry Fitzhugh, Phillip Lung, Yusuf Kiberu, Nikesh Jathanna, Shahnaz Jamil-Copley, Bernhard Kainz, Chen Qin
Affiliations National Health Service / Imperial College London / The University of Nottingham / Beihang University
Categories Method / Prompt Tuning / Saliency-guided prompt distillation techniques, Application / Medical Imaging / Robust segmentation in clinical settings, Evaluation / Model Adaptation Evaluation / Performance gains on region and boundary segmentation
License CC BY 4.0

Abstract Overview

This paper introduces Saliency-Guided Prompt Distillation (SPD), a two-stage framework for adapting the Segment Anything Model (SAM) to medical image segmentation when only noisy, non-task-specific prompts are available. In the first stage, a lightweight saliency head is trained alongside LoRA-adapted encoder features to learn anatomical priors from ground-truth masks, producing saliency maps that indicate plausible target locations. In the second stage, a Contextual Prompt Distillation (CPD) module validates local prompts against the saliency map, enriches them with cross-validated prompts from neighboring slices, and forms a consensus prompt set for SAM decoding. A Pairwise Slice Consistency (PSC) loss enforces anatomical coherence between adjacent slice predictions. The method is evaluated on four MRI and CT datasets, including a real clinical terminal ileum dataset with centerline prompts and three datasets with simulated noisy prompts.
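The second-stage prompt handling described above can be illustrated with a minimal sketch. It is not the authors' implementation: the threshold `tau`, the helper names `validate_prompts` and `consensus_prompts`, and the rule that a neighboring-slice prompt must be salient on both its own slice and the current slice are all assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[int, int]        # (row, col) pixel coordinate
Saliency = List[List[float]]   # 2D saliency map with values in [0, 1]


def validate_prompts(points: List[Point], saliency: Saliency,
                     tau: float = 0.5) -> List[Point]:
    """Keep only points that lie inside the map on saliency >= tau."""
    rows, cols = len(saliency), len(saliency[0])
    return [
        (r, c) for r, c in points
        if 0 <= r < rows and 0 <= c < cols and saliency[r][c] >= tau
    ]


def consensus_prompts(local_points: List[Point],
                      neighbor_points: List[List[Point]],
                      saliency_cur: Saliency,
                      saliency_nbrs: List[Saliency],
                      tau: float = 0.5) -> List[Point]:
    """Hypothetical consensus rule: validated local prompts, plus neighbor
    prompts that are salient on their own slice AND on the current slice."""
    consensus = validate_prompts(local_points, saliency_cur, tau)
    for pts, sal in zip(neighbor_points, saliency_nbrs):
        for p in validate_prompts(pts, sal, tau):          # check on neighbor slice
            on_current = validate_prompts([p], saliency_cur, tau)
            if on_current and p not in consensus:          # check on current slice
                consensus.append(p)
    return consensus


# Toy 4x4 saliency maps (made-up values, not data from the paper).
cur = [[0.0, 0.1, 0.0, 0.0],
       [0.1, 0.9, 0.8, 0.0],
       [0.0, 0.7, 0.9, 0.1],
       [0.0, 0.0, 0.1, 0.0]]
nbr = [[0.0, 0.0, 0.0, 0.0],
       [0.0, 0.8, 0.2, 0.0],
       [0.0, 0.9, 0.9, 0.0],
       [0.0, 0.0, 0.0, 0.0]]

result = consensus_prompts([(1, 1), (3, 3)], [[(2, 2), (1, 2), (0, 0)]], cur, [nbr])
print(result)
```

In this toy case the local prompt at (3, 3) is discarded because it lies on low saliency, and only the neighbor prompt at (2, 2) passes both checks.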

Novelty

The primary novelty is a framework explicitly designed for robustness to noisy prompts in SAM-based medical image segmentation, rather than assuming high-quality prompts or addressing noisy masks/labels. Its key contribution lies in combining saliency-based anatomical prior learning, a dual-validation cross-slice contextual prompt distillation mechanism, and a localized pairwise slice consistency loss to convert unreliable clinical prompts into consensus guidance.
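To make the idea of a localized slice-consistency term concrete, the sketch below penalizes disagreement between the predicted probability maps of two adjacent slices, optionally restricted to a local region. The summary does not give the actual PSC formulation, so the squared-difference form, the function name `psc_penalty`, and the `region` argument are assumptions.

```python
from typing import Iterable, List, Optional, Tuple

ProbMap = List[List[float]]  # per-pixel foreground probabilities in [0, 1]


def psc_penalty(pred_a: ProbMap, pred_b: ProbMap,
                region: Optional[Iterable[Tuple[int, int]]] = None) -> float:
    """Mean squared difference between two adjacent-slice predictions.

    `region` is an assumed way to localize the penalty: an iterable of
    (row, col) pixels to compare. If omitted, every pixel is compared.
    """
    if region is None:
        pixels = [(r, c) for r in range(len(pred_a)) for c in range(len(pred_a[0]))]
    else:
        pixels = list(region)
    if not pixels:
        return 0.0
    return sum((pred_a[r][c] - pred_b[r][c]) ** 2 for r, c in pixels) / len(pixels)


# Toy predictions for two neighboring slices (made-up values).
slice_k = [[1.0, 0.0]]
slice_k1 = [[0.0, 0.0]]
print(psc_penalty(slice_k, slice_k1))             # whole-slice penalty
print(psc_penalty(slice_k, slice_k1, [(0, 0)]))   # penalty on one local pixel
```

A differentiable version of the same idea would be written with a tensor library so that it can be added to the training loss; the plain-Python form here only shows the arithmetic.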

Results

Across four datasets (TI, Scar, FUMPE, KiTS), SPD improves significantly (p < 0.05, Wilcoxon signed-rank test) over every comparison method on TI, Scar, and FUMPE for all reported metrics, and attains the highest scores on KiTS. On the TI dataset, SPD reports 73.58 DSC and 23.94 HD95, an 11.08% DSC increase and a 6.28 HD95 reduction relative to the best competing method. Ablation studies show incremental gains from local prompt validation, CPD, and PSC. In zero-shot experiments, feeding consensus prompts to a frozen SAM instead of the full original centerline prompts improves performance by 14.2% DSC and 13.6% IoU.
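For readers unfamiliar with the significance test named above, the following is a minimal sketch of an exact two-sided Wilcoxon signed-rank test on paired scores. The pairing by case, the function name `wilcoxon_signed_rank`, and the score values are assumptions for illustration; the numbers are synthetic and are not results from the paper.

```python
from typing import List, Tuple


def wilcoxon_signed_rank(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Simplification: requires no zero differences and no tied |differences|.
    Returns (W, p), where W is the smaller of the two signed-rank sums.
    """
    diffs = [a - b for a, b in zip(x, y)]
    magnitudes = sorted(abs(d) for d in diffs)
    if any(d == 0 for d in diffs) or len(set(magnitudes)) != len(magnitudes):
        raise ValueError("zeros or ties are not handled in this sketch")

    rank = {m: i + 1 for i, m in enumerate(magnitudes)}   # ranks 1..n
    n = len(diffs)
    total = n * (n + 1) // 2
    w_plus = sum(rank[abs(d)] for d in diffs if d > 0)
    w = min(w_plus, total - w_plus)

    # counts[s] = number of sign assignments whose positive-rank sum equals s
    counts = [1] + [0] * total
    for r in range(1, n + 1):
        for s in range(total, r - 1, -1):
            counts[s] += counts[s - r]

    p = min(1.0, 2 * sum(counts[: w + 1]) / 2 ** n)
    return float(w), p


# Synthetic per-case DSC values for two methods (illustrative only).
method_a = [0.74, 0.71, 0.69, 0.78, 0.73, 0.76, 0.70, 0.75]
method_b = [0.62, 0.66, 0.61, 0.64, 0.70, 0.65, 0.68, 0.60]

w_stat, p_value = wilcoxon_signed_rank(method_a, method_b)
print(w_stat, p_value, p_value < 0.05)
```

With eight pairs that all favor the first method, the statistic is 0 and the exact p-value is 2/256, below the 0.05 threshold. Library routines such as `scipy.stats.wilcoxon` additionally handle ties and zero differences.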

Key Points

  1. SPD learns saliency-based anatomical priors via a lightweight head and uses them to filter noisy prompts on the current slice and cross-validate prompts from neighboring slices, forming a consensus prompt set before SAM decoding.
  2. The method targets a clinically realistic setting where inference-time prompts are imperfect, demonstrated with real centerline annotations on terminal ileum MRI and simulated noisy prompts (1 true positive point plus 2-5 random points) on three additional datasets.
  3. Experiments show statistically significant improvements over both conventional supervised baselines and SAM-based adaptations on three of the four datasets (TI, Scar, and FUMPE), with ablations confirming that each component (local prompt validation, contextual prompt distillation, and pairwise slice consistency) contributes to the overall gains.
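The simulated noisy-prompt protocol mentioned in the key points (one true positive point plus 2 to 5 random points) can be sketched as follows. The summary does not say how the points are drawn, so uniform sampling over the whole image for the random points, uniform sampling over the mask foreground for the true positive, and the function name `simulate_noisy_prompts` are assumptions.

```python
import random
from typing import List, Tuple

Point = Tuple[int, int]  # (row, col) pixel coordinate


def simulate_noisy_prompts(mask: List[List[int]], rng: random.Random,
                           min_noise: int = 2, max_noise: int = 5) -> List[Point]:
    """Return one point inside the mask followed by 2-5 random image points.

    Assumption: the random points are uniform over the image, so some may
    land on the target by chance.
    """
    rows, cols = len(mask), len(mask[0])
    foreground = [(r, c) for r in range(rows) for c in range(cols) if mask[r][c]]
    if not foreground:
        raise ValueError("mask has no foreground pixels")

    true_point = rng.choice(foreground)
    k = rng.randint(min_noise, max_noise)   # inclusive on both ends
    noise = [(rng.randrange(rows), rng.randrange(cols)) for _ in range(k)]
    return [true_point] + noise


# Toy 5x5 mask with a 2x2 target (made-up example).
toy_mask = [[0, 0, 0, 0, 0],
            [0, 1, 1, 0, 0],
            [0, 1, 1, 0, 0],
            [0, 0, 0, 0, 0],
            [0, 0, 0, 0, 0]]

prompts = simulate_noisy_prompts(toy_mask, random.Random(0))
print(prompts)
```

Passing a seeded `random.Random` instance keeps the simulated prompt sets reproducible across runs.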

This page was created using generative AI such as GPT-5, Claude Opus 4, Gemini 3, Gemini 3.1 Flash Image, and their higher-end successor versions. No guarantee can be made regarding its contents.