SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation
- URL: http://arxiv.org/abs/2308.08746v2
- Date: Thu, 21 Dec 2023 11:56:08 GMT
- Title: SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation
- Authors: Wenxi Yue, Jing Zhang, Kun Hu, Yong Xia, Jiebo Luo, Zhiyong Wang
- Abstract summary: We introduce SurgicalSAM, a novel end-to-end efficient-tuning approach for SAM to integrate surgical-specific information with SAM's pre-trained knowledge for improved generalisation.
Specifically, we propose a lightweight prototype-based class prompt encoder for tuning, which directly generates prompt embeddings from class prototypes.
In addition, to address the low inter-class variance among surgical instrument categories, we propose contrastive prototype learning.
- Score: 65.52097667738884
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Segment Anything Model (SAM) is a powerful foundation model that has
revolutionised image segmentation. To apply SAM to surgical instrument
segmentation, a common approach is to locate precise points or boxes of
instruments and then use them as prompts for SAM in a zero-shot manner.
However, we observe two problems with this naive pipeline: (1) the domain gap
between natural objects and surgical instruments leads to inferior
generalisation of SAM; and (2) SAM relies on precise point or box locations for
accurate segmentation, requiring either extensive manual guidance or a
well-performing specialist detector for prompt preparation, which leads to a
complex multi-stage pipeline. To address these problems, we introduce
SurgicalSAM, a novel end-to-end efficient-tuning approach for SAM to
effectively integrate surgical-specific information with SAM's pre-trained
knowledge for improved generalisation. Specifically, we propose a lightweight
prototype-based class prompt encoder for tuning, which directly generates
prompt embeddings from class prototypes and eliminates the use of explicit
prompts for improved robustness and a simpler pipeline. In addition, to address
the low inter-class variance among surgical instrument categories, we propose
contrastive prototype learning, further enhancing the discrimination of the
class prototypes for more accurate class prompting. The results of extensive
experiments on both EndoVis2018 and EndoVis2017 datasets demonstrate that
SurgicalSAM achieves state-of-the-art performance while only requiring a small
number of tunable parameters. The source code is available at
https://github.com/wenxi-yue/SurgicalSAM.
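To make the abstract's two ideas concrete, here is a minimal sketch of a prototype-based class prompt encoder, assuming a PyTorch setting. Everything below (the class name PrototypePromptEncoder, the single-token prompt, the attention-pooling scheme) is illustrative and hypothetical, not the authors' actual implementation: a learnable prototype per instrument class activates class-relevant regions of SAM's frozen image embedding, and the pooled activation is projected into a prompt embedding, replacing explicit point or box prompts.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PrototypePromptEncoder(nn.Module):
    """Hypothetical sketch: turn a learned class prototype into a prompt
    embedding for SAM's mask decoder, with no explicit point/box prompts."""

    def __init__(self, num_classes: int, embed_dim: int = 256):
        super().__init__()
        # One learnable prototype vector per instrument class
        # (e.g. the 7 instrument categories of EndoVis2018).
        self.prototypes = nn.Parameter(torch.randn(num_classes, embed_dim))
        # Lightweight projection from pooled features to a prompt token.
        self.to_prompt = nn.Linear(embed_dim, embed_dim)

    def forward(self, image_embed: torch.Tensor, class_id: torch.Tensor):
        # image_embed: (B, C, H, W) from SAM's frozen image encoder.
        proto = self.prototypes[class_id]                         # (B, C)
        # Similarity of the prototype to every spatial location,
        # highlighting class-relevant regions.
        sim = torch.einsum('bchw,bc->bhw', image_embed, proto)    # (B, H, W)
        attn = sim.flatten(1).softmax(dim=-1).view_as(sim)        # spatial attention
        # Pool the activated features and project to one prompt token.
        pooled = torch.einsum('bchw,bhw->bc', image_embed, attn)  # (B, C)
        return self.to_prompt(pooled).unsqueeze(1)                # (B, 1, C)

Contrastive prototype learning can be sketched in the same spirit as an InfoNCE-style objective that pulls per-instance features toward their own class prototype and pushes them away from the others; the paper's exact loss may differ. Continuing with the imports above:

def contrastive_prototype_loss(features, labels, prototypes, temperature=0.07):
    # features: (N, D) instance features; labels: (N,); prototypes: (K, D).
    feats = F.normalize(features, dim=-1)
    protos = F.normalize(prototypes, dim=-1)
    logits = feats @ protos.t() / temperature   # (N, K) cosine similarities
    # Cross-entropy with the ground-truth class as the positive prototype.
    return F.cross_entropy(logits, labels)

Under this reading, only the prototypes and the small projection would be tuned while SAM's heavy image encoder stays frozen, consistent with the abstract's claim of a small number of tunable parameters.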
Related papers
- ASPS: Augmented Segment Anything Model for Polyp Segmentation [77.25557224490075]
The Segment Anything Model (SAM) has introduced unprecedented potential for polyp segmentation.
SAM's Transformer-based structure prioritizes global and low-frequency information.
Its Cross-branch Feature Augmentation (CFA) module integrates a trainable CNN encoder branch with a frozen ViT encoder, enabling the injection of domain-specific knowledge.
arXiv Detail & Related papers (2024-06-30T14:55:32Z)
- Improving Segment Anything on the Fly: Auxiliary Online Learning and Adaptive Fusion for Medical Image Segmentation [52.172885882728174]
In medical imaging contexts, it is not uncommon for human experts to rectify SAM's segmentation predictions on specific test samples.
We introduce a novel approach that leverages the advantages of online machine learning to enhance Segment Anything (SA) during test time.
We employ rectified annotations to perform online learning, with the aim of improving the segmentation quality of SA on medical images.
arXiv Detail & Related papers (2024-06-03T03:16:25Z)
- AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning [61.666973416903005]
Segment Anything Model (SAM) has demonstrated its impressive generalization capabilities in open-world scenarios with the guidance of prompts.
We propose a novel framework, termed AlignSAM, designed to prompt SAM automatically and align it to an open context.
arXiv Detail & Related papers (2024-06-01T16:21:39Z)
- Surgical-DeSAM: Decoupling SAM for Instrument Segmentation in Robotic Surgery [9.466779367920049]
In safety-critical surgical tasks, interactive prompting is often infeasible because per-frame prompts are unavailable for supervised learning.
Prompting frame-by-frame is unrealistic in a real-time tracking application, and annotating prompts for offline applications is expensive.
We develop Surgical-DeSAM to generate automatic bounding box prompts for decoupling SAM to obtain instrument segmentation in real-time robotic surgery (a generic sketch of this detect-then-prompt pattern appears after this list).
arXiv Detail & Related papers (2024-04-22T09:53:55Z)
- SurgicalPart-SAM: Part-to-Whole Collaborative Prompting for Surgical Instrument Segmentation [66.21356751558011]
The Segment Anything Model (SAM) exhibits promise in generic object segmentation and offers potential for various applications.
Existing methods have applied SAM to surgical instrument segmentation (SIS) by tuning SAM-based frameworks with surgical data.
We propose SurgicalPart-SAM (SP-SAM), a novel SAM efficient-tuning approach that explicitly integrates instrument structure knowledge with SAM's generic knowledge.
arXiv Detail & Related papers (2023-12-22T07:17:51Z)
- Beyond Adapting SAM: Towards End-to-End Ultrasound Image Segmentation via Auto Prompting [10.308637269138146]
We propose SAMUS as a universal model tailored for ultrasound image segmentation.
We further enable it to work in an end-to-end manner denoted as AutoSAMUS.
AutoSAMUS is realized by introducing an auto prompt generator (APG) to replace the manual prompt encoder of SAMUS.
arXiv Detail & Related papers (2023-09-13T09:15:20Z)
- SAM Meets Robotic Surgery: An Empirical Study on Generalization, Robustness and Adaptation [15.995869434429274]
The Segment Anything Model (SAM) serves as a foundation model for semantic segmentation.
We examine SAM's robustness and zero-shot generalizability in the field of robotic surgery.
arXiv Detail & Related papers (2023-08-14T14:09:41Z)
- AdaptiveSAM: Towards Efficient Tuning of SAM for Surgical Scene Segmentation [49.59991322513561]
We propose an adaptive modification of Segment-Anything (SAM) that can adjust to new datasets quickly and efficiently.
AdaptiveSAM uses free-form text as its prompt and can segment the object of interest given just the label name.
Our experiments show that AdaptiveSAM outperforms current state-of-the-art methods on various medical imaging datasets.
arXiv Detail & Related papers (2023-08-07T17:12:54Z)
- SAM Meets Robotic Surgery: An Empirical Study in Robustness Perspective [21.2080716792596]
Segment Anything Model (SAM) is a foundation model for semantic segmentation.
We investigate the robustness and zero-shot generalizability of the SAM in the domain of robotic surgery.
arXiv Detail & Related papers (2023-04-28T08:06:33Z)
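Several of the papers above (Surgical-DeSAM, and in spirit AutoSAMUS) share a detect-then-prompt pattern: an automatic module proposes prompts so that SAM needs no manual clicks. Below is a generic, hypothetical sketch using the public segment-anything interfaces; the detector is a placeholder, and no specific paper's architecture is reproduced here.

import torch

@torch.no_grad()
def detect_then_segment(image, detector, sam):
    # image: a batch already preprocessed to SAM's expected resolution.
    # 1. A specialist detector predicts instrument bounding boxes
    #    (detector is a placeholder for any trained detection model).
    boxes = detector(image)                          # (N, 4) xyxy pixel coords
    # 2. Encode the image once with SAM's frozen image encoder.
    image_embed = sam.image_encoder(image)
    # 3. Feed the predicted boxes to SAM's prompt encoder in place of
    #    manual point/box annotations.
    sparse, dense = sam.prompt_encoder(points=None, boxes=boxes, masks=None)
    # 4. Decode one mask per box.
    masks, iou_pred = sam.mask_decoder(
        image_embeddings=image_embed,
        image_pe=sam.prompt_encoder.get_dense_pe(),
        sparse_prompt_embeddings=sparse,
        dense_prompt_embeddings=dense,
        multimask_output=False,
    )
    return masks, iou_pred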