SAM Meets Robotic Surgery: An Empirical Study in Robustness Perspective
- URL: http://arxiv.org/abs/2304.14674v1
- Date: Fri, 28 Apr 2023 08:06:33 GMT
- Title: SAM Meets Robotic Surgery: An Empirical Study in Robustness Perspective
- Authors: An Wang, Mobarakol Islam, Mengya Xu, Yang Zhang, Hongliang Ren
- Abstract summary: Segment Anything Model (SAM) is a foundation model for semantic segmentation.
We investigate the robustness and zero-shot generalizability of SAM in the domain of robotic surgery.
- Score: 21.2080716792596
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Segment Anything Model (SAM) is a foundation model for semantic segmentation
and shows excellent generalization capability with prompts. In this
empirical study, we investigate the robustness and zero-shot generalizability
of SAM in the domain of robotic surgery in various settings of (i) prompted
vs. unprompted; (ii) bounding box vs. point-based prompts; (iii) generalization
under corruptions and perturbations with five severity levels; and (iv)
state-of-the-art supervised models vs. SAM. We conduct all experiments on
two well-known robotic instrument segmentation datasets from the MICCAI EndoVis 2017
and 2018 challenges. Our extensive evaluation results reveal that although SAM
shows remarkable zero-shot generalization ability with bounding box prompts, it
struggles to segment the whole instrument with point-based prompts and in
unprompted settings. Furthermore, our qualitative figures demonstrate that the
model either fails to predict parts of the instrument mask (e.g., jaws, wrist)
or assigns parts of the instrument to different classes when instruments
overlap within the same bounding box or under point-based prompts. It is also
unable to identify instruments in complex surgical scenarios involving blood,
reflection, blur, and shade. Additionally,
SAM is insufficiently robust to maintain high performance when subjected to
various forms of data corruption. Therefore, we can argue that SAM is not ready
for downstream surgical tasks without further domain-specific fine-tuning.
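To make settings (i) and (ii) concrete, below is a minimal sketch of the prompted zero-shot setup using the public `segment_anything` API, together with the IoU metric commonly used for instrument segmentation. The checkpoint filename follows the official SAM release; the image path, prompt coordinates, and ground-truth comparison are illustrative assumptions, not values from the paper.

```python
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Load the ViT-H SAM checkpoint (filename from the official release).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

# Hypothetical surgical frame; SAM expects RGB uint8 input.
image = cv2.cvtColor(cv2.imread("endovis_frame.png"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# Setting (ii), bounding-box prompt: one xyxy box around an instrument.
box = np.array([100, 150, 420, 380])
box_masks, box_scores, _ = predictor.predict(box=box, multimask_output=False)

# Setting (ii), point-based prompt: a single foreground click on the instrument.
pt_masks, pt_scores, _ = predictor.predict(
    point_coords=np.array([[260, 265]]),
    point_labels=np.array([1]),  # 1 = foreground, 0 = background
    multimask_output=False,
)

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection-over-Union between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter) / max(float(union), 1.0)
```

The unprompted setting (i) corresponds to SAM's automatic mask generation (`SamAutomaticMaskGenerator`), which segments everything in the frame without any instrument-specific hint.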
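For the robustness setting (iii), the abstract only states that corruptions and perturbations are applied at five severity levels; a common way to reproduce such a sweep is the ImageNet-C-style `imagecorruptions` package, assumed here for illustration. The sketch reuses `image`, `box`, `predictor`, and `iou` from the block above; the ground-truth mask path is a placeholder.

```python
import numpy as np
from imagecorruptions import corrupt, get_corruption_names

gt_mask = np.load("endovis_gt_mask.npy")  # placeholder: binary ground-truth mask

results = {}
for name in get_corruption_names():    # e.g. gaussian_noise, motion_blur, fog, ...
    for severity in range(1, 6):       # the five severity levels
        corrupted = corrupt(image, corruption_name=name, severity=severity)
        predictor.set_image(corrupted)
        masks, _, _ = predictor.predict(box=box, multimask_output=False)
        results[(name, severity)] = iou(masks[0], gt_mask)

# Averaging results per severity level shows how quickly performance degrades.
```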
Related papers
- Inspiring the Next Generation of Segment Anything Models: Comprehensively Evaluate SAM and SAM 2 with Diverse Prompts Towards Context-Dependent Concepts under Different Scenes [63.966251473172036]
The foundational model SAM has influenced multiple fields within computer vision, and its upgraded version, SAM 2, enhances capabilities in video segmentation.
While SAMs have demonstrated excellent performance in segmenting context-independent concepts like people, cars, and roads, they overlook more challenging context-dependent (CD) concepts, such as visual saliency, camouflage, product defects, and medical lesions.
We conduct a thorough quantitative evaluation of SAMs on 11 CD concepts across 2D and 3D images and videos in various visual modalities within natural, medical, and industrial scenes.
arXiv Detail & Related papers (2024-12-02T08:03:56Z)
- Adapting Segment Anything Model for Unseen Object Instance Segmentation [70.60171342436092]
Unseen Object Instance Segmentation (UOIS) is crucial for autonomous robots operating in unstructured environments.
We propose UOIS-SAM, a data-efficient solution for the UOIS task.
UOIS-SAM integrates two key components: (i) a Heatmap-based Prompt Generator (HPG) to generate class-agnostic point prompts with precise foreground prediction, and (ii) a Hierarchical Discrimination Network (HDNet) that adapts SAM's mask decoder.
arXiv Detail & Related papers (2024-09-23T19:05:50Z)
- SAM 2 in Robotic Surgery: An Empirical Evaluation for Robustness and Generalization in Surgical Video Segmentation [13.609341065893739]
This study explores the zero-shot segmentation performance of SAM 2 in robot-assisted surgery based on prompts.
We employ two forms of prompts, 1-point and bounding box; for video sequences, the 1-point prompt is applied to the initial frame (see the sketch after this list).
Results with point prompts also show a substantial improvement over SAM's capabilities, nearing or even surpassing existing unprompted SOTA methods.
arXiv Detail & Related papers (2024-08-08T17:08:57Z)
- Performance Evaluation of Segment Anything Model with Variational Prompting for Application to Non-Visible Spectrum Imagery [15.748043194987075]
This work assesses the Segment Anything Model's capabilities in segmenting objects of interest in the X-ray/infrared modalities.
Our results show that SAM can segment objects in the X-ray modality when given a box prompt, but its performance varies for point prompts.
We find that infrared objects are also challenging to segment with point prompts given the low-contrast nature of this modality.
arXiv Detail & Related papers (2024-04-18T16:04:14Z)
- SurgicalPart-SAM: Part-to-Whole Collaborative Prompting for Surgical Instrument Segmentation [66.21356751558011]
The Segment Anything Model (SAM) exhibits promise in generic object segmentation and offers potential for various applications.
Existing methods have applied SAM to surgical instrument segmentation (SIS) by tuning SAM-based frameworks with surgical data.
We propose SurgicalPart-SAM (SP-SAM), a novel SAM efficient-tuning approach that explicitly integrates instrument structure knowledge with SAM's generic knowledge.
arXiv Detail & Related papers (2023-12-22T07:17:51Z)
- Boosting Segment Anything Model Towards Open-Vocabulary Learning [69.24734826209367]
Segment Anything Model (SAM) has emerged as a new paradigmatic vision foundation model.
Despite SAM finding applications and adaptations in various domains, its primary limitation lies in the inability to grasp object semantics.
We present Sambor to seamlessly integrate SAM with the open-vocabulary object detector in an end-to-end framework.
arXiv Detail & Related papers (2023-12-06T17:19:00Z)
- SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation [65.52097667738884]
We introduce SurgicalSAM, a novel end-to-end efficient-tuning approach for SAM to integrate surgical-specific information with SAM's pre-trained knowledge for improved generalisation.
Specifically, we propose a lightweight prototype-based class prompt encoder for tuning, which directly generates prompt embeddings from class prototypes.
In addition, to address the low inter-class variance among surgical instrument categories, we propose contrastive prototype learning.
arXiv Detail & Related papers (2023-08-17T02:51:01Z)
- SAM Meets Robotic Surgery: An Empirical Study on Generalization, Robustness and Adaptation [15.995869434429274]
The Segment Anything Model (SAM) serves as a foundation model for semantic segmentation.
We examine SAM's robustness and zero-shot generalizability in the field of robotic surgery.
arXiv Detail & Related papers (2023-08-14T14:09:41Z)
- On the Robustness of Segment Anything [46.669794757467166]
We aim to study the testing-time robustness of SAM under adversarial scenarios and common corruptions.
We find that SAM exhibits remarkable robustness against various corruptions, except for blur-related corruption.
arXiv Detail & Related papers (2023-05-25T16:28:30Z)
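Regarding the SAM 2 evaluation above, which applies a 1-point prompt to the initial frame of each video sequence, the public `sam2` video predictor exposes roughly the following workflow. The config/checkpoint names follow the official SAM 2 release, while the frame directory, click location, and object id are illustrative assumptions.

```python
import numpy as np
import torch
from sam2.build_sam import build_sam2_video_predictor

predictor = build_sam2_video_predictor("sam2_hiera_l.yaml", "sam2_hiera_large.pt")

with torch.inference_mode():
    # Placeholder: a directory of extracted video frames.
    state = predictor.init_state(video_path="surgery_clip_frames/")

    # 1-point prompt on the initial frame (frame_idx=0) for one instrument.
    predictor.add_new_points_or_box(
        state, frame_idx=0, obj_id=1,
        points=np.array([[300, 240]], dtype=np.float32),
        labels=np.array([1], dtype=np.int32),  # 1 = foreground click
    )

    # Propagate the prompt through the rest of the video to get per-frame masks.
    for frame_idx, obj_ids, mask_logits in predictor.propagate_in_video(state):
        masks = (mask_logits > 0.0).cpu().numpy()  # threshold logits to binary
```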