SAM Meets Robotic Surgery: An Empirical Study in Robustness Perspective
- URL: http://arxiv.org/abs/2304.14674v1
- Date: Fri, 28 Apr 2023 08:06:33 GMT
- Title: SAM Meets Robotic Surgery: An Empirical Study in Robustness Perspective
- Authors: An Wang, Mobarakol Islam, Mengya Xu, Yang Zhang, Hongliang Ren
- Abstract summary: Segment Anything Model (SAM) is a foundation model for semantic segmentation.
We investigate the robustness and zero-shot generalizability of SAM in the domain of robotic surgery.
- Score: 21.2080716792596
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Segment Anything Model (SAM) is a foundation model for semantic segmentation
and shows excellent generalization capability with prompts. In this
empirical study, we investigate the robustness and zero-shot generalizability
of SAM in the domain of robotic surgery in various settings of (i) prompted
vs. unprompted; (ii) bounding box vs. point-based prompts; (iii) generalization
under corruptions and perturbations with five severity levels; and (iv)
state-of-the-art supervised models vs. SAM. We conduct all experiments on
two well-known robotic instrument segmentation datasets from the MICCAI EndoVis 2017
and 2018 challenges. Our extensive evaluation results reveal that although SAM
shows remarkable zero-shot generalization ability with bounding box prompts, it
struggles to segment the whole instrument with point-based prompts and in
unprompted settings. Furthermore, our qualitative figures demonstrate that the
model either fails to predict parts of the instrument mask (e.g., jaws, wrist)
or assigns parts of the instrument to different classes when instruments
overlap within the same bounding box or under point-based prompts. It is also
unable to identify instruments in complex surgical scenarios involving blood,
reflection, blur, and shade. Additionally,
SAM is insufficiently robust to maintain high performance when subjected to
various forms of data corruption. Therefore, we can argue that SAM is not ready
for downstream surgical tasks without further domain-specific fine-tuning.
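To make settings (i) and (ii) concrete, below is a minimal sketch of the prompted zero-shot setup using the public `segment_anything` API, together with the IoU metric commonly used for instrument segmentation. The checkpoint filename follows the official SAM release; the image path, prompt coordinates, and ground-truth comparison are illustrative assumptions, not values from the paper.

```python
import cv2
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Load the ViT-H SAM checkpoint (filename from the official release).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

# Hypothetical surgical frame; SAM expects RGB uint8 input.
image = cv2.cvtColor(cv2.imread("endovis_frame.png"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# Setting (ii), bounding-box prompt: one xyxy box around an instrument.
box = np.array([100, 150, 420, 380])
box_masks, box_scores, _ = predictor.predict(box=box, multimask_output=False)

# Setting (ii), point-based prompt: a single foreground click on the instrument.
pt_masks, pt_scores, _ = predictor.predict(
    point_coords=np.array([[260, 265]]),
    point_labels=np.array([1]),  # 1 = foreground, 0 = background
    multimask_output=False,
)

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection-over-Union between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return float(inter) / max(float(union), 1.0)
```

The unprompted setting (i) corresponds to SAM's automatic mask generation (`SamAutomaticMaskGenerator`), which segments everything in the frame without any instrument-specific hint.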
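For the robustness setting (iii), the abstract only states that corruptions and perturbations are applied at five severity levels; a common way to reproduce such a sweep is the ImageNet-C-style `imagecorruptions` package, assumed here for illustration. The sketch reuses `image`, `box`, `predictor`, and `iou` from the block above; the ground-truth mask path is a placeholder.

```python
import numpy as np
from imagecorruptions import corrupt, get_corruption_names

gt_mask = np.load("endovis_gt_mask.npy")  # placeholder: binary ground-truth mask

results = {}
for name in get_corruption_names():    # e.g. gaussian_noise, motion_blur, fog, ...
    for severity in range(1, 6):       # the five severity levels
        corrupted = corrupt(image, corruption_name=name, severity=severity)
        predictor.set_image(corrupted)
        masks, _, _ = predictor.predict(box=box, multimask_output=False)
        results[(name, severity)] = iou(masks[0], gt_mask)

# Averaging results per severity level shows how quickly performance degrades.
```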
Related papers
- Inspiring the Next Generation of Segment Anything Models: Comprehensively Evaluate SAM and SAM 2 with Diverse Prompts Towards Context-Dependent Concepts under Different Scenes [63.966251473172036]
The foundational model SAM has influenced multiple fields within computer vision, and its upgraded version, SAM 2, enhances capabilities in video segmentation.
While SAMs have demonstrated excellent performance in segmenting context-independent concepts like people, cars, and roads, they overlook more challenging context-dependent (CD) concepts, such as visual saliency, camouflage, product defects, and medical lesions.
We conduct a thorough quantitative evaluation of SAMs on 11 CD concepts across 2D and 3D images and videos in various visual modalities within natural, medical, and industrial scenes.
arXiv Detail & Related papers (2024-12-02T08:03:56Z)
- Adapting Segment Anything Model for Unseen Object Instance Segmentation [70.60171342436092]
Unseen Object Instance Segmentation (UOIS) is crucial for autonomous robots operating in unstructured environments.
We propose UOIS-SAM, a data-efficient solution for the UOIS task.
UOIS-SAM integrates two key components: (i) a Heatmap-based Prompt Generator (HPG) to generate class-agnostic point prompts with precise foreground prediction, and (ii) a Hierarchical Discrimination Network (HDNet) that adapts SAM's mask decoder.
arXiv Detail & Related papers (2024-09-23T19:05:50Z)
- SAM 2 in Robotic Surgery: An Empirical Evaluation for Robustness and Generalization in Surgical Video Segmentation [13.609341065893739]
This study explores the zero-shot segmentation performance of SAM 2 in robot-assisted surgery based on prompts.
We employ two forms of prompts, 1-point and bounding box; for video sequences, the 1-point prompt is applied to the initial frame (see the sketch after this list).
Results with point prompts also show a substantial improvement over SAM's capabilities, nearing or even surpassing existing unprompted SOTA methods.
arXiv Detail & Related papers (2024-08-08T17:08:57Z)
- Performance Evaluation of Segment Anything Model with Variational Prompting for Application to Non-Visible Spectrum Imagery [15.748043194987075]
This work assesses the Segment Anything Model's capabilities in segmenting objects of interest in the X-ray/infrared modalities.
Our results show that SAM can segment objects in the X-ray modality when given a box prompt, but its performance varies for point prompts.
We find that infrared objects are also challenging to segment with point prompts given the low-contrast nature of this modality.
arXiv Detail & Related papers (2024-04-18T16:04:14Z)
- SurgicalPart-SAM: Part-to-Whole Collaborative Prompting for Surgical Instrument Segmentation [66.21356751558011]
The Segment Anything Model (SAM) exhibits promise in generic object segmentation and offers potential for various applications.
Existing methods have applied SAM to surgical instrument segmentation (SIS) by tuning SAM-based frameworks with surgical data.
We propose SurgicalPart-SAM (SP-SAM), a novel SAM efficient-tuning approach that explicitly integrates instrument structure knowledge with SAM's generic knowledge.
arXiv Detail & Related papers (2023-12-22T07:17:51Z)
- Boosting Segment Anything Model Towards Open-Vocabulary Learning [69.24734826209367]
Segment Anything Model (SAM) has emerged as a new paradigmatic vision foundation model.
Despite SAM finding applications and adaptations in various domains, its primary limitation lies in the inability to grasp object semantics.
We present Sambor to seamlessly integrate SAM with the open-vocabulary object detector in an end-to-end framework.
arXiv Detail & Related papers (2023-12-06T17:19:00Z)
- SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation [65.52097667738884]
We introduce SurgicalSAM, a novel end-to-end efficient-tuning approach for SAM to integrate surgical-specific information with SAM's pre-trained knowledge for improved generalisation.
Specifically, we propose a lightweight prototype-based class prompt encoder for tuning, which directly generates prompt embeddings from class prototypes.
In addition, to address the low inter-class variance among surgical instrument categories, we propose contrastive prototype learning.
arXiv Detail & Related papers (2023-08-17T02:51:01Z)
- SAM Meets Robotic Surgery: An Empirical Study on Generalization, Robustness and Adaptation [15.995869434429274]
The Segment Anything Model (SAM) serves as a foundation model for semantic segmentation.
We examine SAM's robustness and zero-shot generalizability in the field of robotic surgery.
arXiv Detail & Related papers (2023-08-14T14:09:41Z)
- On the Robustness of Segment Anything [46.669794757467166]
We aim to study the testing-time robustness of SAM under adversarial scenarios and common corruptions.
We find that SAM exhibits remarkable robustness against various corruptions, except for blur-related corruption.
arXiv Detail & Related papers (2023-05-25T16:28:30Z)
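Regarding the SAM 2 evaluation above, which applies a 1-point prompt to the initial frame of each video sequence, the public `sam2` video predictor exposes roughly the following workflow. The config/checkpoint names follow the official SAM 2 release, while the frame directory, click location, and object id are illustrative assumptions.

```python
import numpy as np
import torch
from sam2.build_sam import build_sam2_video_predictor

predictor = build_sam2_video_predictor("sam2_hiera_l.yaml", "sam2_hiera_large.pt")

with torch.inference_mode():
    # Placeholder: a directory of extracted video frames.
    state = predictor.init_state(video_path="surgery_clip_frames/")

    # 1-point prompt on the initial frame (frame_idx=0) for one instrument.
    predictor.add_new_points_or_box(
        state, frame_idx=0, obj_id=1,
        points=np.array([[300, 240]], dtype=np.float32),
        labels=np.array([1], dtype=np.int32),  # 1 = foreground click
    )

    # Propagate the prompt through the rest of the video to get per-frame masks.
    for frame_idx, obj_ids, mask_logits in predictor.propagate_in_video(state):
        masks = (mask_logits > 0.0).cpu().numpy()  # threshold logits to binary
```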