Zero-Shot Surgical Tool Segmentation in Monocular Video Using Segment Anything Model 2
- URL: http://arxiv.org/abs/2408.01648v1
- Date: Sat, 3 Aug 2024 03:19:56 GMT
- Title: Zero-Shot Surgical Tool Segmentation in Monocular Video Using Segment Anything Model 2
- Authors: Ange Lou, Yamin Li, Yike Zhang, Robert F. Labadie, Jack Noble
- Abstract summary: The Segment Anything Model 2 (SAM 2) is the latest generation foundation model for image and video segmentation.
We evaluate the zero-shot video segmentation performance of the SAM 2 model across different types of surgeries, including endoscopy and microscopy.
We found that: 1) SAM 2 demonstrates a strong capability for segmenting various surgical videos; 2) When new tools enter the scene, additional prompts are necessary to maintain segmentation accuracy; and 3) Specific challenges inherent to surgical videos can impact the robustness of SAM 2.
- Score: 4.418542191434178
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Segment Anything Model 2 (SAM 2) is the latest generation foundation model for image and video segmentation. Trained on the expansive Segment Anything Video (SA-V) dataset, which comprises 35.5 million masks across 50.9K videos, SAM 2 advances its predecessor's capabilities by supporting zero-shot segmentation through various prompts (e.g., points, boxes, and masks). Its robust zero-shot performance and efficient memory usage make SAM 2 particularly appealing for surgical tool segmentation in videos, especially given the scarcity of labeled data and the diversity of surgical procedures. In this study, we evaluate the zero-shot video segmentation performance of the SAM 2 model across different types of surgeries, including endoscopy and microscopy. We also assess its performance on videos featuring single and multiple tools of varying lengths to demonstrate SAM 2's applicability and effectiveness in the surgical domain. We found that: 1) SAM 2 demonstrates a strong capability for segmenting various surgical videos; 2) When new tools enter the scene, additional prompts are necessary to maintain segmentation accuracy; and 3) Specific challenges inherent to surgical videos can impact the robustness of SAM 2.
Related papers
- DB-SAM: Delving into High Quality Universal Medical Image Segmentation [100.63434169944853]
We propose a dual-branch adapted SAM framework, named DB-SAM, to bridge the gap between natural and 2D/3D medical data.
Our proposed DB-SAM achieves an absolute gain of 8.8%, compared to a recent medical SAM adapter in the literature.
arXiv Detail & Related papers (2024-10-05T14:36:43Z)
- SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation [51.90445260276897]
We prove that the Segment Anything Model 2 (SAM2) can be a strong encoder for U-shaped segmentation models.
We propose a simple but effective framework, termed SAM2-UNet, for versatile image segmentation.
arXiv Detail & Related papers (2024-08-16T17:55:38Z)
- Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning [13.90996725220123]
We introduce Surgical SAM 2 (SurgSAM-2), an advanced model to utilize SAM2 with an Efficient Frame Pruning mechanism.
SurgSAM-2 significantly improves both efficiency and segmentation accuracy compared to the vanilla SAM2.
Remarkably, SurgSAM-2 achieves a 3$\times$ FPS compared with SAM2, while also delivering state-of-the-art performance after fine-tuning with lower-resolution data.
arXiv Detail & Related papers (2024-08-15T04:59:12Z)
- Novel adaptation of video segmentation to 3D MRI: efficient zero-shot knee segmentation with SAM2 [1.6237741047782823]
We introduce a method for zero-shot, single-prompt segmentation of 3D knee MRI by adapting Segment Anything Model 2.
By treating slices from 3D medical volumes as individual video frames, we leverage SAM2's advanced capabilities to generate motion- and spatially-aware predictions.
We demonstrate that SAM2 can efficiently perform segmentation tasks in a zero-shot manner with no additional training or fine-tuning.
arXiv Detail & Related papers (2024-08-08T21:39:15Z)
- SAM 2 in Robotic Surgery: An Empirical Evaluation for Robustness and Generalization in Surgical Video Segmentation [13.609341065893739]
This study explores the zero-shot segmentation performance of SAM 2 in robot-assisted surgery based on prompts.
We employ two forms of prompts: 1-point and bounding box, while for video sequences, the 1-point prompt is applied to the initial frame.
The results with point prompts also exhibit a substantial enhancement over SAM's capabilities, nearing or even surpassing existing unprompted SOTA methods.
arXiv Detail & Related papers (2024-08-08T17:08:57Z)
- Segment Anything in Medical Images and Videos: Benchmark and Deployment [8.51742337818826]
We first present a comprehensive benchmarking of the Segment Anything Model 2 (SAM2) across 11 medical image modalities and videos.
Then, we develop a transfer learning pipeline and demonstrate SAM2 can be quickly adapted to medical domain by fine-tuning.
We implement SAM2 as a 3D slicer plugin and Gradio API for efficient 3D image and video segmentation.
arXiv Detail & Related papers (2024-08-06T17:58:18Z)
- Segment anything model 2: an application to 2D and 3D medical images [16.253160684182895]
Segment Anything Model (SAM) has gained significant attention because of its ability to segment various objects in images given a prompt.
Recently developed SAM 2 has extended this ability to video inputs.
This opens an opportunity to apply SAM to 3D images, one of the fundamental tasks in the medical imaging field.
arXiv Detail & Related papers (2024-08-01T17:57:25Z)
- SAM 2: Segment Anything in Images and Videos [63.44869623822368]
We present Segment Anything Model 2 (SAM 2), a foundation model towards solving promptable visual segmentation in images and videos.
We build a data engine, which improves model and data via user interaction, to collect the largest video segmentation dataset to date.
Our model is a simple transformer architecture with streaming memory for real-time video processing.
arXiv Detail & Related papers (2024-08-01T17:00:08Z)
- MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image Segmentation [58.53672866662472]
We introduce a modality-agnostic SAM adaptation framework, named as MA-SAM.
Our method roots in the parameter-efficient fine-tuning strategy to update only a small portion of weight increments.
By injecting a series of 3D adapters into the transformer blocks of the image encoder, our method enables the pre-trained 2D backbone to extract third-dimensional information from input data.
arXiv Detail & Related papers (2023-09-16T02:41:53Z)
- SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation [65.52097667738884]
We introduce SurgicalSAM, a novel end-to-end efficient-tuning approach for SAM to integrate surgical-specific information with SAM's pre-trained knowledge for improved generalisation.
Specifically, we propose a lightweight prototype-based class prompt encoder for tuning, which directly generates prompt embeddings from class prototypes.
In addition, to address the low inter-class variance among surgical instrument categories, we propose contrastive prototype learning.
arXiv Detail & Related papers (2023-08-17T02:51:01Z)
- Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation [51.770805270588625]
The Segment Anything Model (SAM) has recently gained popularity in the field of image segmentation.
Recent studies and individual experiments have shown that SAM underperforms in medical image segmentation.
We propose the Medical SAM Adapter (Med-SA), which incorporates domain-specific medical knowledge into the segmentation model.
arXiv Detail & Related papers (2023-04-25T07:34:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.