Revisiting Surgical Instrument Segmentation Without Human Intervention: A Graph Partitioning View
- URL: http://arxiv.org/abs/2408.14789v3
- Date: Thu, 7 Nov 2024 02:43:03 GMT
- Title: Revisiting Surgical Instrument Segmentation Without Human Intervention: A Graph Partitioning View
- Authors: Mingyu Sheng, Jianan Fan, Dongnan Liu, Ron Kikinis, Weidong Cai
- Abstract summary: We propose an unsupervised method that reframes video frame segmentation as a graph partitioning problem.
A self-supervised pre-trained model is first leveraged as a feature extractor to capture high-level semantic features.
On the "deep" eigenvectors, a surgical video frame is meaningfully segmented into different modules such as tools and tissues, providing distinguishable semantic information.
- Score: 7.594796294925481
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Surgical instrument segmentation (SIS) on endoscopic images stands as a long-standing and essential task in the context of computer-assisted interventions for boosting minimally invasive surgery. Given the recent surge of deep learning methodologies and their data-hungry nature, training a neural predictive model on massive expert-curated annotations has been the dominant, off-the-shelf approach in the field, which could, however, impose a prohibitive burden on clinicians for preparing fine-grained pixel-wise labels corresponding to the collected surgical video frames. In this work, we propose an unsupervised method by reframing the video frame segmentation as a graph partitioning problem and regarding image pixels as graph nodes, which is significantly different from previous efforts. A self-supervised pre-trained model is first leveraged as a feature extractor to capture high-level semantic features. Then, Laplacian matrices are computed from the features and eigendecomposed for graph partitioning. On the "deep" eigenvectors, a surgical video frame is meaningfully segmented into different modules such as tools and tissues, providing distinguishable semantic information like locations, classes, and relations. The segmentation problem can then be naturally tackled by applying clustering or thresholding to the eigenvectors. Extensive experiments are conducted on various datasets (e.g., EndoVis2017, EndoVis2018, UCL, etc.) for different clinical endpoints. Across all the challenging scenarios, our method demonstrates outstanding performance and robustness, exceeding unsupervised state-of-the-art (SOTA) methods. The code is released at https://github.com/MingyuShengSMY/GraphClusteringSIS.git.
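The pipeline described above (self-supervised features, an affinity graph over image patches, Laplacian eigenvectors, then clustering or thresholding) can be illustrated with a minimal sketch. The specific choices below are assumptions rather than the authors' implementation: a DINO ViT-S/16 backbone stands in for the unspecified self-supervised extractor, cosine similarity defines the affinities, and k-means groups the leading eigenvectors; `frame.png` is a placeholder input. The released repository contains the actual code.

```python
# Minimal sketch of the graph-partitioning view of SIS (assumptions noted above).
import numpy as np
import torch
from PIL import Image
from torchvision import transforms
from scipy.linalg import eigh
from sklearn.cluster import KMeans

# 1) Self-supervised feature extractor (hypothetical choice: DINO ViT-S/16).
model = torch.hub.load("facebookresearch/dino:main", "dino_vits16").eval()
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)),
])
frame = preprocess(Image.open("frame.png").convert("RGB")).unsqueeze(0)

# 2) Patch tokens act as graph nodes (the CLS token is dropped).
with torch.no_grad():
    tokens = model.get_intermediate_layers(frame, n=1)[0][0, 1:]  # (N, D)
feats = torch.nn.functional.normalize(tokens, dim=-1).numpy()

# 3) Affinity matrix from cosine similarity, clipped at zero.
W = np.clip(feats @ feats.T, 0, None)

# 4) Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}, eigendecomposed.
d = W.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d + 1e-8))
L = np.eye(len(W)) - D_inv_sqrt @ W @ D_inv_sqrt
eigvals, eigvecs = eigh(L)  # "deep" eigenvectors in ascending order

# 5) Partition the nodes: k-means on the first K non-trivial eigenvectors.
K = 3  # e.g. tools, tissue, background
labels = KMeans(n_clusters=K, n_init=10).fit_predict(eigvecs[:, 1:K + 1])

# 6) Patch labels back to a coarse 14x14 grid (ViT-S/16 at 224x224 input);
#    upsample to the original frame resolution as needed.
print(labels.reshape(14, 14))
```

For a binary tool/background split, thresholding the Fiedler vector `eigvecs[:, 1]` (for example at zero or at its median) is the clustering-free alternative mentioned in the abstract.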
Related papers
- UnSegMedGAT: Unsupervised Medical Image Segmentation using Graph Attention Networks Clustering [10.862430265350804]
We propose an unsupervised segmentation framework using a pre-trained Dino-ViT.
We leverage the inherent graph structure within the image to realize a significant performance gain for segmentation in medical images.
Our method achieves state-of-the-art performance, even significantly surpassing or matching that of existing (semi-)supervised techniques such as MedSAM.
arXiv Detail & Related papers (2024-11-04T10:42:21Z)
- UnSeGArmaNet: Unsupervised Image Segmentation using Graph Neural Networks with Convolutional ARMA Filters [10.940349832919699]
We propose an unsupervised segmentation framework with a pre-trained ViT.
By harnessing the graph structure inherent within the image, the proposed method achieves a notable performance in segmentation.
The proposed method provides state-of-the-art performance (even comparable to supervised methods) on benchmark image segmentation datasets.
arXiv Detail & Related papers (2024-10-08T15:10:09Z)
- DiffCut: Catalyzing Zero-Shot Semantic Segmentation with Diffusion Features and Recursive Normalized Cut [62.63481844384229]
Foundation models have emerged as powerful tools across various domains including language, vision, and multimodal tasks.
In this paper, we use a diffusion UNet encoder as a foundation vision encoder and introduce DiffCut, an unsupervised zero-shot segmentation method.
Our work highlights the remarkably accurate semantic knowledge embedded within diffusion UNet encoders that could then serve as foundation vision encoders for downstream tasks.
arXiv Detail & Related papers (2024-06-05T01:32:31Z)
- UnSegGNet: Unsupervised Image Segmentation using Graph Neural Networks [9.268228808049951]
This research contributes to the broader field of unsupervised medical imaging and computer vision.
It presents an innovative methodology for image segmentation that aligns with real-world challenges.
The proposed method holds promise for diverse applications, including medical imaging, remote sensing, and object recognition.
arXiv Detail & Related papers (2024-05-09T19:02:00Z)
- Self-Supervised Correction Learning for Semi-Supervised Biomedical Image Segmentation [84.58210297703714]
We propose a self-supervised correction learning paradigm for semi-supervised biomedical image segmentation.
We design a dual-task network, including a shared encoder and two independent decoders for segmentation and lesion region inpainting.
Experiments on three medical image segmentation datasets for different tasks demonstrate the outstanding performance of our method.
arXiv Detail & Related papers (2023-01-12T08:19:46Z)
- CUTS: A Deep Learning and Topological Framework for Multigranular Unsupervised Medical Image Segmentation [8.307551496968156]
We present CUTS, an unsupervised deep learning framework for medical image segmentation.
For each image, it produces an embedding map via intra-image contrastive learning and local patch reconstruction.
CUTS yields a series of coarse-to-fine-grained segmentations that highlight features at various granularities.
arXiv Detail & Related papers (2022-09-23T01:09:06Z)
- Pseudo-label Guided Cross-video Pixel Contrast for Robotic Surgical Scene Segmentation with Limited Annotations [72.15956198507281]
We propose PGV-CL, a novel pseudo-label guided cross-video contrastive learning method to boost scene segmentation.
We extensively evaluate our method on a public robotic surgery dataset EndoVis18 and a public cataract dataset CaDIS.
arXiv Detail & Related papers (2022-07-20T05:42:19Z)
- Min-Max Similarity: A Contrastive Learning Based Semi-Supervised Learning Network for Surgical Tools Segmentation [0.0]
We propose a semi-supervised segmentation network based on contrastive learning.
In contrast to the previous state-of-the-art, we introduce a contrastive learning form of dual-view training.
Our proposed method outperforms state-of-the-art semi-supervised and fully supervised segmentation algorithms consistently.
arXiv Detail & Related papers (2022-03-29T01:40:26Z)
- Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation [96.67525775629444]
Action segmentation refers to inferring boundaries of semantically consistent visual concepts in videos.
We present a fully automatic and unsupervised approach for segmenting actions in a video that does not require any training.
Our proposal is an effective temporally-weighted hierarchical clustering algorithm that can group semantically consistent frames of the video.
arXiv Detail & Related papers (2021-03-20T23:30:01Z)
- Co-Generation and Segmentation for Generalized Surgical Instrument Segmentation on Unlabelled Data [49.419268399590045]
Surgical instrument segmentation for robot-assisted surgery is needed for accurate instrument tracking and augmented reality overlays.
Deep learning-based methods have shown state-of-the-art performance for surgical instrument segmentation, but their results depend on labelled data.
In this paper, we demonstrate the limited generalizability of these methods on different datasets, including human robot-assisted surgeries.
arXiv Detail & Related papers (2021-03-16T18:41:18Z)
- Towards Unsupervised Learning for Instrument Segmentation in Robotic Surgery with Cycle-Consistent Adversarial Networks [54.00217496410142]
We propose an unpaired image-to-image translation approach whose goal is to learn the mapping between an input endoscopic image and a corresponding annotation.
Our approach allows training image segmentation models without the need to acquire expensive annotations.
We test our proposed method on the EndoVis 2017 challenge dataset and show that it is competitive with supervised segmentation methods.
arXiv Detail & Related papers (2020-07-09T01:39:39Z)
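As a rough illustration of the unpaired, cycle-consistent idea in the last entry, the sketch below shows only the generic CycleGAN-style generator objective for translating between endoscopic images and instrument masks. The tiny networks, the loss weighting, and the dummy batches are toy assumptions, not the architecture or training scheme of the cited paper.

```python
# Generic cycle-consistency objective for unpaired image <-> mask translation.
import torch
import torch.nn as nn

def tiny_generator(in_ch, out_ch):
    # Placeholder generator: a shallow conv stack instead of a real U-Net.
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, out_ch, 3, padding=1),
    )

def tiny_discriminator(in_ch):
    # Placeholder PatchGAN-style critic.
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 4, stride=2), nn.LeakyReLU(0.2),
        nn.Conv2d(32, 1, 4, stride=2),
    )

G_img2seg = tiny_generator(3, 1)   # endoscopic image -> instrument mask
G_seg2img = tiny_generator(1, 3)   # instrument mask -> endoscopic image
D_seg, D_img = tiny_discriminator(1), tiny_discriminator(3)
l1, bce = nn.L1Loss(), nn.BCEWithLogitsLoss()

def generator_loss(real_img, unpaired_mask, lam=10.0):
    """Adversarial + cycle-consistency terms for one unpaired batch."""
    fake_mask = G_img2seg(real_img)
    fake_img = G_seg2img(unpaired_mask)
    # Adversarial terms: fool both discriminators.
    pred_mask, pred_img = D_seg(fake_mask), D_img(fake_img)
    adv = bce(pred_mask, torch.ones_like(pred_mask)) + bce(pred_img, torch.ones_like(pred_img))
    # Cycle terms: translating forth and back should reconstruct the input.
    cyc = l1(G_seg2img(fake_mask), real_img) + l1(G_img2seg(fake_img), unpaired_mask)
    return adv + lam * cyc

# Dummy usage; real training alternates generator and discriminator updates.
img = torch.rand(2, 3, 64, 64)
mask = (torch.rand(2, 1, 64, 64) > 0.5).float()
print(generator_loss(img, mask).item())
```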
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.