From Forks to Forceps: A New Framework for Instance Segmentation of
Surgical Instruments
- URL: http://arxiv.org/abs/2211.16200v1
- Date: Sat, 26 Nov 2022 21:26:42 GMT
- Title: From Forks to Forceps: A New Framework for Instance Segmentation of
Surgical Instruments
- Authors: Britty Baby, Daksh Thapar, Mustafa Chasmai, Tamajit Banerjee, Kunal
Dargan, Ashish Suri, Subhashis Banerjee, Chetan Arora
- Abstract summary: Minimally invasive surgeries and related applications demand surgical tool classification and segmentation at the instance level.
Our research demonstrates that while the bounding box and segmentation mask are often accurate, the classification head misclassifies the class label of the surgical instrument.
We present a new neural network framework that adds a classification module as a new stage to existing instance segmentation models.
- Score: 6.677634562400846
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Minimally invasive surgeries and related applications demand surgical tool
classification and segmentation at the instance level. Surgical tools are
similar in appearance and are long, thin, and handled at an angle. The
fine-tuning of state-of-the-art (SOTA) instance segmentation models trained on
natural images for instrument segmentation has difficulty discriminating
instrument classes. Our research demonstrates that while the bounding box and
segmentation mask are often accurate, the classification head misclassifies
the class label of the surgical instrument. We present a new neural network
framework that adds a classification module as a new stage to existing instance
segmentation models. This module specializes in improving the classification of
instrument masks generated by the existing model. The module comprises
multi-scale mask attention, which attends to the instrument region and masks
the distracting background features. We propose training our classifier module
using metric learning with arc loss to handle low inter-class variance of
surgical instruments. We conduct exhaustive experiments on the benchmark
datasets EndoVis2017 and EndoVis2018. We demonstrate that our method
outperforms all of the more than 18 SOTA methods it is compared with, improves
the SOTA performance by at least 12 points (20%) on the EndoVis2017 benchmark
challenge, and generalizes effectively across the datasets.
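The abstract names two concrete mechanisms: multi-scale mask attention that suppresses background features, and a classifier trained by metric learning with an arc (angular-margin) loss. The paper's own code is not reproduced here; what follows is a minimal PyTorch-style sketch of those two ideas under stated assumptions, where every name (MaskAttentionClassifier, arc_margin_loss, the feature shapes) is illustrative rather than taken from the paper.

```python
# Sketch (not the authors' code): a classification stage that re-labels
# instrument masks produced by an existing instance segmentation model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskAttentionClassifier(nn.Module):
    """Multi-scale mask attention: backbone features at each scale are
    multiplied by the (resized) instance mask, so distracting background
    features are zeroed out before pooling."""
    def __init__(self, feat_channels, embed_dim, num_classes):
        super().__init__()
        self.proj = nn.ModuleList(
            [nn.Conv2d(c, embed_dim, kernel_size=1) for c in feat_channels]
        )
        # Class-prototype weights for the angular-margin head (no bias).
        self.weight = nn.Parameter(torch.randn(num_classes, embed_dim))

    def forward(self, feats, mask):
        # feats: list of [B, C_i, H_i, W_i]; mask: [B, 1, H, W], float in {0, 1}
        pooled = []
        for f, proj in zip(feats, self.proj):
            m = F.interpolate(mask, size=f.shape[-2:], mode="nearest")
            f = proj(f) * m  # attend to the instrument region only
            pooled.append(f.sum(dim=(2, 3)) / m.sum(dim=(2, 3)).clamp(min=1.0))
        emb = F.normalize(torch.stack(pooled).mean(dim=0), dim=1)
        # Cosine similarities against the normalized class prototypes.
        return emb @ F.normalize(self.weight, dim=1).t()

def arc_margin_loss(cos, labels, margin=0.5, scale=30.0):
    """ArcFace-style metric learning: an angular margin is added to the
    target class before the softmax, pushing apart embeddings of visually
    similar (low inter-class variance) instruments."""
    theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
    target = F.one_hot(labels, cos.size(1)).bool()
    logits = scale * torch.where(target, torch.cos(theta + margin), cos)
    return F.cross_entropy(logits, labels)
```

Under this reading, the base segmentation model is left untouched and only the new stage is trained to correct class labels, consistent with the abstract's observation that the boxes and masks are already reliable.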
Related papers
- Queryable Prototype Multiple Instance Learning with Vision-Language Models for Incremental Whole Slide Image Classification [10.667645628712542]
This paper proposes the first Vision-Language-based framework with Queryable Prototype Multiple Instance Learning (QPMIL-VL) specially designed for incremental WSI classification.
Experiments on four TCGA datasets demonstrate that our QPMIL-VL framework is effective for incremental WSI classification.
arXiv Detail & Related papers (2024-10-14T14:49:34Z)
- PosSAM: Panoptic Open-vocabulary Segment Anything [58.72494640363136]
PosSAM is an open-vocabulary panoptic segmentation model that unifies the strengths of the Segment Anything Model (SAM) with the vision-native CLIP model in an end-to-end framework.
We introduce a Mask-Aware Selective Ensembling (MASE) algorithm that adaptively enhances the quality of generated masks and boosts the performance of open-vocabulary classification during inference for each image.
arXiv Detail & Related papers (2024-03-14T17:55:03Z)
- SAF-IS: a Spatial Annotation Free Framework for Instance Segmentation of Surgical Tools [10.295921059528636]
We develop a framework for instance segmentation that does not rely on spatial annotations for training.
Our solution only requires binary tool masks, obtainable using recent unsupervised approaches, and binary tool presence labels.
We validate our framework on the EndoVis 2017 and 2018 segmentation datasets.
arXiv Detail & Related papers (2023-09-04T17:13:06Z)
- SurgicalSAM: Efficient Class Promptable Surgical Instrument Segmentation [65.52097667738884]
We introduce SurgicalSAM, a novel end-to-end efficient-tuning approach for SAM to integrate surgical-specific information with SAM's pre-trained knowledge for improved generalisation.
Specifically, we propose a lightweight prototype-based class prompt encoder for tuning, which directly generates prompt embeddings from class prototypes.
In addition, to address the low inter-class variance among surgical instrument categories, we propose contrastive prototype learning.
arXiv Detail & Related papers (2023-08-17T02:51:01Z)
- SegMatch: A semi-supervised learning method for surgical instrument segmentation [10.223709180135419]
We propose SegMatch, a semi-supervised learning method to reduce the need for expensive annotation for laparoscopic and robotic surgical images.
SegMatch builds on FixMatch, a widespread semi-supervised classification pipeline combining consistency regularization and pseudo-labelling (a minimal sketch of this recipe appears after this list).
Our results demonstrate that adding unlabelled data for training purposes allows us to surpass the performance of fully supervised approaches.
arXiv Detail & Related papers (2023-08-09T21:30:18Z)
- Domain Adaptive Nuclei Instance Segmentation and Classification via Category-aware Feature Alignment and Pseudo-labelling [65.40672505658213]
We propose a novel deep neural network, namely Category-Aware feature alignment and Pseudo-Labelling Network (CAPL-Net) for UDA nuclei instance segmentation and classification.
Our approach outperforms state-of-the-art UDA methods by a remarkable margin.
arXiv Detail & Related papers (2022-07-04T07:05:06Z)
- TraSeTR: Track-to-Segment Transformer with Contrastive Query for Instance-level Instrument Segmentation in Robotic Surgery [60.439434751619736]
We propose TraSeTR, a Track-to-Segment Transformer that exploits tracking cues to assist surgical instrument segmentation.
TraSeTR jointly reasons about the instrument type, location, and identity with instance-level predictions.
The effectiveness of our method is demonstrated with state-of-the-art instrument type segmentation results on three public datasets.
arXiv Detail & Related papers (2022-02-17T05:52:18Z)
- FUN-SIS: a Fully UNsupervised approach for Surgical Instrument Segmentation [16.881624842773604]
We present FUN-SIS, a Fully UNsupervised approach for binary Surgical Instrument Segmentation.
We train a per-frame segmentation model on completely unlabelled endoscopic videos by relying on implicit motion information and instrument shape-priors.
The obtained fully-unsupervised results for surgical instrument segmentation are almost on par with the ones of fully-supervised state-of-the-art approaches.
arXiv Detail & Related papers (2022-02-16T15:32:02Z)
- Novel Class Discovery in Semantic Segmentation [104.30729847367104]
We introduce a new setting of Novel Class Discovery in Semantic Segmentation (NCDSS).
It aims at segmenting unlabeled images containing new classes given prior knowledge from a labeled set of disjoint classes.
In NCDSS, we need to distinguish the objects and background, and to handle the existence of multiple classes within an image.
We propose the Entropy-based Uncertainty Modeling and Self-training (EUMS) framework to overcome noisy pseudo-labels.
arXiv Detail & Related papers (2021-12-03T13:31:59Z)
- ISINet: An Instance-Based Approach for Surgical Instrument Segmentation [0.0]
We study the task of semantic segmentation of surgical instruments in robotic-assisted surgery scenes.
We propose ISINet, a method that addresses this task from an instance-based segmentation perspective.
Our results show that ISINet significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2020-07-10T16:20:56Z)
- UniT: Unified Knowledge Transfer for Any-shot Object Detection and Segmentation [52.487469544343305]
Methods for object detection and segmentation rely on large-scale instance-level annotations for training.
We propose an intuitive and unified semi-supervised model that is applicable to a range of supervision levels.
arXiv Detail & Related papers (2020-06-12T22:45:47Z)
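As referenced in the SegMatch entry above, the following is a hypothetical, minimal rendering of the FixMatch recipe adapted to per-pixel prediction: confident pseudo-labels from a weakly augmented view supervise the prediction on a strongly augmented view of the same frame. The function name and the 0.95 threshold are illustrative assumptions, and the sketch assumes the strong augmentation is photometric so pixels stay aligned across the two views.

```python
# Hypothetical FixMatch-style update on unlabelled surgical frames:
# consistency regularization + per-pixel pseudo-labelling.
import torch
import torch.nn.functional as F

def unlabelled_loss(model, weak_img, strong_img, threshold=0.95):
    # Pseudo-labels come from the weakly augmented view, without gradients.
    with torch.no_grad():
        probs = torch.softmax(model(weak_img), dim=1)   # [B, C, H, W]
        conf, pseudo = probs.max(dim=1)                 # per-pixel confidence/class
    # The strongly augmented view is trained to match the pseudo-labels.
    loss = F.cross_entropy(model(strong_img), pseudo, reduction="none")
    keep = (conf >= threshold).float()                  # drop uncertain pixels
    return (loss * keep).sum() / keep.sum().clamp(min=1.0)
```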
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.