Related papers: Real-Time Instrument Segmentation in Robotic Surgery using Auxiliary Supervised Deep Adversarial Learning

Real-Time Instrument Segmentation in Robotic Surgery using Auxiliary Supervised Deep Adversarial Learning

URL: http://arxiv.org/abs/2007.11319v2
Date: Wed, 30 Sep 2020 09:07:50 GMT
Title: Real-Time Instrument Segmentation in Robotic Surgery using Auxiliary Supervised Deep Adversarial Learning
Authors: Mobarakol Islam, Daniel A. Atputharuban, Ravikiran Ramesh, Hongliang Ren
Abstract summary: Real-time semantic segmentation of the robotic instruments and tissues is a crucial step in robot-assisted surgery. We have developed a light-weight cascaded convolutional neural network (CNN) to segment the surgical instruments from high-resolution videos. We show that our model surpasses existing algorithms for pixel-wise segmentation of surgical instruments in both prediction accuracy and segmentation time of high-resolution videos.
Score: 15.490603884631764
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Robot-assisted surgery is an emerging technology which has undergone rapid growth with the development of robotics and imaging systems. Innovations in vision, haptics and accurate movements of robot arms have enabled surgeons to perform precise minimally invasive surgeries. Real-time semantic segmentation of the robotic instruments and tissues is a crucial step in robot-assisted surgery. Accurate and efficient segmentation of the surgical scene not only aids in the identification and tracking of instruments but also provided contextual information about the different tissues and instruments being operated with. For this purpose, we have developed a light-weight cascaded convolutional neural network (CNN) to segment the surgical instruments from high-resolution videos obtained from a commercial robotic system. We propose a multi-resolution feature fusion module (MFF) to fuse the feature maps of different dimensions and channels from the auxiliary and main branch. We also introduce a novel way of combining auxiliary loss and adversarial loss to regularize the segmentation model. Auxiliary loss helps the model to learn low-resolution features, and adversarial loss improves the segmentation prediction by learning higher order structural information. The model also consists of a light-weight spatial pyramid pooling (SPP) unit to aggregate rich contextual information in the intermediate stage. We show that our model surpasses existing algorithms for pixel-wise segmentation of surgical instruments in both prediction accuracy and segmentation time of high-resolution videos.

Related papers

CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers [66.15847237150909]
We introduce a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images. The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism. We validated our model on a test dataset, consisting of unseen synthetic data and images collected from silicon aorta phantoms.
arXiv Detail & Related papers (2024-03-21T15:13:36Z)
Visual-Kinematics Graph Learning for Procedure-agnostic Instrument Tip Segmentation in Robotic Surgeries [29.201385352740555]
We propose a novel visual-kinematics graph learning framework to accurately segment the instrument tip given various surgical procedures. Specifically, a graph learning framework is proposed to encode relational features of instrument parts from both image and kinematics. A cross-modal contrastive loss is designed to incorporate robust geometric prior from kinematics to image for tip segmentation.
arXiv Detail & Related papers (2023-09-02T14:52:58Z)
Video-Instrument Synergistic Network for Referring Video Instrument Segmentation in Robotic Surgery [29.72271827272853]
This work explores a new task of Referring Surgical Video Instrument (RSVIS) It aims to automatically identify and segment the corresponding surgical instruments based on the given language expression. We devise a novel Video-Instrument Synergistic Network (VIS-Net) to learn both video-level and instrument-level knowledge to boost performance.
arXiv Detail & Related papers (2023-08-18T11:24:06Z)
GLSFormer : Gated - Long, Short Sequence Transformer for Step Recognition in Surgical Videos [57.93194315839009]
We propose a vision transformer-based approach to learn temporal features directly from sequence-level patches. We extensively evaluate our approach on two cataract surgery video datasets, Cataract-101 and D99, and demonstrate superior performance compared to various state-of-the-art methods.
arXiv Detail & Related papers (2023-07-20T17:57:04Z)
Robotic Navigation Autonomy for Subretinal Injection via Intelligent Real-Time Virtual iOCT Volume Slicing [88.99939660183881]
We propose a framework for autonomous robotic navigation for subretinal injection. Our method consists of an instrument pose estimation method, an online registration between the robotic and the i OCT system, and trajectory planning tailored for navigation to an injection target. Our experiments on ex-vivo porcine eyes demonstrate the precision and repeatability of the method.
arXiv Detail & Related papers (2023-01-17T21:41:21Z)
One to Many: Adaptive Instrument Segmentation via Meta Learning and Dynamic Online Adaptation in Robotic Surgical Video [71.43912903508765]
MDAL is a dynamic online adaptive learning scheme for instrument segmentation in robot-assisted surgery. It learns the general knowledge of instruments and the fast adaptation ability through the video-specific meta-learning paradigm. It outperforms other state-of-the-art methods on two datasets.
arXiv Detail & Related papers (2021-03-24T05:02:18Z)
Relational Graph Learning on Visual and Kinematics Embeddings for Accurate Gesture Recognition in Robotic Surgery [84.73764603474413]
We propose a novel online approach of multi-modal graph network (i.e., MRG-Net) to dynamically integrate visual and kinematics information. The effectiveness of our method is demonstrated with state-of-the-art results on the public JIGSAWS dataset.
arXiv Detail & Related papers (2020-11-03T11:00:10Z)
Synthetic and Real Inputs for Tool Segmentation in Robotic Surgery [10.562627972607892]
We show that it may be possible to use robot kinematic data coupled with laparoscopic images to alleviate the labelling problem. We propose a new deep learning based model for parallel processing of both laparoscopic and simulation images.
arXiv Detail & Related papers (2020-07-17T16:33:33Z)
Searching for Efficient Architecture for Instrument Segmentation in Robotic Surgery [58.63306322525082]
Most applications rely on accurate real-time segmentation of high-resolution surgical images. We design a light-weight and highly-efficient deep residual architecture which is tuned to perform real-time inference of high-resolution images.
arXiv Detail & Related papers (2020-07-08T21:38:29Z)
AP-MTL: Attention Pruned Multi-task Learning Model for Real-time Instrument Detection and Segmentation in Robot-assisted Surgery [23.33984309289549]
Training a real-time robotic system for the detection and segmentation of high-resolution images provides a challenging problem with the limited computational resource. We develop a novel end-to-end trainable real-time Multi-Task Learning model with weight-shared encoder and task-aware detection and segmentation decoders. Our model significantly outperforms state-of-the-art segmentation and detection models, including best-performed models in the challenge.
arXiv Detail & Related papers (2020-03-10T14:24:51Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.