Real-Time Instrument Segmentation in Robotic Surgery using Auxiliary
Supervised Deep Adversarial Learning
- URL: http://arxiv.org/abs/2007.11319v2
- Date: Wed, 30 Sep 2020 09:07:50 GMT
- Title: Real-Time Instrument Segmentation in Robotic Surgery using Auxiliary
Supervised Deep Adversarial Learning
- Authors: Mobarakol Islam, Daniel A. Atputharuban, Ravikiran Ramesh, Hongliang
Ren
- Abstract summary: Real-time semantic segmentation of the robotic instruments and tissues is a crucial step in robot-assisted surgery.
We have developed a light-weight cascaded convolutional neural network (CNN) to segment the surgical instruments from high-resolution videos.
We show that our model surpasses existing algorithms for pixel-wise segmentation of surgical instruments in both prediction accuracy and segmentation time of high-resolution videos.
- Score: 15.490603884631764
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robot-assisted surgery is an emerging technology which has undergone rapid
growth with the development of robotics and imaging systems. Innovations in
vision, haptics and accurate movements of robot arms have enabled surgeons to
perform precise minimally invasive surgeries. Real-time semantic segmentation
of the robotic instruments and tissues is a crucial step in robot-assisted
surgery. Accurate and efficient segmentation of the surgical scene not only
aids in the identification and tracking of instruments but also provided
contextual information about the different tissues and instruments being
operated with. For this purpose, we have developed a light-weight cascaded
convolutional neural network (CNN) to segment the surgical instruments from
high-resolution videos obtained from a commercial robotic system. We propose a
multi-resolution feature fusion module (MFF) to fuse the feature maps of
different dimensions and channels from the auxiliary and main branch. We also
introduce a novel way of combining auxiliary loss and adversarial loss to
regularize the segmentation model. Auxiliary loss helps the model to learn
low-resolution features, and adversarial loss improves the segmentation
prediction by learning higher order structural information. The model also
consists of a light-weight spatial pyramid pooling (SPP) unit to aggregate rich
contextual information in the intermediate stage. We show that our model
surpasses existing algorithms for pixel-wise segmentation of surgical
instruments in both prediction accuracy and segmentation time of
high-resolution videos.
Related papers
- CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers [66.15847237150909]
We introduce a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images.
The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism.
We validated our model on a test dataset, consisting of unseen synthetic data and images collected from silicon aorta phantoms.
arXiv Detail & Related papers (2024-03-21T15:13:36Z) - Visual-Kinematics Graph Learning for Procedure-agnostic Instrument Tip
Segmentation in Robotic Surgeries [29.201385352740555]
We propose a novel visual-kinematics graph learning framework to accurately segment the instrument tip given various surgical procedures.
Specifically, a graph learning framework is proposed to encode relational features of instrument parts from both image and kinematics.
A cross-modal contrastive loss is designed to incorporate robust geometric prior from kinematics to image for tip segmentation.
arXiv Detail & Related papers (2023-09-02T14:52:58Z) - Video-Instrument Synergistic Network for Referring Video Instrument
Segmentation in Robotic Surgery [29.72271827272853]
This work explores a new task of Referring Surgical Video Instrument (RSVIS)
It aims to automatically identify and segment the corresponding surgical instruments based on the given language expression.
We devise a novel Video-Instrument Synergistic Network (VIS-Net) to learn both video-level and instrument-level knowledge to boost performance.
arXiv Detail & Related papers (2023-08-18T11:24:06Z) - GLSFormer : Gated - Long, Short Sequence Transformer for Step
Recognition in Surgical Videos [57.93194315839009]
We propose a vision transformer-based approach to learn temporal features directly from sequence-level patches.
We extensively evaluate our approach on two cataract surgery video datasets, Cataract-101 and D99, and demonstrate superior performance compared to various state-of-the-art methods.
arXiv Detail & Related papers (2023-07-20T17:57:04Z) - Robotic Navigation Autonomy for Subretinal Injection via Intelligent
Real-Time Virtual iOCT Volume Slicing [88.99939660183881]
We propose a framework for autonomous robotic navigation for subretinal injection.
Our method consists of an instrument pose estimation method, an online registration between the robotic and the i OCT system, and trajectory planning tailored for navigation to an injection target.
Our experiments on ex-vivo porcine eyes demonstrate the precision and repeatability of the method.
arXiv Detail & Related papers (2023-01-17T21:41:21Z) - One to Many: Adaptive Instrument Segmentation via Meta Learning and
Dynamic Online Adaptation in Robotic Surgical Video [71.43912903508765]
MDAL is a dynamic online adaptive learning scheme for instrument segmentation in robot-assisted surgery.
It learns the general knowledge of instruments and the fast adaptation ability through the video-specific meta-learning paradigm.
It outperforms other state-of-the-art methods on two datasets.
arXiv Detail & Related papers (2021-03-24T05:02:18Z) - Relational Graph Learning on Visual and Kinematics Embeddings for
Accurate Gesture Recognition in Robotic Surgery [84.73764603474413]
We propose a novel online approach of multi-modal graph network (i.e., MRG-Net) to dynamically integrate visual and kinematics information.
The effectiveness of our method is demonstrated with state-of-the-art results on the public JIGSAWS dataset.
arXiv Detail & Related papers (2020-11-03T11:00:10Z) - Synthetic and Real Inputs for Tool Segmentation in Robotic Surgery [10.562627972607892]
We show that it may be possible to use robot kinematic data coupled with laparoscopic images to alleviate the labelling problem.
We propose a new deep learning based model for parallel processing of both laparoscopic and simulation images.
arXiv Detail & Related papers (2020-07-17T16:33:33Z) - Searching for Efficient Architecture for Instrument Segmentation in
Robotic Surgery [58.63306322525082]
Most applications rely on accurate real-time segmentation of high-resolution surgical images.
We design a light-weight and highly-efficient deep residual architecture which is tuned to perform real-time inference of high-resolution images.
arXiv Detail & Related papers (2020-07-08T21:38:29Z) - AP-MTL: Attention Pruned Multi-task Learning Model for Real-time
Instrument Detection and Segmentation in Robot-assisted Surgery [23.33984309289549]
Training a real-time robotic system for the detection and segmentation of high-resolution images provides a challenging problem with the limited computational resource.
We develop a novel end-to-end trainable real-time Multi-Task Learning model with weight-shared encoder and task-aware detection and segmentation decoders.
Our model significantly outperforms state-of-the-art segmentation and detection models, including best-performed models in the challenge.
arXiv Detail & Related papers (2020-03-10T14:24:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.