Surgical Gesture Recognition Based on Bidirectional Multi-Layer
Independently RNN with Explainable Spatial Feature Extraction
- URL: http://arxiv.org/abs/2105.00460v1
- Date: Sun, 2 May 2021 12:47:19 GMT
- Title: Surgical Gesture Recognition Based on Bidirectional Multi-Layer
Independently RNN with Explainable Spatial Feature Extraction
- Authors: Dandan Zhang, Ruoxi Wang, Benny Lo
- Abstract summary: We aim to develop an effective surgical gesture recognition approach with an explainable feature extraction process.
A Bidirectional Multi-Layer Independently RNN (BML-indRNN) model is proposed in this paper.
To eliminate the black-box effect of the DCNN, Gradient-weighted Class Activation Mapping (Grad-CAM) is employed.
Results indicate that the testing accuracy for the suturing task based on our proposed method is 87.13%, which outperforms most state-of-the-art algorithms.
- Score: 10.469989981471254
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Minimally invasive surgery mainly consists of a series of sub-tasks, which
can be decomposed into basic gestures or contexts. As a prerequisite of
autonomous operation, surgical gesture recognition can assist motion planning
and decision-making, and build up context-aware knowledge to improve the
quality of surgical robot control. In this work, we aim to develop an effective
surgical gesture recognition approach with an explainable feature extraction
process. A Bidirectional Multi-Layer independently RNN (BML-indRNN) model is
proposed in this paper, while spatial feature extraction is implemented via
fine-tuning of a Deep Convolutional Neural Network(DCNN) model constructed
based on the VGG architecture. To eliminate the black-box effects of DCNN,
Gradient-weighted Class Activation Mapping (Grad-CAM) is employed. It can
provide explainable results by showing the regions of the surgical images that
have a strong relationship with the surgical gesture classification results.
The proposed method was evaluated based on the suturing task with data obtained
from the publicly available JIGSAWS database. Comparative studies were conducted
to verify the proposed framework. Results indicate that the testing accuracy
for the suturing task based on our proposed method is 87.13%, which outperforms
most state-of-the-art algorithms.
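The abstract names the building blocks without spelling out the architecture. As a rough sketch only, the snippet below implements the standard IndRNN recurrence h_t = ReLU(W x_t + u ⊙ h_{t-1} + b) from Li et al. (2018), stacked in bidirectional layers over per-frame DCNN features; the layer count, hidden size, and per-frame classifier head are illustrative assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class IndRNNCell(nn.Module):
    """Independently recurrent cell: each hidden unit keeps its own
    scalar recurrent weight u, so h_t = relu(W x_t + u * h_{t-1} + b)."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.in_proj = nn.Linear(input_size, hidden_size)
        self.u = nn.Parameter(torch.empty(hidden_size).uniform_(-1.0, 1.0))

    def forward(self, x, h):
        return torch.relu(self.in_proj(x) + self.u * h)

class BMLIndRNN(nn.Module):
    """Bidirectional multi-layer IndRNN over per-frame feature vectors."""
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super().__init__()
        self.fwd, self.bwd = nn.ModuleList(), nn.ModuleList()
        for layer in range(num_layers):
            in_size = input_size if layer == 0 else 2 * hidden_size
            self.fwd.append(IndRNNCell(in_size, hidden_size))
            self.bwd.append(IndRNNCell(in_size, hidden_size))
        self.classifier = nn.Linear(2 * hidden_size, num_classes)

    def forward(self, x):                       # x: (batch, time, feat)
        B, T, _ = x.shape
        for f_cell, b_cell in zip(self.fwd, self.bwd):
            hf = x.new_zeros(B, f_cell.u.numel())
            hb = x.new_zeros(B, b_cell.u.numel())
            outs_f, outs_b = [], []
            for t in range(T):                  # left-to-right direction
                hf = f_cell(x[:, t], hf)
                outs_f.append(hf)
            for t in reversed(range(T)):        # right-to-left direction
                hb = b_cell(x[:, t], hb)
                outs_b.append(hb)
            outs_b.reverse()
            x = torch.stack([torch.cat([f, b], dim=-1)
                             for f, b in zip(outs_f, outs_b)], dim=1)
        return self.classifier(x)               # per-frame gesture logits

# Illustrative sizes: 4096-d VGG fc features, 10 surgical gesture classes.
model = BMLIndRNN(input_size=4096, hidden_size=512,
                  num_layers=2, num_classes=10)
logits = model(torch.randn(2, 30, 4096))        # -> (2, 30, 10)
```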
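Grad-CAM itself is standard (Selvaraju et al., 2017): weight each convolutional feature map by the global-average-pooled gradient of the class score, sum the weighted maps, and apply ReLU. Below is a minimal PyTorch sketch against an off-the-shelf VGG-16; the hooked layer index, ImageNet weights, and random input stand in for the fine-tuned gesture DCNN.

```python
import torch
from torchvision.models import vgg16

model = vgg16(weights="IMAGENET1K_V1").eval()   # stand-in for the tuned DCNN
acts, grads = {}, {}

# Hook the last conv layer of the VGG feature extractor.
layer = model.features[28]
layer.register_forward_hook(lambda m, i, o: acts.update(a=o))
layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))

frame = torch.randn(1, 3, 224, 224)             # stand-in surgical frame
score = model(frame)[0].max()                   # score of the top class
score.backward()

alpha = grads["a"].mean(dim=(2, 3), keepdim=True)  # GAP of the gradients
cam = torch.relu((alpha * acts["a"]).sum(dim=1))   # weighted sum + ReLU
cam = cam / cam.max()                              # normalise to [0, 1]
# Upsample `cam` to the frame size and overlay it to see which regions
# drive the gesture prediction.
```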
Related papers
- Intraoperative Registration by Cross-Modal Inverse Neural Rendering [61.687068931599846]
We present a novel approach for 3D/2D intraoperative registration during neurosurgery via cross-modal inverse neural rendering.
Our approach separates implicit neural representation into two components, handling anatomical structure preoperatively and appearance intraoperatively.
We tested our method on retrospective patient data from clinical cases, showing that it outperforms the state of the art while meeting current clinical standards for registration.
arXiv Detail & Related papers (2024-09-18T13:40:59Z)
- SAR-RARP50: Segmentation of surgical instrumentation and Action Recognition on Robot-Assisted Radical Prostatectomy Challenge [72.97934765570069]
We release the first multimodal, publicly available, in-vivo dataset for surgical action recognition and semantic instrumentation segmentation, containing 50 suturing video segments of Robotic-Assisted Radical Prostatectomy (RARP).
The aim of the challenge is to enable researchers to leverage the scale of the provided dataset and develop robust and highly accurate single-task action recognition and tool segmentation approaches in the surgical domain.
A total of 12 teams participated in the challenge, contributing 7 action recognition methods, 9 instrument segmentation techniques, and 4 multitask approaches that integrated both action recognition and instrument segmentation.
arXiv Detail & Related papers (2023-12-31T13:32:18Z)
- Hierarchical Semi-Supervised Learning Framework for Surgical Gesture Segmentation and Recognition Based on Multi-Modality Data [2.8770761243361593]
We develop a hierarchical semi-supervised learning framework for surgical gesture segmentation using multi-modality data.
A Transformer-based network with a pre-trained ResNet-18 backbone is used to extract visual features from the surgical operation videos.
The proposed approach has been evaluated using data from the publicly available JIGSAWS database, including Suturing, Needle Passing, and Knot Tying tasks.
arXiv Detail & Related papers (2023-07-31T21:17:59Z)
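For a sense of the visual front end described above, here is a minimal sketch of per-frame feature extraction with a frozen pre-trained ResNet-18 feeding a small Transformer encoder; the dimensions and layer counts are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

backbone = resnet18(weights="IMAGENET1K_V1")
backbone.fc = nn.Identity()                 # expose the 512-d pooled features
backbone.eval()

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True),
    num_layers=2)

frames = torch.randn(30, 3, 224, 224)       # one 30-frame video clip
with torch.no_grad():
    feats = backbone(frames)                # (30, 512) per-frame features
tokens = encoder(feats.unsqueeze(0))        # (1, 30, 512) temporal context
```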
- CholecTriplet2021: A benchmark challenge for surgical action triplet recognition [66.51610049869393]
This paper presents CholecTriplet2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos.
We present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge.
A total of 4 baseline methods and 19 new deep learning algorithms are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%.
arXiv Detail & Related papers (2022-04-10T18:51:55Z)
- Real-time landmark detection for precise endoscopic submucosal dissection via shape-aware relation network [51.44506007844284]
We propose a shape-aware relation network for accurate and real-time landmark detection in endoscopic submucosal dissection surgery.
We first devise an algorithm to automatically generate relation keypoint heatmaps, which intuitively represent the prior knowledge of spatial relations among landmarks.
We then develop two complementary regularization schemes to progressively incorporate the prior knowledge into the training process.
arXiv Detail & Related papers (2021-11-08T07:57:30Z)
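One plausible reading of the relation keypoint heatmaps above is a Gaussian ridge rendered along the segment joining two landmarks, encoding their spatial relation as a dense prior. The sketch below is purely illustrative and is not the paper's algorithm.

```python
import numpy as np

def relation_heatmap(p, q, shape, sigma=4.0):
    """Gaussian ridge along the segment p -> q; intensity decays with each
    pixel's distance to the segment (illustrative only)."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]].astype(float)
    p, q = np.asarray(p, float), np.asarray(q, float)
    d = q - p
    # Project every pixel onto the segment, clamping to its endpoints.
    t = np.clip(((xs - p[0]) * d[0] + (ys - p[1]) * d[1]) / (d @ d + 1e-8),
                0.0, 1.0)
    dist2 = (xs - (p[0] + t * d[0]))**2 + (ys - (p[1] + t * d[1]))**2
    return np.exp(-dist2 / (2 * sigma ** 2))

hm = relation_heatmap((20, 30), (90, 70), shape=(128, 128))  # (128, 128) map
```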
- Automatic Liver Segmentation from CT Images Using Deep Learning Algorithms: A Comparative Study [0.0]
This paper sets out to identify the most efficient DL architectures for liver segmentation.
It aims to reveal the most effective and accurate DL architecture for fully automatic liver segmentation.
Results reveal that DL algorithms are able to automate organ segmentation from DICOM images with high accuracy.
arXiv Detail & Related papers (2021-01-25T10:05:46Z)
- Relational Graph Learning on Visual and Kinematics Embeddings for Accurate Gesture Recognition in Robotic Surgery [84.73764603474413]
We propose a novel online multi-modal graph network (MRG-Net) to dynamically integrate visual and kinematics information.
The effectiveness of our method is demonstrated with state-of-the-art results on the public JIGSAWS dataset.
arXiv Detail & Related papers (2020-11-03T11:00:10Z)
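As a toy illustration of the multi-modal integration idea above, the sketch below passes learned messages between a visual embedding and a kinematics embedding before fusing them; MRG-Net's actual relational graph design is more elaborate.

```python
import torch
import torch.nn as nn

class TwoNodeFusion(nn.Module):
    """Toy message passing between a visual node and a kinematics node."""
    def __init__(self, dim):
        super().__init__()
        self.to_visual = nn.Linear(dim, dim)     # kinematics -> visual message
        self.to_kinematic = nn.Linear(dim, dim)  # visual -> kinematics message
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, vis, kin):                 # each: (batch, time, dim)
        vis2 = torch.relu(vis + self.to_visual(kin))
        kin2 = torch.relu(kin + self.to_kinematic(vis))
        return self.fuse(torch.cat([vis2, kin2], dim=-1))

fused = TwoNodeFusion(256)(torch.randn(2, 30, 256), torch.randn(2, 30, 256))
```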
- Unsupervised Region-based Anomaly Detection in Brain MRI with Adversarial Image Inpainting [4.019851137611981]
This paper proposes a fully automatic, unsupervised inpainting-based brain tumour segmentation system for T1-weighted MRI.
First, a deep convolutional neural network (DCNN) is trained to reconstruct missing healthy brain regions. Then, anomalous regions are determined by identifying areas of highest reconstruction loss.
We show the proposed system is able to segment tumours of various sizes and shapes, achieving a Dice score of 0.771 (mean) and 0.176 (standard deviation).
arXiv Detail & Related papers (2020-10-05T12:13:44Z)
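The scoring step described above (anomaly = highest reconstruction loss) can be sketched as follows: slide a mask over the image, let a trained inpainter fill the hole, and score each region by its reconstruction error. `inpaint_model` is a placeholder for the trained DCNN, not a real API.

```python
import torch

def anomaly_map(image, inpaint_model, patch=16):
    """Score each patch by how badly a trained inpainter reconstructs it.
    `inpaint_model(masked, mask)` is a placeholder for the trained DCNN."""
    _, H, W = image.shape                      # image: (channels, H, W)
    scores = torch.zeros(H // patch, W // patch)
    for i in range(0, H, patch):
        for j in range(0, W, patch):
            mask = torch.zeros(1, H, W)
            mask[:, i:i + patch, j:j + patch] = 1.0
            masked = image * (1 - mask)        # hide one patch
            recon = inpaint_model(masked, mask)
            err = ((recon - image) * mask).pow(2).sum() / mask.sum()
            scores[i // patch, j // patch] = err   # high error = anomalous
    return scores
```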
- Multi-Task Recurrent Neural Network for Surgical Gesture Recognition and Progress Prediction [17.63619129438996]
We propose a multi-task recurrent neural network for simultaneous recognition of surgical gestures and estimation of a novel formulation of surgical task progress.
We demonstrate that recognition performance improves in multi-task frameworks with progress estimation, without any additional manual labelling or training.
arXiv Detail & Related papers (2020-03-10T14:28:02Z)
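The shared-trunk idea above can be sketched as an LSTM with two heads, one classifying the current gesture and one regressing task progress in [0, 1]; the 76-dimensional input matches JIGSAWS kinematics, while all other sizes are illustrative.

```python
import torch
import torch.nn as nn

class MultiTaskRNN(nn.Module):
    """Shared recurrent trunk with a gesture head and a progress head."""
    def __init__(self, in_dim, hidden, num_gestures):
        super().__init__()
        self.rnn = nn.LSTM(in_dim, hidden, batch_first=True)
        self.gesture_head = nn.Linear(hidden, num_gestures)  # classification
        self.progress_head = nn.Linear(hidden, 1)            # regression

    def forward(self, x):                      # x: (batch, time, in_dim)
        h, _ = self.rnn(x)
        return self.gesture_head(h), torch.sigmoid(self.progress_head(h))

# 76-d input matches JIGSAWS kinematics; the other sizes are illustrative.
gestures, progress = MultiTaskRNN(76, 128, 10)(torch.randn(2, 30, 76))
```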
- Automatic Gesture Recognition in Robot-assisted Surgery with Reinforcement Learning and Tree Search [63.07088785532908]
We propose a framework based on reinforcement learning and tree search for joint surgical gesture segmentation and classification.
Our framework consistently outperforms the existing methods on the suturing task of the JIGSAWS dataset in terms of accuracy, edit score and F1 score.
arXiv Detail & Related papers (2020-02-20T13:12:38Z)