Dynamic Interactive Relation Capturing via Scene Graph Learning for
Robotic Surgical Report Generation
- URL: http://arxiv.org/abs/2306.02651v1
- Date: Mon, 5 Jun 2023 07:34:41 GMT
- Title: Dynamic Interactive Relation Capturing via Scene Graph Learning for
Robotic Surgical Report Generation
- Authors: Hongqiu Wang, Yueming Jin, Lei Zhu
- Abstract summary: For robot-assisted surgery, an accurate surgical report reflects the clinical operations performed during surgery and supports documentation, post-operative analysis and follow-up treatment.
It is a challenging task due to many complex and diverse interactions between instruments and tissues in the surgical scene.
This paper presents a neural network to boost surgical report generation by explicitly exploring the interactive relation between tissues and surgical instruments.
- Score: 14.711668177329244
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For robot-assisted surgery, an accurate surgical report reflects the clinical
operations performed during surgery and supports documentation, post-operative
analysis and follow-up treatment. It is a challenging task due to many complex
and diverse interactions between instruments and tissues in the surgical scene.
Although existing deep learning-based surgical report generation methods have
achieved considerable success, they often ignore the interactive relation between
tissues and surgical instruments, thereby degrading report generation
performance. This paper presents a neural network to boost surgical report
generation by explicitly exploring the interactive relation between tissues and
surgical instruments. We validate the effectiveness of our method on a
widely-used robotic surgery benchmark dataset, and experimental results show
that our network can significantly outperform existing state-of-the-art
surgical report generation methods (e.g., 7.48% and 5.43% higher for BLEU-1 and
ROUGE).
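For context, BLEU-1 is the unigram precision between a generated report and a reference report (with a brevity penalty), and ROUGE measures recall-oriented n-gram overlap. Below is a minimal, self-contained sketch of sentence-level BLEU-1 and ROUGE-1 F1 on whitespace-tokenized reports; the example reports are hypothetical and this is not the evaluation code used by the authors.

```python
from collections import Counter
import math

def bleu1(candidate: str, reference: str) -> float:
    """Sentence-level BLEU-1: clipped unigram precision times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    overlap = sum((Counter(cand) & Counter(ref)).values())  # clipped unigram matches
    precision = overlap / max(len(cand), 1)
    # The brevity penalty discourages overly short generated reports.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * precision

def rouge1_f(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    cand, ref = candidate.split(), reference.split()
    overlap = sum((Counter(cand) & Counter(ref)).values())
    p, r = overlap / max(len(cand), 1), overlap / max(len(ref), 1)
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

# Hypothetical surgical report strings, purely for illustration.
generated = "the bipolar forceps grasp and retract the kidney tissue"
reference = "bipolar forceps retract the kidney tissue during dissection"
print(f"BLEU-1: {bleu1(generated, reference):.3f}  ROUGE-1 F1: {rouge1_f(generated, reference):.3f}")
```

In practice BLEU is usually reported at the corpus level and with higher-order n-grams as well; the gains quoted above refer to these standard metrics computed against the benchmark's reference reports.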
Related papers
- Instrument-tissue Interaction Detection Framework for Surgical Video Understanding [31.822025965225016]
We present an Instrument-Tissue Interaction Detection Network (ITIDNet) to detect instrument-tissue interaction quintuples for surgical video understanding.
Specifically, we propose a Snippet Consecutive Feature (SCF) Layer to enhance features by modeling relationships of proposals in the current frame using global context information in the video snippet.
To reason about relationships between instruments and tissues, a Temporal Graph (TG) Layer is proposed, with intra-frame connections to exploit relationships between instruments and tissues in the same frame and inter-frame connections to model temporal information for the same instance (a toy sketch of these connections follows this entry).
arXiv Detail & Related papers (2024-03-30T11:21:11Z)
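To make the intra-/inter-frame connections described above concrete, here is a minimal, hypothetical sketch of how such a temporal graph over instrument and tissue proposals could be assembled: intra-frame edges connect instruments to tissues within the same frame, and inter-frame edges link the same tracked instance across neighbouring frames. The Node fields and the function name are illustrative assumptions; this does not reproduce the actual ITIDNet Temporal Graph layer.

```python
from dataclasses import dataclass
from typing import List
import numpy as np

@dataclass
class Node:
    frame: int        # frame index within the video snippet
    kind: str         # "instrument" or "tissue"
    instance_id: int  # identity of a tracked instance across frames

def temporal_adjacency(nodes: List[Node]) -> np.ndarray:
    """Symmetric adjacency with intra-frame (instrument-tissue) and inter-frame (same instance) edges."""
    n = len(nodes)
    adj = np.zeros((n, n), dtype=np.float32)
    for i in range(n):
        for j in range(i + 1, n):
            a, b = nodes[i], nodes[j]
            intra = a.frame == b.frame and a.kind != b.kind                         # same frame, instrument-tissue pair
            inter = a.instance_id == b.instance_id and abs(a.frame - b.frame) == 1  # same instance, adjacent frames
            if intra or inter:
                adj[i, j] = adj[j, i] = 1.0
    return adj

# Toy snippet: one grasper and one tissue region tracked over two frames.
nodes = [Node(0, "instrument", 1), Node(0, "tissue", 2),
         Node(1, "instrument", 1), Node(1, "tissue", 2)]
print(temporal_adjacency(nodes))
```

A graph layer would then propagate proposal features along this adjacency (for example, `adj @ features`) before classifying the instrument-tissue relation; the exact message passing used by ITIDNet is specified in the paper itself.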
- Hypergraph-Transformer (HGT) for Interactive Event Prediction in Laparoscopic and Robotic Surgery [50.3022015601057]
We propose a predictive neural network that is capable of understanding and predicting critical interactive aspects of surgical workflow from intra-abdominal video.
We verify our approach on established surgical datasets and applications, including the detection and prediction of action triplets.
Our results demonstrate the superiority of our approach compared to unstructured alternatives.
arXiv Detail & Related papers (2024-02-03T00:58:05Z)
- SAR-RARP50: Segmentation of surgical instrumentation and Action Recognition on Robot-Assisted Radical Prostatectomy Challenge [72.97934765570069]
We release the first multimodal, publicly available, in-vivo dataset for surgical action recognition and semantic instrumentation segmentation, containing 50 suturing video segments of Robot-Assisted Radical Prostatectomy (RARP).
The aim of the challenge is to enable researchers to leverage the scale of the provided dataset and develop robust and highly accurate single-task action recognition and tool segmentation approaches in the surgical domain.
A total of 12 teams participated in the challenge, contributing 7 action recognition methods, 9 instrument segmentation techniques, and 4 multitask approaches that integrated both action recognition and instrument segmentation.
arXiv Detail & Related papers (2023-12-31T13:32:18Z)
- ST(OR)2: Spatio-Temporal Object Level Reasoning for Activity Recognition in the Operating Room [6.132617753806978]
We propose a new sample-efficient and object-based approach for surgical activity recognition in the OR.
Our method focuses on the geometric arrangements between clinicians and surgical devices, thus utilizing the significant object interaction dynamics in the OR.
arXiv Detail & Related papers (2023-12-19T15:33:57Z)
- Surgical tool classification and localization: results and methods from the MICCAI 2022 SurgToolLoc challenge [69.91670788430162]
We present the results of the SurgToolLoc 2022 challenge.
The goal was to leverage tool presence data as weak labels for machine learning models trained to detect tools.
We conclude by discussing these results in the broader context of machine learning and surgical data science.
arXiv Detail & Related papers (2023-05-11T21:44:39Z)
- CholecTriplet2021: A benchmark challenge for surgical action triplet recognition [66.51610049869393]
This paper presents CholecTriplet2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos.
We present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge.
A total of 4 baseline methods and 19 new deep learning algorithms are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%.
arXiv Detail & Related papers (2022-04-10T18:51:55Z)
- Towards Unified Surgical Skill Assessment [18.601526803020885]
We propose a unified multi-path framework for automatic surgical skill assessment.
We conduct experiments on the JIGSAWS dataset of simulated surgical tasks, and a new clinical dataset of real laparoscopic surgeries.
arXiv Detail & Related papers (2021-06-02T09:06:43Z)
- Relational Graph Learning on Visual and Kinematics Embeddings for Accurate Gesture Recognition in Robotic Surgery [84.73764603474413]
We propose a novel online multi-modal relational graph network (MRG-Net) to dynamically integrate visual and kinematics information.
The effectiveness of our method is demonstrated with state-of-the-art results on the public JIGSAWS dataset.
arXiv Detail & Related papers (2020-11-03T11:00:10Z)
- Multi-Task Recurrent Neural Network for Surgical Gesture Recognition and Progress Prediction [17.63619129438996]
We propose a multi-task recurrent neural network for simultaneous recognition of surgical gestures and estimation of a novel formulation of surgical task progress.
We demonstrate that recognition performance improves in a multi-task framework with progress estimation, without any additional manual labelling or training.
arXiv Detail & Related papers (2020-03-10T14:28:02Z)
- SuPer Deep: A Surgical Perception Framework for Robotic Tissue Manipulation using Deep Learning for Feature Extraction [25.865648975312407]
We exploit deep learning methods for surgical perception.
We integrate deep neural networks, capable of efficient feature extraction, into the tissue reconstruction and instrument pose estimation processes.
Our framework achieves state-of-the-art tracking performance in a surgical environment by utilizing deep learning for feature extraction.
arXiv Detail & Related papers (2020-03-07T00:08:30Z)
- Automatic Gesture Recognition in Robot-assisted Surgery with Reinforcement Learning and Tree Search [63.07088785532908]
We propose a framework based on reinforcement learning and tree search for joint surgical gesture segmentation and classification.
Our framework consistently outperforms existing methods on the suturing task of the JIGSAWS dataset in terms of accuracy, edit score and F1 score (a sketch of the segmental edit score follows this list).
arXiv Detail & Related papers (2020-02-20T13:12:38Z)
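The edit score mentioned in the last entry above is the standard segmental metric used with JIGSAWS-style gesture recognition: frame-wise predictions are collapsed into an ordered list of gesture segments, and a normalized Levenshtein distance to the ground-truth segment sequence is reported (100 means a perfect segment ordering). The sketch below is a minimal illustration with hypothetical gesture labels, not the authors' evaluation code.

```python
from itertools import groupby

def segments(frame_labels):
    """Collapse frame-wise labels into an ordered list of segment labels."""
    return [label for label, _ in groupby(frame_labels)]

def levenshtein(a, b):
    """Edit distance between two sequences via single-row dynamic programming."""
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (x != y))
    return dp[-1]

def edit_score(pred_frames, gt_frames):
    """Segmental edit score in [0, 100]; penalizes out-of-order and over-segmented predictions."""
    p, g = segments(pred_frames), segments(gt_frames)
    return (1 - levenshtein(p, g) / max(len(p), len(g), 1)) * 100

# Toy example: an over-segmented prediction against the ground truth.
pred = ["G1"] * 10 + ["G2"] * 3 + ["G1"] * 2 + ["G3"] * 5
gt   = ["G1"] * 12 + ["G2"] * 4 + ["G3"] * 4
print(f"edit score: {edit_score(pred, gt):.1f}")  # 75.0 for this toy case
```

Frame-wise accuracy and F1, by contrast, ignore segment ordering, which is why the edit score is reported alongside them.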
This list is automatically generated from the titles and abstracts of the papers indexed on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.