Related papers: SurGNN: Explainable visual scene understanding and assessment of surgical skill using graph neural networks

URL: http://arxiv.org/abs/2308.13073v1
Date: Thu, 24 Aug 2023 20:32:57 GMT
Title: SurGNN: Explainable visual scene understanding and assessment of surgical skill using graph neural networks
Authors: Shuja Khalid, Frank Rudzicz
Abstract summary: This paper explores how graph neural networks (GNNs) can be used to enhance visual scene understanding and surgical skill assessment. GNNs provide interpretable results, revealing the specific actions, instruments, or anatomical structures that contribute to the predicted skill metrics.
Score: 19.57785997767885
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper explores how graph neural networks (GNNs) can be used to enhance visual scene understanding and surgical skill assessment. By using GNNs to analyze the complex visual data of surgical procedures represented as graph structures, relevant features can be extracted and surgical skill can be predicted. Additionally, GNNs provide interpretable results, revealing the specific actions, instruments, or anatomical structures that contribute to the predicted skill metrics. This can be highly beneficial for surgical educators and trainees, as it provides valuable insights into the factors that contribute to successful surgical performance and outcomes. SurGNN proposes two concurrent approaches -- one supervised and the other self-supervised. The paper also briefly discusses other automated surgical skill evaluation techniques and highlights the limitations of hand-crafted features in capturing the intricacies of surgical expertise. We use the proposed methods to achieve state-of-the-art results on EndoVis19 and custom datasets. The working implementation of the code can be found at https://github.com/<redacted>.

Related papers

Surgeons vs. Computer Vision: A comparative analysis on surgical phase recognition capabilities [65.66373425605278]
Automated Surgical Phase Recognition (SPR) uses Artificial Intelligence (AI) to segment the surgical workflow into its key events. Previous research has focused on short and linear surgical procedures and has not explored if temporal context influences experts' ability to better classify surgical phases. This research addresses these gaps, focusing on Robot-Assisted Partial Nephrectomy (RAPN) as a highly non-linear procedure.
arXiv Detail & Related papers (2025-04-26T15:37:22Z)
EndoChat: Grounded Multimodal Large Language Model for Endoscopic Surgery [52.992415247012296]
We introduce EndoChat to address various dialogue paradigms and subtasks in surgical scene understanding. Our model achieves state-of-the-art performance across five dialogue paradigms and eight surgical scene understanding tasks.
arXiv Detail & Related papers (2025-01-20T09:12:06Z)
Graph Neural Networks for Brain Graph Learning: A Survey [53.74244221027981]
Graph neural networks (GNNs) have demonstrated a significant advantage in mining graph-structured data. Using GNNs to learn brain graph representations for brain disorder analysis has recently gained increasing attention. In this paper, we aim to bridge this gap by reviewing brain graph learning works that utilize GNNs.
arXiv Detail & Related papers (2024-06-01T02:47:39Z)
Hypergraph-Transformer (HGT) for Interactive Event Prediction in Laparoscopic and Robotic Surgery [50.3022015601057]
We propose a predictive neural network that is capable of understanding and predicting critical interactive aspects of surgical workflow from intra-abdominal video. We verify our approach on established surgical datasets and applications, including the detection and prediction of action triplets. Our results demonstrate the superiority of our approach compared to unstructured alternatives.
arXiv Detail & Related papers (2024-02-03T00:58:05Z)
Dynamic Scene Graph Representation for Surgical Video [37.22552586793163]
We exploit scene graphs as a more holistic, semantically meaningful and human-readable way to represent surgical videos. We create a scene graph dataset from semantic segmentations from the CaDIS and CATARACTS datasets. We demonstrate the benefits of surgical scene graphs regarding the explainability and robustness of model decisions.
arXiv Detail & Related papers (2023-09-25T21:28:14Z)
Information Flow in Graph Neural Networks: A Clinical Triage Use Case [49.86931948849343]
Graph Neural Networks (GNNs) have gained popularity in healthcare and other domains due to their ability to process multi-modal and multi-relational graphs. We investigate how the flow of embedding information within GNNs affects the prediction of links in Knowledge Graphs (KGs). Our results demonstrate that incorporating domain knowledge into the GNN connectivity leads to better performance than using the same connectivity as the KG or allowing unconstrained embedding propagation.
arXiv Detail & Related papers (2023-09-12T09:18:12Z)
Surgical tool classification and localization: results and methods from the MICCAI 2022 SurgToolLoc challenge [69.91670788430162]
We present the results of the SurgToolLoc 2022 challenge. The goal was to leverage tool presence data as weak labels for machine learning models trained to detect tools. We conclude by discussing these results in the broader context of machine learning and surgical data science.
arXiv Detail & Related papers (2023-05-11T21:44:39Z)
DynDepNet: Learning Time-Varying Dependency Structures from fMRI Data via Dynamic Graph Structure Learning [58.94034282469377]
We propose DynDepNet, a novel method for learning the optimal time-varying dependency structure of fMRI data induced by downstream prediction tasks. Experiments on real-world fMRI datasets, for the task of sex classification, demonstrate that DynDepNet achieves state-of-the-art results.
arXiv Detail & Related papers (2022-09-27T16:32:11Z)
Video-based assessment of intraoperative surgical skill [7.79874072121082]
We present and validate two deep learning methods that directly assess skill using RGB videos. In the first method, we predict instrument tips as keypoints, and learn surgical skill using temporal convolutional neural networks. In the second method, we propose a novel architecture for surgical skill assessment that includes a frame-wise encoder (2D convolutional neural network) followed by a temporal model (recurrent neural network).
arXiv Detail & Related papers (2022-05-13T01:45:22Z)
Surgical Gesture Recognition Based on Bidirectional Multi-Layer Independently RNN with Explainable Spatial Feature Extraction [10.469989981471254]
We aim to develop an effective surgical gesture recognition approach with an explainable feature extraction process. A Bidirectional Multi-Layer independently RNN (BML-indRNN) model is proposed in this paper. To eliminate the black-box effects of DCNN, Gradient-weighted Class Activation Mapping (Grad-CAM) is employed. Results indicated that the testing accuracy for the suturing task based on our proposed method is 87.13%, which outperforms most of the state-of-the-art algorithms.
arXiv Detail & Related papers (2021-05-02T12:47:19Z)
Deep Neural Networks for the Assessment of Surgical Skills: A Systematic Review [6.815366422701539]
We have reviewed 530 papers, of which we selected 25 for this systematic review. We concluded that Deep Neural Networks are powerful tools for automated, objective surgical skill assessment using both kinematic and video data. The field would benefit from large, publicly available, annotated datasets that are representative of the surgical trainee and expert demographics and multimodal data beyond kinematics and videos.
arXiv Detail & Related papers (2021-03-03T10:08:37Z)
Learning and Reasoning with the Graph Structure Representation in Robotic Surgery [15.490603884631764]
Learning to infer graph representations can play a vital role in surgical scene understanding in robotic surgery. We develop an approach to generate the scene graph and predict surgical interactions between instruments and surgical region of interest.
arXiv Detail & Related papers (2020-07-07T11:49:34Z)
Node Masking: Making Graph Neural Networks Generalize and Scale Better [71.51292866945471]
Graph Neural Networks (GNNs) have received a lot of interest in recent times. In this paper, we utilize some theoretical tools to better visualize the operations performed by state-of-the-art spatial GNNs. We introduce a simple concept, Node Masking, that allows them to generalize and scale better.
arXiv Detail & Related papers (2020-01-17T06:26:40Z)

This list is automatically generated from the titles and abstracts of the papers on this site.