Hypergraph-Transformer (HGT) for Interactive Event Prediction in
Laparoscopic and Robotic Surgery
- URL: http://arxiv.org/abs/2402.01974v1
- Date: Sat, 3 Feb 2024 00:58:05 GMT
- Title: Hypergraph-Transformer (HGT) for Interactive Event Prediction in
Laparoscopic and Robotic Surgery
- Authors: Lianhao Yin, Yutong Ban, Jennifer Eckhoff, Ozanan Meireles, Daniela
Rus, Guy Rosman
- Abstract summary: We propose a predictive neural network that is capable of understanding and predicting critical interactive aspects of surgical workflow from intra-abdominal video.
We verify our approach on established surgical datasets and applications, including the detection and prediction of action triplets.
Our results demonstrate the superiority of our approach compared to unstructured alternatives.
- Score: 50.3022015601057
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding and anticipating intraoperative events and actions is critical
for intraoperative assistance and decision-making during minimally invasive
surgery. Automated prediction of events, actions, and their ensuing
consequences is addressed through various computational approaches with the
objective of augmenting surgeons' perception and decision-making capabilities.
We propose a predictive neural network that is capable of understanding and
predicting critical interactive aspects of surgical workflow from
intra-abdominal video, while flexibly leveraging surgical knowledge graphs. The
approach incorporates a hypergraph-transformer (HGT) structure that encodes
expert knowledge into the network design and predicts the hidden embedding of
the graph. We verify our approach on established surgical datasets and
applications, including the detection and prediction of action triplets, and
the achievement of the Critical View of Safety (CVS). Moreover, we address
specific, safety-related tasks, such as predicting the clipping of the cystic duct
or artery without prior achievement of the CVS. Our results demonstrate the
superiority of our approach compared to unstructured alternatives.
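The abstract's core idea, encoding action triplets as hyperedges over surgical concepts, can be illustrated with a toy sketch. All names, sizes, and the node-to-hyperedge-to-node update rule below are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

# Nodes are surgical concepts (instruments, verbs, anatomical targets);
# each hyperedge links the three members of one action triplet.
nodes = ["grasper", "clipper", "clip", "dissect", "cystic_duct", "gallbladder"]
triplets = [("clipper", "clip", "cystic_duct"),
            ("grasper", "dissect", "gallbladder")]

# H is the node-by-hyperedge incidence matrix.
idx = {n: i for i, n in enumerate(nodes)}
H = np.zeros((len(nodes), len(triplets)))
for e, (inst, verb, targ) in enumerate(triplets):
    for n in (inst, verb, targ):
        H[idx[n], e] = 1.0

# One round of hypergraph message passing: node -> hyperedge -> node,
# normalised by node and hyperedge degrees (a common convolution form;
# the paper's HGT combines such structure with transformer attention).
X = np.random.default_rng(0).normal(size=(len(nodes), 8))  # node embeddings
Dv = H.sum(axis=1, keepdims=True)        # node degrees
De = H.sum(axis=0, keepdims=True)        # hyperedge degrees
X_new = (H / Dv) @ ((H / De).T @ X)      # aggregated node embeddings

print(X_new.shape)  # (6, 8)
```

The aggregation first averages member-node embeddings into each hyperedge, then scatters the hyperedge summaries back to their member nodes, so nodes sharing a triplet exchange information in one step.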
Related papers
- SuPRA: Surgical Phase Recognition and Anticipation for Intra-Operative
Planning [46.57714869178571]
We propose a dual approach that simultaneously recognises the current surgical phase and predicts upcoming ones.
Our novel method, Surgical Phase Recognition and Anticipation (SuPRA), leverages past and current information for accurate intra-operative phase recognition.
arXiv Detail & Related papers (2024-03-10T12:46:33Z)
- Event Recognition in Laparoscopic Gynecology Videos with Hybrid
Transformers [4.371909393924804]
We introduce a dataset tailored for relevant event recognition in laparoscopic videos.
Our dataset includes annotations for critical events associated with major intra-operative challenges and post-operative complications.
We evaluate a hybrid transformer architecture coupled with a customized training-inference framework to recognize four specific events in laparoscopic surgery videos.
arXiv Detail & Related papers (2023-12-01T13:57:29Z)
- Action Recognition in Video Recordings from Gynecologic Laparoscopy [4.002010889177872]
Action recognition is a prerequisite for many applications in laparoscopic video analysis.
In this study, we design and evaluate a CNN-RNN architecture as well as a customized training-inference framework.
arXiv Detail & Related papers (2023-11-30T16:15:46Z)
- SurGNN: Explainable visual scene understanding and assessment of
surgical skill using graph neural networks [19.57785997767885]
This paper explores how graph neural networks (GNNs) can be used to enhance visual scene understanding and surgical skill assessment.
GNNs provide interpretable results, revealing the specific actions, instruments, or anatomical structures that contribute to the predicted skill metrics.
arXiv Detail & Related papers (2023-08-24T20:32:57Z)
- Prediction of Post-Operative Renal and Pulmonary Complications Using
Transformers [69.81176740997175]
We evaluate the performance of transformer-based models in predicting postoperative acute renal failure, pulmonary complications, and postoperative in-hospital mortality.
Our results demonstrate that transformer-based models can achieve superior performance in predicting postoperative complications and outperform traditional machine learning models.
arXiv Detail & Related papers (2023-06-01T14:08:05Z)
- CholecTriplet2021: A benchmark challenge for surgical action triplet
recognition [66.51610049869393]
This paper presents CholecTriplet 2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos.
We present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge.
A total of 4 baseline methods and 19 new deep learning algorithms are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%.
arXiv Detail & Related papers (2022-04-10T18:51:55Z)
- SUrgical PRediction GAN for Events Anticipation [38.65189355224683]
We used a novel GAN formulation that sampled the future surgical phase trajectory conditioned on past laparoscopic video frames.
We demonstrated its effectiveness in inferring and predicting the progress of laparoscopic cholecystectomy videos.
We surveyed surgeons to evaluate the plausibility of these predicted trajectories.
arXiv Detail & Related papers (2021-05-10T19:56:45Z)
- Clinical Outcome Prediction from Admission Notes using Self-Supervised
Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z)
- TeCNO: Surgical Phase Recognition with Multi-Stage Temporal
Convolutional Networks [43.95869213955351]
We propose a Multi-Stage Temporal Convolutional Network (MS-TCN) that performs hierarchical prediction refinement for surgical phase recognition.
Our method is thoroughly evaluated on two datasets of laparoscopic cholecystectomy videos with and without the use of additional surgical tool information.
arXiv Detail & Related papers (2020-03-24T10:12:30Z)
- Robust Medical Instrument Segmentation Challenge 2019 [56.148440125599905]
Intraoperative tracking of laparoscopic instruments is often a prerequisite for computer and robotic-assisted interventions.
Our challenge was based on a surgical data set comprising 10,040 annotated images acquired from a total of 30 surgical procedures.
The results confirm the initial hypothesis, namely that algorithm performance degrades with an increasing domain gap.
arXiv Detail & Related papers (2020-03-23T14:35:08Z)
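The CholecTriplet2021 entry above reports mean average precision (mAP) over triplet classes. A minimal sketch of the metric, with toy scores and labels (the class names in the comments are illustrative, not taken from the dataset):

```python
# Ranking-based average precision for one class: sort predictions by score,
# then average precision-at-rank over the positive (label=1) frames.
def average_precision(scores, labels):
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total, ap = 0, sum(labels), 0.0
    if total == 0:
        return 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            ap += hits / rank      # precision at each recalled positive
    return ap / total

# Two toy triplet classes, e.g. <grasper, retract, gallbladder> and
# <clipper, clip, cystic_duct>; scores and labels per video frame.
per_class = [
    average_precision([0.9, 0.2, 0.7], [1, 0, 1]),   # perfectly ranked -> 1.0
    average_precision([0.4, 0.8, 0.1], [1, 0, 0]),   # positive ranked 2nd -> 0.5
]
mAP = sum(per_class) / len(per_class)
print(round(mAP, 3))  # 0.75
```

Averaging per-class AP (rather than pooling all predictions) keeps rare triplet classes from being swamped by frequent ones, which matters given the challenge's wide 4.2% to 38.1% mAP spread.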
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.