Using Hand Pose Estimation To Automate Open Surgery Training Feedback
- URL: http://arxiv.org/abs/2211.07021v2
- Date: Thu, 30 Mar 2023 19:14:54 GMT
- Title: Using Hand Pose Estimation To Automate Open Surgery Training Feedback
- Authors: Eddie Bkheet, Anne-Lise D'Angelo, Adam Goldbraikh, Shlomi Laufer
- Abstract summary: This research aims to facilitate the use of state-of-the-art computer vision algorithms for the automated training of surgeons.
By estimating 2D hand poses, we model the movement of the practitioner's hands, and their interaction with surgical instruments.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Purpose: This research aims to facilitate the use of state-of-the-art
computer vision algorithms for the automated training of surgeons and the
analysis of surgical footage. By estimating 2D hand poses, we model the
movement of the practitioner's hands, and their interaction with surgical
instruments, to study their potential benefit for surgical training.
Methods: We leverage models pre-trained on a publicly available hands dataset
to create our own in-house dataset of 100 open surgery simulation videos with
2D hand poses. We also assess the ability of pose estimations to segment
surgical videos into gestures and tool-usage segments and compare them to
kinematic sensors and I3D features. Furthermore, we introduce 6 novel surgical
dexterity proxies stemming from domain experts' training advice, all of which
our framework can automatically detect given raw video footage.
Results: State-of-the-art gesture segmentation accuracy of 88.35% on the
Open Surgery Simulation dataset is achieved with the fusion of 2D poses and I3D
features from multiple angles. The introduced surgical skill proxies presented
significant differences for novices compared to experts and produced actionable
feedback for improvement.
Conclusion: This research demonstrates the benefit of pose estimations for
open surgery by analyzing their effectiveness in gesture segmentation and skill
assessment. Gesture segmentation using pose estimations achieved comparable
results to physical sensors while being remote and markerless. Surgical
dexterity proxies that rely on pose estimation proved they can be used to work
towards automated training feedback. We hope our findings encourage additional
collaboration on novel skill proxies to make surgical training more efficient.
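As a rough illustration of how 2D hand poses can feed a motion-based skill measure, the sketch below computes the total path length of a hand's centroid across video frames. This is a hypothetical proxy chosen for illustration, not one of the six proxies introduced in the paper; the 21-joint hand layout, the names `centroid` and `path_length`, and the toy data are all assumptions.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Pose = List[Point]  # one (x, y) pixel coordinate per hand joint in a frame


def centroid(pose: Pose) -> Point:
    """Mean position of all joints in a single frame."""
    xs = [p[0] for p in pose]
    ys = [p[1] for p in pose]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def path_length(poses: List[Pose]) -> float:
    """Total distance (in pixels) travelled by the hand centroid.

    Hypothetical proxy: sums the Euclidean distance between the
    centroids of consecutive frames.
    """
    centers = [centroid(p) for p in poses]
    return sum(math.dist(a, b) for a, b in zip(centers, centers[1:]))


# Toy data (assumed): 5 frames, 21 joints per hand, with the whole hand
# shifting 3 px right and 4 px down between consecutive frames.
toy_poses = [[(3.0 * t + j, 4.0 * t) for j in range(21)] for t in range(5)]

if __name__ == "__main__":
    print(path_length(toy_poses))
```

In the toy data each frame-to-frame step is 5 px long, so the four steps add up to 20 px. A real pipeline would take the keypoints from a pose estimator and would likely normalize by video duration or image scale before comparing practitioners.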
Related papers
- Multi-Modal Self-Supervised Learning for Surgical Feedback Effectiveness Assessment [66.6041949490137]
We propose a method that integrates information from transcribed verbal feedback and corresponding surgical video to predict feedback effectiveness.
Our findings show that both transcribed feedback and surgical video are individually predictive of trainee behavior changes.
Our results demonstrate the potential of multi-modal learning to advance the automated assessment of surgical feedback.
arXiv Detail & Related papers (2024-11-17T00:13:00Z)
- Automated Surgical Skill Assessment in Endoscopic Pituitary Surgery using Real-time Instrument Tracking on a High-fidelity Bench-top Phantom [9.41936397281689]
Improved surgical skill is generally associated with improved patient outcomes, but assessment is subjective and labour-intensive.
A new public dataset is introduced, focusing on simulated surgery, using the nasal phase of endoscopic pituitary surgery as an exemplar.
A Multilayer Perceptron achieved 87% accuracy in predicting surgical skill level (novice or expert), with the "ratio of total procedure time to instrument visible time" correlated with higher surgical skill.
arXiv Detail & Related papers (2024-09-25T15:27:44Z)
- Realistic Data Generation for 6D Pose Estimation of Surgical Instruments [4.226502078427161]
6D pose estimation of surgical instruments is critical to enable the automatic execution of surgical maneuvers.
In household and industrial settings, synthetic data, generated with 3D computer graphics software, has been shown as an alternative to minimize annotation costs.
We propose an improved simulation environment for surgical robotics that enables the automatic generation of large and diverse datasets.
arXiv Detail & Related papers (2024-06-11T14:59:29Z)
- Creating a Digital Twin of Spinal Surgery: A Proof of Concept [68.37190859183663]
Surgery digitalization is the process of creating a virtual replica of real-world surgery.
We present a proof of concept (PoC) for surgery digitalization that is applied to an ex-vivo spinal surgery.
We employ five RGB-D cameras for dynamic 3D reconstruction of the surgeon, a high-end camera for 3D reconstruction of the anatomy, an infrared stereo camera for surgical instrument tracking, and a laser scanner for 3D reconstruction of the operating room and data fusion.
arXiv Detail & Related papers (2024-03-25T13:09:40Z)
- SAR-RARP50: Segmentation of surgical instrumentation and Action Recognition on Robot-Assisted Radical Prostatectomy Challenge [72.97934765570069]
We release the first multimodal, publicly available, in-vivo dataset for surgical action recognition and semantic instrumentation segmentation, containing 50 suturing video segments of Robotic Assisted Radical Prostatectomy (RARP).
The aim of the challenge is to enable researchers to leverage the scale of the provided dataset and develop robust and highly accurate single-task action recognition and tool segmentation approaches in the surgical domain.
A total of 12 teams participated in the challenge, contributing 7 action recognition methods, 9 instrument segmentation techniques, and 4 multitask approaches that integrated both action recognition and instrument segmentation.
arXiv Detail & Related papers (2023-12-31T13:32:18Z)
- Visual-Kinematics Graph Learning for Procedure-agnostic Instrument Tip Segmentation in Robotic Surgeries [29.201385352740555]
We propose a novel visual-kinematics graph learning framework to accurately segment the instrument tip given various surgical procedures.
Specifically, a graph learning framework is proposed to encode relational features of instrument parts from both image and kinematics.
A cross-modal contrastive loss is designed to incorporate robust geometric prior from kinematics to image for tip segmentation.
arXiv Detail & Related papers (2023-09-02T14:52:58Z)
- Next-generation Surgical Navigation: Marker-less Multi-view 6DoF Pose Estimation of Surgical Instruments [66.74633676595889]
First, we present a multi-camera capture setup consisting of static and head-mounted cameras.
Second, we publish a multi-view RGB-D video dataset of ex-vivo spine surgeries, captured in a surgical wet lab and a real operating theatre.
Third, we evaluate three state-of-the-art single-view and multi-view methods for the task of 6DoF pose estimation of surgical instruments.
arXiv Detail & Related papers (2023-05-05T13:42:19Z)
- CholecTriplet2021: A benchmark challenge for surgical action triplet recognition [66.51610049869393]
This paper presents CholecTriplet 2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos.
We present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge.
A total of 4 baseline methods and 19 new deep learning algorithms are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1%.
arXiv Detail & Related papers (2022-04-10T18:51:55Z)
- Using Computer Vision to Automate Hand Detection and Tracking of Surgeon Movements in Videos of Open Surgery [8.095095522269352]
We leverage advances in computer vision to introduce an automated approach to video analysis of surgical execution.
A state-of-the-art convolutional neural network architecture for object detection was used to detect operating hands in open surgery videos.
Our model's spatial detections of operating hands significantly outperform the detections achieved using pre-existing hand-detection datasets.
arXiv Detail & Related papers (2020-12-13T03:10:09Z)
- Relational Graph Learning on Visual and Kinematics Embeddings for Accurate Gesture Recognition in Robotic Surgery [84.73764603474413]
We propose a novel online approach of multi-modal graph network (i.e., MRG-Net) to dynamically integrate visual and kinematics information.
The effectiveness of our method is demonstrated with state-of-the-art results on the public JIGSAWS dataset.
arXiv Detail & Related papers (2020-11-03T11:00:10Z)
- Recurrent and Spiking Modeling of Sparse Surgical Kinematics [0.8458020117487898]
A growing number of studies have used machine learning to analyze video and kinematic data captured from surgical robots.
In this study, we explore the possibility of using only kinematic data to predict surgeons of similar skill levels.
We report that it is possible to identify surgical fellows receiving near perfect scores in the simulation exercises based on their motion characteristics alone.
arXiv Detail & Related papers (2020-05-12T15:41:45Z)