Using Computer Vision to Automate Hand Detection and Tracking of Surgeon
Movements in Videos of Open Surgery
- URL: http://arxiv.org/abs/2012.06948v1
- Date: Sun, 13 Dec 2020 03:10:09 GMT
- Title: Using Computer Vision to Automate Hand Detection and Tracking of Surgeon
Movements in Videos of Open Surgery
- Authors: Michael Zhang, Xiaotian Cheng, Daniel Copeland, Arjun Desai, Melody Y.
Guan, Gabriel A. Brat, and Serena Yeung
- Abstract summary: We leverage advances in computer vision to introduce an automated approach to video analysis of surgical execution.
A state-of-the-art convolutional neural network architecture for object detection was used to detect operating hands in open surgery videos.
Our model's spatial detections of operating hands significantly outperform the detections achieved using pre-existing hand-detection datasets.
- Score: 8.095095522269352
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Open, or non-laparoscopic, surgery represents the vast majority of all
operating room procedures, but few tools exist to objectively evaluate these
techniques at scale. Current efforts involve human expert-based visual
assessment. We leverage advances in computer vision to introduce an automated
approach to video analysis of surgical execution. A state-of-the-art
convolutional neural network architecture for object detection was used to
detect operating hands in open surgery videos. Automated assessment was
expanded by combining model predictions with a fast object tracker to enable
surgeon-specific hand tracking. To train our model, we used publicly available
videos of open surgery from YouTube and annotated these with spatial bounding
boxes of operating hands. Our model's spatial detections of operating hands
significantly outperform the detections achieved using pre-existing
hand-detection datasets, and allow for insights into intra-operative movement
patterns and economy of motion.
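The abstract describes a two-stage pipeline: a CNN object detector produces per-frame bounding boxes of operating hands, and a fast tracker links those detections over time so each surgeon's hands can be followed individually. The exact detector and tracker are not named here, so the sketch below is a minimal, hypothetical illustration of that detect-then-associate pattern, pairing a torchvision Faster R-CNN (assumed to be fine-tuned on the annotated hand boxes) with a greedy IoU matcher; the video path and thresholds are placeholders.

```python
# Hypothetical sketch of the detect-then-track pipeline described above: a CNN
# detector gives per-frame hand boxes, and a simple greedy IoU matcher carries
# track IDs across frames so each hand's trajectory can be analysed (e.g. for
# economy of motion). Model choice, thresholds, and file path are assumptions.
import cv2
import torch
import torchvision


def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)


# Stand-in detector; the paper's model would be fine-tuned on hand boxes
# annotated in open-surgery YouTube videos.
detector = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

tracks, next_id = {}, 0                          # track_id -> last known box
cap = cv2.VideoCapture("open_surgery_clip.mp4")  # placeholder path
while True:
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    img = torch.from_numpy(rgb).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        pred = detector([img])[0]
    boxes = [b.tolist() for b, s in zip(pred["boxes"], pred["scores"]) if s > 0.7]

    # Greedy association: reuse a track ID when a detection overlaps that
    # track's previous box enough, otherwise start a new track.
    updated = {}
    for box in boxes:
        best_id, best_iou = None, 0.3    # minimum overlap to continue a track
        for tid, prev in tracks.items():
            if tid not in updated and iou(box, prev) > best_iou:
                best_id, best_iou = tid, iou(box, prev)
        if best_id is None:
            best_id, next_id = next_id, next_id + 1
        updated[best_id] = box
    tracks = updated
    # `tracks` now maps stable hand IDs to boxes for this frame; accumulating
    # box centroids per ID yields per-hand movement paths.
cap.release()
```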
Related papers
- Hypergraph-Transformer (HGT) for Interactive Event Prediction in
Laparoscopic and Robotic Surgery [50.3022015601057]
We propose a predictive neural network that is capable of understanding and predicting critical interactive aspects of surgical workflow from intra-abdominal video.
We verify our approach on established surgical datasets and applications, including the detection and prediction of action triplets.
Our results demonstrate the superiority of our approach compared to unstructured alternatives.
arXiv Detail & Related papers (2024-02-03T00:58:05Z)
Surgical tool classification and localization: results and methods from
the MICCAI 2022 SurgToolLoc challenge [69.91670788430162]
We present the results of the SurgToolLoc 2022 challenge.
The goal was to leverage tool presence data as weak labels for machine learning models trained to detect tools.
We conclude by discussing these results in the broader context of machine learning and surgical data science.
arXiv Detail & Related papers (2023-05-11T21:44:39Z)
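The SurgToolLoc entry above centres on weak supervision: only frame-level tool presence labels are available, yet the goal is tool detection. The challenge submissions used a variety of approaches; as one rough, hypothetical illustration of the weak-label setup, the sketch below trains a frame-level multi-label presence classifier with binary cross-entropy, whose activation maps could later be mined for coarse localization. The class count, backbone, and synthetic batch are placeholders.

```python
# Hypothetical weak-supervision sketch: train a frame-level multi-label
# classifier from tool-presence labels only (no bounding boxes). Backbone,
# number of tool classes, and the synthetic batch are illustrative placeholders.
import torch
import torch.nn as nn
import torchvision

NUM_TOOLS = 14  # placeholder count of surgical tool classes

model = torchvision.models.resnet18(weights=None)
model.fc = nn.Linear(model.fc.in_features, NUM_TOOLS)   # multi-label head
criterion = nn.BCEWithLogitsLoss()                       # 0/1 presence per tool
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# Stand-in for a real dataloader of video frames + per-frame presence vectors.
frames = torch.randn(8, 3, 224, 224)
presence = torch.randint(0, 2, (8, NUM_TOOLS)).float()

logits = model(frames)
loss = criterion(logits, presence)
loss.backward()
optimizer.step()
# Spatial localization can then only be approximated post hoc, e.g. via class
# activation maps over the last conv features, since no boxes were labelled.
```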
Next-generation Surgical Navigation: Marker-less Multi-view 6DoF Pose
Estimation of Surgical Instruments [66.74633676595889]
First, we present a multi-camera capture setup consisting of static and head-mounted cameras.
Second, we publish a multi-view RGB-D video dataset of ex-vivo spine surgeries, captured in a surgical wet lab and a real operating theatre.
Third, we evaluate three state-of-the-art single-view and multi-view methods for the task of 6DoF pose estimation of surgical instruments.
arXiv Detail & Related papers (2023-05-05T13:42:19Z)
Live image-based neurosurgical guidance and roadmap generation using
unsupervised embedding [53.992124594124896]
We present a method for live image-only guidance leveraging a large data set of annotated neurosurgical videos.
A generated roadmap encodes the common anatomical paths taken in surgeries in the training set.
We trained and evaluated the proposed method with a data set of 166 transsphenoidal adenomectomy procedures.
arXiv Detail & Related papers (2023-03-31T12:52:24Z)
Using Hand Pose Estimation To Automate Open Surgery Training Feedback [0.0]
This research aims to facilitate the use of state-of-the-art computer vision algorithms for the automated training of surgeons.
By estimating 2D hand poses, we model the movement of the practitioner's hands, and their interaction with surgical instruments.
arXiv Detail & Related papers (2022-11-13T21:47:31Z)
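The hand-pose entry above models practitioner hand movement from estimated 2D hand poses. Its summary does not name a specific pose estimator, so the sketch below uses the off-the-shelf MediaPipe Hands model purely as a stand-in to show how per-frame 2D keypoints could be extracted from a training video; the video path and any downstream use of the landmarks are assumptions.

```python
# Hypothetical sketch: extract per-frame 2D hand landmarks with MediaPipe Hands
# as a stand-in pose estimator; trajectories of these keypoints could then be
# used to model hand movement and tool interaction. Video path is a placeholder.
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(static_image_mode=False, max_num_hands=2)
cap = cv2.VideoCapture("training_exercise.mp4")  # placeholder path

keypoints_per_frame = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    frame_pts = []
    if result.multi_hand_landmarks:
        for hand in result.multi_hand_landmarks:
            # 21 normalized (x, y) keypoints per detected hand
            frame_pts.append([(lm.x, lm.y) for lm in hand.landmark])
    keypoints_per_frame.append(frame_pts)

cap.release()
hands.close()
# keypoints_per_frame now holds 2D hand poses over time for motion analysis.
```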
Adaptation of Surgical Activity Recognition Models Across Operating
Rooms [10.625208343893911]
We study the generalizability of surgical activity recognition models across operating rooms.
We propose a new domain adaptation method to improve the performance of the surgical activity recognition model.
arXiv Detail & Related papers (2022-07-07T04:41:34Z)
Using Human Gaze For Surgical Activity Recognition [0.40611352512781856]
We propose to use human gaze with a spatiotemporal attention mechanism for activity recognition in surgical videos.
Our model is built on an I3D-based architecture that learns temporal features using 3D convolutions and learns an attention map from human gaze.
arXiv Detail & Related papers (2022-03-09T14:28:00Z)
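The gaze entry above describes an I3D-style network whose 3D-convolutional features are modulated by an attention map derived from human gaze. A full I3D is heavy, so the sketch below is a minimal, hypothetical stand-in: a tiny 3D-conv backbone whose spatial feature map is re-weighted by a downsampled gaze heatmap before pooling and classification. Layer sizes, class count, and the gaze heatmap itself are assumptions.

```python
# Hypothetical sketch of gaze-guided attention over 3D-conv video features:
# a tiny 3D-CNN stands in for I3D, and a gaze heatmap re-weights its spatial
# feature map before pooling and classification. All sizes are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GazeAttentionNet(nn.Module):
    def __init__(self, num_classes=8):
        super().__init__()
        self.backbone = nn.Sequential(          # stand-in for an I3D backbone
            nn.Conv3d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),
            nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, clip, gaze_map):
        # clip: (B, 3, T, H, W) video; gaze_map: (B, 1, H, W) fixation heatmap
        feats = self.backbone(clip)                        # (B, 64, T, H', W')
        attn = F.interpolate(gaze_map, size=feats.shape[-2:],
                             mode="bilinear", align_corners=False)
        feats = feats * attn.unsqueeze(2)                  # broadcast over time
        pooled = feats.mean(dim=(2, 3, 4))                 # global average pool
        return self.classifier(pooled)


model = GazeAttentionNet()
clip = torch.randn(2, 3, 16, 112, 112)       # synthetic video clip
gaze = torch.rand(2, 1, 112, 112)            # synthetic gaze heatmap
print(model(clip, gaze).shape)               # -> torch.Size([2, 8])
```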
A real-time spatiotemporal AI model analyzes skill in open surgical
videos [2.4907439112059278]
Our work overcomes existing data limitations for training AI models by curating, from YouTube, the largest dataset of open surgical videos to date: 1997 videos from 23 surgical procedures uploaded from 50 countries.
We developed a multi-task AI model capable of real-time understanding of surgical behaviors, hands, and tools - the building blocks of procedural flow and surgeon skill.
arXiv Detail & Related papers (2021-12-14T08:11:02Z)
Multimodal Semantic Scene Graphs for Holistic Modeling of Surgical
Procedures [70.69948035469467]
We take advantage of the latest computer vision methodologies for generating 3D graphs from camera views.
We then introduce the Multimodal Semantic Scene Graph (MSSG), which aims to provide a unified symbolic and semantic representation of surgical procedures.
arXiv Detail & Related papers (2021-06-09T14:35:44Z)
Relational Graph Learning on Visual and Kinematics Embeddings for
Accurate Gesture Recognition in Robotic Surgery [84.73764603474413]
We propose a novel online multi-modal graph network (MRG-Net) to dynamically integrate visual and kinematics information.
The effectiveness of our method is demonstrated with state-of-the-art results on the public JIGSAWS dataset.
arXiv Detail & Related papers (2020-11-03T11:00:10Z)
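The MRG-Net entry above fuses visual features from video with robot kinematics for gesture recognition. Its graph-based message passing is detailed in the paper itself; as a loose, hypothetical illustration of multi-modal fusion on JIGSAWS-style inputs, the sketch below simply encodes each modality separately, concatenates the embeddings, and classifies gestures per timestep. Feature dimensions and gesture count are assumptions.

```python
# Hypothetical sketch of visual + kinematics fusion for gesture recognition:
# each modality gets its own encoder and the embeddings are concatenated for a
# per-timestep classifier. Dimensions and gesture count are placeholders; the
# relational graph message passing of MRG-Net itself is not reproduced here.
import torch
import torch.nn as nn


class VisKinFusion(nn.Module):
    def __init__(self, vis_dim=512, kin_dim=76, hidden=128, num_gestures=10):
        super().__init__()
        self.vis_enc = nn.Sequential(nn.Linear(vis_dim, hidden), nn.ReLU())
        self.kin_enc = nn.Sequential(nn.Linear(kin_dim, hidden), nn.ReLU())
        self.temporal = nn.LSTM(2 * hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_gestures)

    def forward(self, vis_feats, kinematics):
        # vis_feats: (B, T, vis_dim) per-frame CNN features
        # kinematics: (B, T, kin_dim) robot arm/gripper signals
        fused = torch.cat([self.vis_enc(vis_feats), self.kin_enc(kinematics)], dim=-1)
        out, _ = self.temporal(fused)               # (B, T, hidden)
        return self.head(out)                       # per-timestep gesture logits


model = VisKinFusion()
logits = model(torch.randn(4, 30, 512), torch.randn(4, 30, 76))
print(logits.shape)  # -> torch.Size([4, 30, 10])
```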
Automatic Operating Room Surgical Activity Recognition for
Robot-Assisted Surgery [1.1033115844630357]
We investigate automatic surgical activity recognition in robot-assisted operations.
We collect the first large-scale dataset including 400 full-length multi-perspective videos.
We densely annotate the videos with the 10 most recognized and clinically relevant classes of activities.
arXiv Detail & Related papers (2020-06-29T16:30:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.