Adaptation of Surgical Activity Recognition Models Across Operating Rooms
- URL: http://arxiv.org/abs/2207.03083v1
- Date: Thu, 7 Jul 2022 04:41:34 GMT
- Title: Adaptation of Surgical Activity Recognition Models Across Operating Rooms
- Authors: Ali Mottaghi, Aidean Sharghi, Serena Yeung, Omid Mohareri
- Abstract summary: We study the generalizability of surgical activity recognition models across operating rooms.
We propose a new domain adaptation method to improve the performance of the surgical activity recognition model.
- Score: 10.625208343893911
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic surgical activity recognition enables more intelligent surgical
devices and a more efficient workflow. Integration of such technology in new
operating rooms has the potential to improve care delivery to patients and
decrease costs. Recent works have achieved a promising performance on surgical
activity recognition; however, the lack of generalizability of these models is
one of the critical barriers to the wide-scale adoption of this technology. In
this work, we study the generalizability of surgical activity recognition
models across operating rooms. We propose a new domain adaptation method to
improve the performance of the surgical activity recognition model in a new
operating room for which we only have unlabeled videos. Our approach generates
pseudo labels for the unlabeled video clips on which it is confident and trains
the model on augmented versions of those clips. We extend our method to a
semi-supervised domain adaptation setting where a small portion of the target
domain is also labeled. In our experiments, our proposed method consistently
outperforms the baselines on a dataset of more than 480 long surgical videos
collected from two operating rooms.
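As a concrete illustration of the adaptation recipe described above (confident pseudo labels on unlabeled target-domain clips, training on augmented versions of those clips), here is a minimal sketch. The confidence threshold, the augmentation functions, and the model/optimizer interface are illustrative assumptions, not the authors' implementation:

```python
# Minimal sketch of confidence-thresholded pseudo-labeling with augmentation.
# CONF_THRESHOLD, weak_aug, and strong_aug are illustrative assumptions; the
# abstract does not specify these choices.
import torch
import torch.nn.functional as F

CONF_THRESHOLD = 0.9  # assumed value; tune on a validation set


def adaptation_step(model, optimizer, clips, weak_aug, strong_aug):
    """One update on a batch of unlabeled target-domain video clips."""
    model.eval()
    with torch.no_grad():
        # Generate candidate labels from predictions on weakly augmented clips.
        probs = F.softmax(model(weak_aug(clips)), dim=1)
        confidence, pseudo_labels = probs.max(dim=1)
        keep = confidence >= CONF_THRESHOLD  # keep only confident clips

    if not keep.any():
        return None  # no clip in this batch passed the threshold

    model.train()
    optimizer.zero_grad()
    # Train on strongly augmented versions of the confident clips, so the
    # model must reproduce its own pseudo labels under perturbation.
    logits = model(strong_aug(clips[keep]))
    loss = F.cross_entropy(logits, pseudo_labels[keep])
    loss.backward()
    optimizer.step()
    return loss.item()
```

In the semi-supervised variant mentioned in the abstract, a standard supervised cross-entropy term on the small labeled portion of the target domain would simply be added to this unlabeled loss.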
Related papers
- ST(OR)2: Spatio-Temporal Object Level Reasoning for Activity Recognition in the Operating Room [6.132617753806978]
We propose a new sample-efficient and object-based approach for surgical activity recognition in the OR.
Our method focuses on the geometric arrangements between clinicians and surgical devices, thus utilizing the significant object interaction dynamics in the OR.
arXiv Detail & Related papers (2023-12-19T15:33:57Z)
- GLSFormer: Gated - Long, Short Sequence Transformer for Step Recognition in Surgical Videos [57.93194315839009]
We propose a vision transformer-based approach to learn temporal features directly from sequence-level patches.
We extensively evaluate our approach on two cataract surgery video datasets, Cataract-101 and D99, and demonstrate superior performance compared to various state-of-the-art methods.
arXiv Detail & Related papers (2023-07-20T17:57:04Z)
- Dissecting Self-Supervised Learning Methods for Surgical Computer Vision [51.370873913181605]
Self-Supervised Learning (SSL) methods have begun to gain traction in the general computer vision community.
The effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and largely unexplored.
We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding, phase recognition and tool presence detection.
arXiv Detail & Related papers (2022-07-01T14:17:11Z)
- Surgical Phase Recognition in Laparoscopic Cholecystectomy [57.929132269036245]
We propose a Transformer-based method that uses calibrated confidence scores in a two-stage inference pipeline.
Our method outperforms the baseline model on the Cholec80 dataset and can be applied to a variety of action segmentation methods.
arXiv Detail & Related papers (2022-06-14T22:55:31Z)
- Quantification of Robotic Surgeries with Vision-Based Deep Learning [45.165919577877695]
We propose a unified deep learning framework, entitled Roboformer, which operates exclusively on videos recorded during surgery.
We validated our framework on four video-based datasets of two commonly-encountered types of steps within minimally-invasive robotic surgeries.
arXiv Detail & Related papers (2022-05-06T06:08:35Z)
- CholecTriplet2021: A benchmark challenge for surgical action triplet recognition [66.51610049869393]
This paper presents CholecTriplet2021: an endoscopic vision challenge organized at MICCAI 2021 for the recognition of surgical action triplets in laparoscopic videos.
We present the challenge setup and assessment of the state-of-the-art deep learning methods proposed by the participants during the challenge.
A total of 4 baseline methods and 19 new deep learning algorithms are presented to recognize surgical action triplets directly from surgical videos, achieving mean average precision (mAP) ranging from 4.2% to 38.1% (a minimal sketch of the mAP computation follows this list).
arXiv Detail & Related papers (2022-04-10T18:51:55Z)
- Know your sensORs – A Modality Study For Surgical Action Classification [39.546197658791]
The medical community seeks to leverage the wealth of data captured in operating rooms to develop automated methods that advance interventional care, lower costs, and improve patient outcomes.
Existing datasets from OR cameras are thus far limited in size or in the modalities acquired, leaving it unclear which sensor modalities are best suited for tasks such as recognizing surgical actions from videos.
This study demonstrates that surgical action recognition performance can vary depending on the image modalities used.
arXiv Detail & Related papers (2022-03-16T15:01:17Z)
- A real-time spatiotemporal AI model analyzes skill in open surgical videos [2.4907439112059278]
Our work overcomes existing data limitations for training AI models by curating, from YouTube, the largest dataset of open surgical videos to date: 1,997 videos of 23 surgical procedures uploaded from 50 countries.
We developed a multi-task AI model capable of real-time understanding of surgical behaviors, hands, and tools, the building blocks of procedural flow and surgeon skill.
arXiv Detail & Related papers (2021-12-14T08:11:02Z)
- One-shot action recognition towards novel assistive therapies [63.23654147345168]
This work is motivated by the automated analysis of medical therapies that involve action imitation games.
The presented approach incorporates a pre-processing step that standardizes heterogeneous motion data conditions.
We evaluate the approach on a real use-case of automated video analysis for therapy support with autistic people.
arXiv Detail & Related papers (2021-02-17T19:41:37Z)
- Using Computer Vision to Automate Hand Detection and Tracking of Surgeon Movements in Videos of Open Surgery [8.095095522269352]
We leverage advances in computer vision to introduce an automated approach to video analysis of surgical execution.
A state-of-the-art convolutional neural network architecture for object detection was used to detect operating hands in open surgery videos.
Our model's spatial detections of operating hands significantly outperform those achieved using pre-existing hand-detection datasets.
arXiv Detail & Related papers (2020-12-13T03:10:09Z)
- Automatic Operating Room Surgical Activity Recognition for Robot-Assisted Surgery [1.1033115844630357]
We investigate automatic surgical activity recognition in robot-assisted operations.
We collect the first large-scale dataset including 400 full-length multi-perspective videos.
We densely annotate the videos with the 10 most recognized and clinically relevant classes of activities.
arXiv Detail & Related papers (2020-06-29T16:30:31Z)
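For reference, the mean average precision (mAP) figures quoted for the CholecTriplet2021 entry above can be computed as in the following minimal sketch. This is the standard per-class average precision, averaged over classes; the challenge's exact evaluation protocol may differ:

```python
# Minimal sketch of mean average precision (mAP) for multi-label recognition.
import numpy as np


def average_precision(scores, labels):
    """AP for one class; scores are confidences, labels are binary {0, 1}."""
    order = np.argsort(-scores)  # rank predictions by descending confidence
    labels = labels[order]
    n_pos = labels.sum()
    if n_pos == 0:
        return 0.0
    precision = np.cumsum(labels) / np.arange(1, len(labels) + 1)
    # AP averages the precision at the rank of each true positive.
    return float((precision * labels).sum() / n_pos)


def mean_average_precision(scores, labels):
    """mAP over classes; both arrays have shape (num_samples, num_classes)."""
    return float(np.mean([average_precision(scores[:, c], labels[:, c])
                          for c in range(scores.shape[1])]))
```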
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.