SurgeonAssist-Net: Towards Context-Aware Head-Mounted Display-Based
Augmented Reality for Surgical Guidance
- URL: http://arxiv.org/abs/2107.06397v1
- Date: Tue, 13 Jul 2021 21:12:34 GMT
- Title: SurgeonAssist-Net: Towards Context-Aware Head-Mounted Display-Based
Augmented Reality for Surgical Guidance
- Authors: Mitchell Doughty, Karan Singh, and Nilesh R. Ghugre
- Abstract summary: SurgeonAssist-Net is a framework making action-and-workflow-driven virtual assistance accessible to commercially available optical see-through head-mounted displays (OST-HMDs).
Our implementation competes with state-of-the-art approaches in prediction accuracy for automated task recognition.
It is capable of near real-time performance on the Microsoft HoloLens 2 OST-HMD.
- Score: 18.060445966264727
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present SurgeonAssist-Net: a lightweight framework making
action-and-workflow-driven virtual assistance, for a set of predefined surgical
tasks, accessible to commercially available optical see-through head-mounted
displays (OST-HMDs). On a widely used benchmark dataset for laparoscopic
surgical workflow, our implementation competes with state-of-the-art approaches
in prediction accuracy for automated task recognition, and yet requires 7.4x
fewer parameters, 10.2x fewer floating point operations per second (FLOPS), is
7.0x faster for inference on a CPU, and is capable of near real-time
performance on the Microsoft HoloLens 2 OST-HMD. To achieve this, we make use
of an efficient convolutional neural network (CNN) backbone to extract
discriminative features from image data, and a low-parameter recurrent neural
network (RNN) architecture to learn long-term temporal dependencies. To
demonstrate the feasibility of our approach for inference on the HoloLens 2 we
created a sample dataset that included video of several surgical tasks recorded
from a user-centric point-of-view. After training, we deployed our model and
cataloged its performance in an online simulated surgical scenario for the
prediction of the current surgical task. The utility of our approach is
explored in the discussion of several relevant clinical use-cases. Our code is
publicly available at https://github.com/doughtmw/surgeon-assist-net.
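The abstract credits part of the model's small size to a "low-parameter" recurrent network but does not name the cell type. As a rough illustration of why the choice of recurrent cell matters, the sketch below counts the parameters of a single GRU layer against a single LSTM layer. The GRU/LSTM comparison, the feature size, and the hidden size are all assumptions made for this example and are not taken from the paper.

```python
# Illustrative sketch only: the cell types and the sizes below are assumptions,
# not values reported in the abstract.

def recurrent_params(input_size: int, hidden_size: int, num_blocks: int) -> int:
    """Count parameters of one recurrent layer.

    Each weight block has an input-to-hidden matrix, a hidden-to-hidden
    matrix, and one bias vector. (Some frameworks, such as PyTorch, keep two
    bias vectors per block, so their reported counts are slightly higher.)
    """
    per_block = hidden_size * (input_size + hidden_size) + hidden_size
    return num_blocks * per_block


def gru_params(input_size: int, hidden_size: int) -> int:
    # GRU: reset gate, update gate, candidate state -> 3 weight blocks.
    return recurrent_params(input_size, hidden_size, num_blocks=3)


def lstm_params(input_size: int, hidden_size: int) -> int:
    # LSTM: input, forget, and output gates plus candidate cell -> 4 blocks.
    return recurrent_params(input_size, hidden_size, num_blocks=4)


FEATURE_SIZE = 1280  # hypothetical length of the per-frame CNN feature vector
HIDDEN_SIZE = 128    # hypothetical recurrent hidden size

gru = gru_params(FEATURE_SIZE, HIDDEN_SIZE)
lstm = lstm_params(FEATURE_SIZE, HIDDEN_SIZE)
print(f"GRU parameters:  {gru}")
print(f"LSTM parameters: {lstm}")
print(f"GRU / LSTM ratio: {gru / lstm:.2f}")
```

Under these assumptions the GRU layer needs three quarters of the LSTM layer's parameters, whatever sizes are chosen, because it has three weight blocks instead of four. The 7.4x overall reduction quoted in the abstract covers the full model and cannot be derived from this count.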
Related papers
- Exploring Optical Flow Inclusion into nnU-Net Framework for Surgical Instrument Segmentation [1.3444601218847545]
The nnU-Net framework excelled in semantic segmentation analyzing single frames without temporal information.
Optical flow (OF) is a tool commonly used in video tasks to estimate motion and represent it in a single frame, containing temporal information.
This work seeks to employ OF maps as an additional input to the nnU-Net architecture to improve its performance in the surgical instrument segmentation task.
arXiv Detail & Related papers (2024-03-15T11:36:26Z)
- Efficient Adaptive Human-Object Interaction Detection with Concept-guided Memory [64.11870454160614]
We propose an efficient Adaptive HOI Detector with Concept-guided Memory (ADA-CM).
ADA-CM has two operating modes. The first mode makes it tunable without learning new parameters in a training-free paradigm.
Our proposed method achieves competitive results with state-of-the-art on the HICO-DET and V-COCO datasets with much less training time.
arXiv Detail & Related papers (2023-09-07T13:10:06Z)
- Bounded Future MS-TCN++ for surgical gesture recognition [0.0]
We learn the performance-delay trade-off and design an MS-TCN++-based algorithm that can utilize this trade-off.
The naive approach is to reduce the MS-TCN++ depth, which shrinks the receptive field and, with it, the number of required future frames.
Our design gives more flexibility in the network and, as a result, achieves significantly better performance than the naive approach.
arXiv Detail & Related papers (2022-09-29T09:09:54Z)
- Efficient Global-Local Memory for Real-time Instrument Segmentation of Robotic Surgical Video [53.14186293442669]
We identify two important clues for surgical instrument perception, including local temporal dependency from adjacent frames and global semantic correlation in long-range duration.
We propose a novel dual-memory network (DMNet) to relate both global and local-temporal knowledge.
Our method largely outperforms the state-of-the-art works on segmentation accuracy while maintaining a real-time speed.
arXiv Detail & Related papers (2021-09-28T10:10:14Z)
- Temporal Memory Relation Network for Workflow Recognition from Surgical Video [53.20825496640025]
We propose a novel end-to-end temporal memory relation network (TMNet) for relating long-range and multi-scale temporal patterns.
We have extensively validated our approach on two benchmark surgical video datasets.
arXiv Detail & Related papers (2021-03-30T13:20:26Z)
- Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
arXiv Detail & Related papers (2020-12-31T18:48:58Z)
- Aggregating Long-Term Context for Learning Laparoscopic and Robot-Assisted Surgical Workflows [40.48632897750319]
We propose a new temporal network structure that leverages task-specific network representation to collect long-term sufficient statistics.
We demonstrate superior results over existing and novel state-of-the-art segmentation techniques on two laparoscopic cholecystectomy datasets.
arXiv Detail & Related papers (2020-09-01T20:29:14Z)
- Searching for Efficient Architecture for Instrument Segmentation in Robotic Surgery [58.63306322525082]
Most applications rely on accurate real-time segmentation of high-resolution surgical images.
We design a light-weight and highly-efficient deep residual architecture which is tuned to perform real-time inference of high-resolution images.
arXiv Detail & Related papers (2020-07-08T21:38:29Z)
- Automatic Operating Room Surgical Activity Recognition for Robot-Assisted Surgery [1.1033115844630357]
We investigate automatic surgical activity recognition in robot-assisted operations.
We collect the first large-scale dataset including 400 full-length multi-perspective videos.
We densely annotate the videos with the 10 most recognized and clinically relevant classes of activities.
arXiv Detail & Related papers (2020-06-29T16:30:31Z)
- LRTD: Long-Range Temporal Dependency based Active Learning for Surgical Workflow Recognition [67.86810761677403]
We propose a novel active learning method for cost-effective surgical video analysis.
Specifically, we propose a non-local recurrent convolutional network (NL-RCNet), which introduces a non-local block to capture long-range temporal dependency.
We validate our approach on a large surgical video dataset (Cholec80) on the surgical workflow recognition task.
arXiv Detail & Related papers (2020-04-21T09:21:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.