Unsupervised Temporal Video Segmentation as an Auxiliary Task for
Predicting the Remaining Surgery Duration
- URL: http://arxiv.org/abs/2002.11367v1
- Date: Wed, 26 Feb 2020 09:13:39 GMT
- Title: Unsupervised Temporal Video Segmentation as an Auxiliary Task for
Predicting the Remaining Surgery Duration
- Authors: Dominik Rivoir, Sebastian Bodenstedt, Felix von Bechtolsheim, Marius
Distler, Jürgen Weitz, Stefanie Speidel
- Abstract summary: Estimating the remaining surgery duration (RSD) during surgical procedures can be useful for OR planning and anesthesia dose estimation.
We investigate whether RSD prediction can be improved using unsupervised temporal video segmentation as an auxiliary learning task.
- Score: 0.03131740922192113
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Estimating the remaining surgery duration (RSD) during surgical procedures
can be useful for OR planning and anesthesia dose estimation. With the recent
success of deep learning-based methods in computer vision, several neural
network approaches have been proposed for fully automatic RSD prediction based
solely on visual data from the endoscopic camera. We investigate whether RSD
prediction can be improved using unsupervised temporal video segmentation as an
auxiliary learning task. As opposed to previous work, which presented
supervised surgical phase recognition as an auxiliary task, we avoid the need for
manual annotations by proposing a similar but unsupervised learning objective
which clusters video sequences into temporally coherent segments. In multiple
experimental setups, results obtained by learning the auxiliary task are
incorporated into a deep RSD model through feature extraction, pretraining or
regularization. Further, we propose a novel loss function for RSD training
which attempts to counteract unfavorable characteristics of the RSD ground
truth. Using our unsupervised method as an auxiliary task for RSD training, we
outperform other self-supervised methods and are comparable to the supervised
state-of-the-art. Combined with the novel RSD loss, we slightly outperform the
supervised approach.
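As a rough illustration of the quantity being predicted: the RSD ground truth at a given frame is the total video duration minus the time elapsed so far, and predictions are commonly scored by mean absolute error in minutes. The sketch below assumes a fixed sampling rate; the names `rsd_targets`, `mean_absolute_error`, and `fps` are hypothetical and not taken from the paper, and the paper's proposed RSD loss is not reproduced here.

```python
from typing import List, Sequence


def rsd_targets(num_frames: int, fps: float = 1.0) -> List[float]:
    """Return the remaining surgery duration (in minutes) for every frame.

    Assumption: frames are sampled at a constant rate `fps`, so frame i
    is recorded i / fps seconds after the start of the video.
    """
    if num_frames <= 0 or fps <= 0:
        raise ValueError("num_frames and fps must be positive")
    total_seconds = num_frames / fps
    # Remaining time = total duration - elapsed time, converted to minutes.
    return [(total_seconds - i / fps) / 60.0 for i in range(num_frames)]


def mean_absolute_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Average absolute difference between predicted and true RSD values."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and equally long")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


if __name__ == "__main__":
    # A two-minute video sampled at 1 frame per second.
    targets = rsd_targets(120, fps=1.0)
    # A naive baseline that always predicts one minute remaining.
    baseline = [1.0] * len(targets)
    print(f"RSD at first frame: {targets[0]:.2f} min")
    print(f"Baseline MAE: {mean_absolute_error(baseline, targets):.3f} min")
```

The target decreases linearly to zero and depends only on the total length of the video, which is not known while the surgery is in progress. That is why the model has to infer progress from visual cues.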
Related papers
- PitRSDNet: Predicting Intra-operative Remaining Surgery Duration in Endoscopic Pituitary Surgery [7.291847156946912]
This paper presents PitRSDNet for predicting Remaining Surgery Duration (RSD) during pituitary surgery.
PitRSDNet integrates workflow knowledge into RSD prediction in two forms: 1) multi-task learning for concurrently predicting step and RSD; and 2) prior steps as context in temporal learning and inference.
PitRSDNet is trained and evaluated on a new endoscopic pituitary surgery dataset with 88 videos to show competitive performance improvements over previous statistical and machine learning methods.
arXiv Detail & Related papers (2024-09-25T15:03:22Z)
- DPMesh: Exploiting Diffusion Prior for Occluded Human Mesh Recovery [71.6345505427213]
DPMesh is an innovative framework for occluded human mesh recovery.
It capitalizes on the profound diffusion prior about object structure and spatial relationships embedded in a pre-trained text-to-image diffusion model.
arXiv Detail & Related papers (2024-04-01T18:59:13Z)
- RIDE: Self-Supervised Learning of Rotation-Equivariant Keypoint Detection and Invariant Description for Endoscopy [83.4885991036141]
RIDE is a learning-based method for rotation-equivariant detection and invariant description.
It is trained in a self-supervised manner on a large curation of endoscopic images.
It sets a new state-of-the-art performance on matching and relative pose estimation tasks.
arXiv Detail & Related papers (2023-09-18T08:16:30Z)
- A Survey of the Impact of Self-Supervised Pretraining for Diagnostic Tasks with Radiological Images [71.26717896083433]
Self-supervised pretraining has been observed to be effective at improving feature representations for transfer learning.
This review summarizes recent research into its usage in X-ray, computed tomography, magnetic resonance, and ultrasound imaging.
arXiv Detail & Related papers (2023-09-05T19:45:09Z)
- Weakly Supervised Temporal Convolutional Networks for Fine-grained Surgical Activity Recognition [10.080444283496487]
We propose to use coarser and easier-to-annotate activity labels, namely phases, as weak supervision to learn step recognition.
We employ a Single-Stage Temporal Convolutional Network (SS-TCN) with a ResNet-50 backbone, trained in an end-to-end fashion from weakly annotated videos.
We extensively evaluate and show the effectiveness of the proposed method on a large video dataset consisting of 40 laparoscopic gastric bypass procedures and the public benchmark CATARACTS containing 50 cataract surgeries.
arXiv Detail & Related papers (2023-02-21T17:26:49Z)
- Self-Supervised-RCNN for Medical Image Segmentation with Limited Data Annotation [0.16490701092527607]
We propose an alternative deep learning training strategy based on self-supervised pretraining on unlabeled MRI scans.
Our pretraining approach first randomly applies different distortions to random areas of unlabeled images and then predicts the type of distortion and the loss of information.
The effectiveness of the proposed method for segmentation tasks in different pre-training and fine-tuning scenarios is evaluated.
arXiv Detail & Related papers (2022-07-17T13:28:52Z)
- Evaluating the Robustness of Self-Supervised Learning in Medical Imaging [57.20012795524752]
Self-supervision has been demonstrated to be an effective learning strategy when training target tasks on small annotated datasets.
We show that networks trained via self-supervised learning have superior robustness and generalizability compared to fully-supervised learning in the context of medical imaging.
arXiv Detail & Related papers (2021-05-14T17:49:52Z)
- Zero-Shot Self-Supervised Learning for MRI Reconstruction [4.542616945567623]
We propose a zero-shot self-supervised learning approach to perform subject-specific accelerated DL MRI reconstruction.
The proposed approach partitions the available measurements from a single scan into three disjoint sets.
In the presence of models pre-trained on a database with different image characteristics, we show that the proposed approach can be combined with transfer learning for faster convergence time and reduced computational complexity.
arXiv Detail & Related papers (2021-02-15T18:34:38Z)
- Self-Guided Multiple Instance Learning for Weakly Supervised Disease Classification and Localization in Chest Radiographs [22.473965401043717]
We introduce a novel loss function for training convolutional neural networks that increases the localization confidence.
We show that the supervision provided within the proposed learning scheme leads to better performance and more precise predictions on prevalent datasets for multiple-instance learning.
arXiv Detail & Related papers (2020-09-30T22:19:40Z)
- Self-supervised Video Object Segmentation [76.83567326586162]
The objective of this paper is self-supervised representation learning, with the goal of solving semi-supervised video object segmentation (a.k.a. dense tracking).
We make the following contributions: (i) we propose to improve the existing self-supervised approach, with a simple, yet more effective memory mechanism for long-term correspondence matching; (ii) by augmenting the self-supervised approach with an online adaptation module, our method successfully alleviates tracker drifts caused by spatial-temporal discontinuity; (iv) we demonstrate state-of-the-art results among the self-supervised approaches on DAVIS-2017 and YouTube
arXiv Detail & Related papers (2020-06-22T17:55:59Z)
- LRTD: Long-Range Temporal Dependency based Active Learning for Surgical Workflow Recognition [67.86810761677403]
We propose a novel active learning method for cost-effective surgical video analysis.
Specifically, we propose a non-local recurrent convolutional network (NL-RCNet), which introduces a non-local block to capture the long-range temporal dependency.
We validate our approach on a large surgical video dataset (Cholec80) by performing the surgical workflow recognition task.
arXiv Detail & Related papers (2020-04-21T09:21:22Z)