Deep Homography Estimation in Dynamic Surgical Scenes for Laparoscopic
Camera Motion Extraction
- URL: http://arxiv.org/abs/2109.15098v1
- Date: Thu, 30 Sep 2021 13:05:37 GMT
- Title: Deep Homography Estimation in Dynamic Surgical Scenes for Laparoscopic
Camera Motion Extraction
- Authors: Martin Huber, Sébastien Ourselin, Christos Bergeles, Tom Vercauteren
- Abstract summary: We introduce a method that extracts a laparoscope holder's actions from videos of laparoscopic interventions.
We synthetically add camera motion to a newly acquired dataset of camera-motion-free da Vinci surgery image sequences.
We find our method transfers from our camera-motion-free da Vinci surgery dataset to videos of laparoscopic interventions, outperforming classical homography estimation approaches in both precision (by 41%) and CPU runtime (by 43%).
- Score: 6.56651216023737
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current laparoscopic camera motion automation relies on rule-based approaches
or only focuses on surgical tools. Imitation Learning (IL) methods could
alleviate these shortcomings, but have so far been applied to oversimplified
setups. In this work, we instead introduce a method that extracts a
laparoscope holder's actions directly from videos of laparoscopic
interventions. We synthetically add camera motion to a newly acquired dataset
of camera-motion-free da Vinci surgery image sequences through a novel
homography generation algorithm.
The synthetic camera motion serves as a supervisory signal for camera motion
estimation that is invariant to object and tool motion. We perform an extensive
evaluation of state-of-the-art (SOTA) Deep Neural Networks (DNNs) across
multiple compute regimes, finding that our method transfers from our
camera-motion-free da Vinci surgery dataset to videos of laparoscopic
interventions, outperforming classical homography estimation approaches in
both precision (by 41%) and CPU runtime (by 43%).
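The paper's exact homography generation algorithm is not given here; as an illustration only, a common way to synthesize camera-motion supervision of this kind is the 4-point parameterization: perturb the four image corners by bounded random offsets and recover the homography that realizes the shift via a direct linear transform (DLT). The function names below are my own, and the sketch assumes only NumPy.

```python
import numpy as np

def homography_from_points(src, dst):
    # Direct Linear Transform with h33 fixed to 1:
    # each point correspondence (x, y) -> (u, v) yields two linear
    # equations in the remaining 8 entries of H.
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def random_synthetic_homography(height, width, max_shift=32, rng=None):
    # Perturb the four image corners by random offsets bounded by
    # max_shift pixels (4-point parameterization), then solve for the
    # homography mapping original corners to perturbed ones.
    rng = rng if rng is not None else np.random.default_rng()
    corners = np.array(
        [[0, 0], [width - 1, 0], [width - 1, height - 1], [0, height - 1]],
        dtype=float,
    )
    shifted = corners + rng.uniform(-max_shift, max_shift, size=(4, 2))
    return homography_from_points(corners, shifted), corners, shifted
```

Warping an image with the sampled homography, while object and tool motion in the underlying footage stays untouched, is what lets the synthetic motion serve as a supervisory signal that is invariant to scene dynamics.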
Related papers
- VISAGE: Video Synthesis using Action Graphs for Surgery [34.21344214645662]
We introduce the novel task of future video generation in laparoscopic surgery.
Our proposed method, VISAGE, leverages the power of action scene graphs to capture the sequential nature of laparoscopic procedures.
Results of our experiments demonstrate high-fidelity video generation for laparoscopy procedures.
arXiv Detail & Related papers (2024-10-23T10:28:17Z) - FLex: Joint Pose and Dynamic Radiance Fields Optimization for Stereo Endoscopic Videos [79.50191812646125]
Reconstruction of endoscopic scenes is an important asset for various medical applications, from post-surgery analysis to educational training.
We address the challenging setup of a moving endoscope within a highly dynamic environment of deforming tissue.
We propose an implicit scene separation into multiple overlapping 4D neural radiance fields (NeRFs) and a progressive optimization scheme jointly optimizing for reconstruction and camera poses from scratch.
This improves ease of use and allows reconstruction capabilities to scale in time, processing surgical videos of 5,000 frames and more; an improvement of more than ten times over the state of the art while being agnostic to external tracking information.
arXiv Detail & Related papers (2024-03-18T19:13:02Z) - AiAReSeg: Catheter Detection and Segmentation in Interventional
Ultrasound using Transformers [75.20925220246689]
Endovascular surgeries are performed under the gold standard of fluoroscopy, which uses ionising radiation to visualise catheters and vasculature.
This work proposes a solution using an adaptation of a state-of-the-art machine learning transformer architecture to detect and segment catheters in axial interventional Ultrasound image sequences.
arXiv Detail & Related papers (2023-09-25T19:34:12Z) - Next-generation Surgical Navigation: Marker-less Multi-view 6DoF Pose
Estimation of Surgical Instruments [66.74633676595889]
First, we present a multi-camera capture setup consisting of static and head-mounted cameras.
Second, we publish a multi-view RGB-D video dataset of ex-vivo spine surgeries, captured in a surgical wet lab and a real operating theatre.
Third, we evaluate three state-of-the-art single-view and multi-view methods for the task of 6DoF pose estimation of surgical instruments.
arXiv Detail & Related papers (2023-05-05T13:42:19Z) - Learning How To Robustly Estimate Camera Pose in Endoscopic Videos [5.073761189475753]
We propose a solution for stereo endoscopes that estimates depth and optical flow to minimize two geometric losses for camera pose estimation.
Most importantly, we introduce two learned adaptive per-pixel weight mappings that balance contributions according to the input image content.
We validate our approach on the publicly available SCARED dataset and introduce a new in-vivo dataset, StereoMIS.
arXiv Detail & Related papers (2023-04-17T07:05:01Z) - AutoLaparo: A New Dataset of Integrated Multi-tasks for Image-guided
Surgical Automation in Laparoscopic Hysterectomy [42.20922574566824]
We present and release the first integrated dataset with multiple image-based perception tasks to facilitate learning-based automation in hysterectomy surgery.
Our AutoLaparo dataset is developed based on full-length videos of entire hysterectomy procedures.
Specifically, three different yet highly correlated tasks are formulated in the dataset, including surgical workflow recognition, laparoscope motion prediction, and instrument and key anatomy segmentation.
arXiv Detail & Related papers (2022-08-03T13:17:23Z) - Predicting the Timing of Camera Movements From the Kinematics of
Instruments in Robotic-Assisted Surgery Using Artificial Neural Networks [1.0965065178451106]
We propose a predictive approach for anticipating when camera movements will occur using artificial neural networks.
We used the kinematic data of the surgical instruments, which were recorded during robotic-assisted surgical training on porcine models.
We found that the instruments' kinematic data can be used to predict when camera movements will occur, and evaluated the performance on different segment durations and ensemble sizes.
arXiv Detail & Related papers (2021-09-23T07:57:27Z) - E-DSSR: Efficient Dynamic Surgical Scene Reconstruction with
Transformer-based Stereoscopic Depth Perception [15.927060244702686]
We present an efficient reconstruction pipeline for highly dynamic surgical scenes that runs at 28 fps.
Specifically, we design a transformer-based stereoscopic depth perception for efficient depth estimation.
We evaluate the proposed pipeline on two datasets, the public Hamlyn Centre Endoscopic Video dataset and our in-house DaVinci robotic surgery dataset.
arXiv Detail & Related papers (2021-07-01T05:57:41Z) - One-shot action recognition towards novel assistive therapies [63.23654147345168]
This work is motivated by the automated analysis of medical therapies that involve action imitation games.
The presented approach incorporates a pre-processing step that standardizes heterogeneous motion data conditions.
We evaluate the approach on a real use-case of automated video analysis for therapy support with autistic people.
arXiv Detail & Related papers (2021-02-17T19:41:37Z) - Learning Motion Flows for Semi-supervised Instrument Segmentation from
Robotic Surgical Video [64.44583693846751]
We study the semi-supervised instrument segmentation from robotic surgical videos with sparse annotations.
By exploiting generated data pairs, our framework can recover and even enhance temporal consistency of training sequences.
Results show that our method outperforms state-of-the-art semi-supervised methods by a large margin.
arXiv Detail & Related papers (2020-07-06T02:39:32Z) - A Deep Learning Approach for Motion Forecasting Using 4D OCT Data [69.62333053044712]
We propose 4D-temporal deep learning for end-to-end motion forecasting and estimation using a stream of OCT volumes.
Our best performing 4D method achieves motion forecasting with an overall average correlation of 97.41%, while also improving motion estimation performance by a factor of 2.5 compared to a previous 3D approach.
arXiv Detail & Related papers (2020-04-21T15:59:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.