Segmentation of kidney stones in endoscopic video feeds
- URL: http://arxiv.org/abs/2204.14175v1
- Date: Fri, 29 Apr 2022 16:00:52 GMT
- Title: Segmentation of kidney stones in endoscopic video feeds
- Authors: Zachary A Stoebner, Daiwei Lu, Seok Hee Hong, Nicholas L Kavoussi,
Ipek Oguz
- Abstract summary: We describe how we built a dataset from the raw videos and how we developed a pipeline to automate as much of the process as possible.
To show clinical potential for real-time use, we also confirmed that our best trained model can accurately annotate new videos at 30 frames per second.
- Score: 2.572404739180802
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image segmentation has been increasingly applied in medical settings as
recent developments have skyrocketed the potential applications of deep
learning. Urology, specifically, is one field of medicine that is primed for
the adoption of a real-time image segmentation system with the long-term aim of
automating endoscopic stone treatment. In this project, we explored supervised
deep learning models to annotate kidney stones in surgical endoscopic video
feeds. In this paper, we describe how we built a dataset from the raw videos
and how we developed a pipeline to automate as much of the process as possible.
For the segmentation task, we adapted and analyzed three baseline deep learning
models -- U-Net, U-Net++, and DenseNet -- to predict annotations on the frames
of the endoscopic videos with the highest accuracy above 90%. To show clinical
potential for real-time use, we also confirmed that our best trained model can
accurately annotate new videos at 30 frames per second. Our results demonstrate
that the proposed method justifies continued development and study of image
segmentation to annotate ureteroscopic video feeds.
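As a concrete illustration of the real-time use case described in the abstract, the sketch below runs a generic segmentation model frame by frame over a video clip and measures throughput in frames per second. This is a minimal Python/PyTorch example, not the authors' pipeline; the model file "stone_segmenter.pt", the clip "ureteroscopy_clip.mp4", and the 256x256 input size are illustrative assumptions.
```python
# Minimal sketch (not the authors' code): segment an endoscopic video frame by
# frame with a saved TorchScript model and report throughput in fps.
# Assumes the model maps a (1, 3, H, W) float tensor in [0, 1] to (1, 1, H, W) logits.
import time

import cv2
import numpy as np
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.jit.load("stone_segmenter.pt", map_location=device).eval()

cap = cv2.VideoCapture("ureteroscopy_clip.mp4")
n_frames, start = 0, time.time()
with torch.no_grad():
    while True:
        ok, frame_bgr = cap.read()
        if not ok:
            break
        # BGR uint8 -> RGB float tensor, resized to the assumed network input size.
        frame = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
        frame = cv2.resize(frame, (256, 256)).astype(np.float32) / 255.0
        x = torch.from_numpy(frame).permute(2, 0, 1).unsqueeze(0).to(device)
        # Threshold the logits to get a binary stone mask (could be overlaid or saved here).
        mask = (torch.sigmoid(model(x)) > 0.5).squeeze().cpu().numpy()
        n_frames += 1
cap.release()

elapsed = time.time() - start
print(f"processed {n_frames} frames at {n_frames / elapsed:.1f} fps")
```
A throughput of roughly 30 fps or more under this kind of per-frame loop is what would support the real-time claim made above.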
Related papers
- Vision-Based Neurosurgical Guidance: Unsupervised Localization and Camera-Pose Prediction [41.91807060434709]
Localizing oneself during endoscopic procedures can be problematic due to the lack of distinguishable textures and landmarks.
We present a deep learning method based on anatomy recognition that constructs a surgical path in an unsupervised manner from surgical videos.
arXiv Detail & Related papers (2024-05-15T14:09:11Z)
- Endora: Video Generation Models as Endoscopy Simulators [53.72175969751398]
This paper introduces Endora, an innovative approach to generating medical videos that simulate clinical endoscopy scenes.
We also pioneer the first public benchmark for endoscopy simulation with video generation models.
Endora marks a notable breakthrough in the deployment of generative AI for clinical endoscopy research.
arXiv Detail & Related papers (2024-03-17T00:51:59Z)
- AutoLaparo: A New Dataset of Integrated Multi-tasks for Image-guided Surgical Automation in Laparoscopic Hysterectomy [42.20922574566824]
We present and release the first integrated dataset with multiple image-based perception tasks to facilitate learning-based automation in hysterectomy surgery.
Our AutoLaparo dataset is developed based on full-length videos of entire hysterectomy procedures.
Specifically, three different yet highly correlated tasks are formulated in the dataset, including surgical workflow recognition, laparoscope motion prediction, and instrument and key anatomy segmentation.
arXiv Detail & Related papers (2022-08-03T13:17:23Z)
- Cascaded Robust Learning at Imperfect Labels for Chest X-ray Segmentation [61.09321488002978]
We present a novel cascaded robust learning framework for chest X-ray segmentation with imperfect annotation.
Our model consists of three independent networks that can effectively learn useful information from the peer networks.
Our method achieves a significant improvement in segmentation accuracy compared to previous methods.
arXiv Detail & Related papers (2021-04-05T15:50:16Z)
- Inter-slice Context Residual Learning for 3D Medical Image Segmentation [38.43650000401734]
We propose the 3D context residual network (ConResNet) for the accurate segmentation of 3D medical images.
This model consists of an encoder, a segmentation decoder, and a context residual decoder.
We show that the proposed ConResNet is more accurate than six top-ranking methods in brain tumor segmentation and seven top-ranking methods in pancreas segmentation.
arXiv Detail & Related papers (2020-11-28T16:03:39Z)
- Multi-frame Feature Aggregation for Real-time Instrument Segmentation in Endoscopic Video [11.100734994959419]
We propose a novel Multi-frame Feature Aggregation (MFFA) module to aggregate video frame features temporally and spatially.
We also develop a method that can randomly synthesize a surgical frame sequence from a single labeled frame to assist network training.
arXiv Detail & Related papers (2020-11-17T16:27:27Z)
- Weakly-supervised Learning For Catheter Segmentation in 3D Frustum Ultrasound [74.22397862400177]
We propose a novel frustum ultrasound-based catheter segmentation method.
The proposed method achieves state-of-the-art performance with an efficiency of 0.25 seconds per volume.
arXiv Detail & Related papers (2020-10-19T13:56:22Z)
- Towards Unsupervised Learning for Instrument Segmentation in Robotic Surgery with Cycle-Consistent Adversarial Networks [54.00217496410142]
We propose an unpaired image-to-image translation where the goal is to learn the mapping between an input endoscopic image and a corresponding annotation.
Our approach allows training image segmentation models without the need to acquire expensive annotations.
We test our proposed method on the Endovis 2017 challenge dataset and show that it is competitive with supervised segmentation methods.
arXiv Detail & Related papers (2020-07-09T01:39:39Z)
- LRTD: Long-Range Temporal Dependency based Active Learning for Surgical Workflow Recognition [67.86810761677403]
We propose a novel active learning method for cost-effective surgical video analysis.
Specifically, we propose a non-local recurrent convolutional network (NL-RCNet), which introduces a non-local block to capture long-range temporal dependencies.
We validate our approach on a large surgical video dataset (Cholec80) by performing surgical workflow recognition task.
arXiv Detail & Related papers (2020-04-21T09:21:22Z)
- Self-supervised Representation Learning for Ultrasound Video [18.515314344284445]
We propose a self-supervised learning approach to learn meaningful and transferable representations from medical imaging video.
We force the model to address anatomy-aware tasks with free supervision from the data itself.
Experiments on fetal ultrasound video show that the proposed approach can effectively learn meaningful and strong representations.
arXiv Detail & Related papers (2020-02-28T23:00:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.