SoftEnNet: Symbiotic Monocular Depth Estimation and Lumen Segmentation
for Colonoscopy Endorobots
- URL: http://arxiv.org/abs/2301.08157v1
- Date: Thu, 19 Jan 2023 16:22:17 GMT
- Authors: Alwyn Mathew, Ludovic Magerand, Emanuele Trucco and Luigi Manfredi
- Abstract summary: Colorectal cancer is the third most common cause of cancer death worldwide.
A vision-based autonomous endorobot can improve colonoscopy procedures significantly.
- Score: 2.9696400288366127
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Colorectal cancer is the third most common cause of cancer death worldwide.
Optical colonoscopy is the gold standard for detecting colorectal cancer;
however, about 25 percent of polyps are missed during the procedure. A
vision-based autonomous endorobot can improve colonoscopy procedures
significantly through systematic, complete screening of the colonic mucosa. The
reliable robot navigation needed requires a three-dimensional understanding of
the environment and lumen tracking to support autonomous tasks. We propose a
novel multi-task model that simultaneously predicts dense depth and lumen
segmentation with an ensemble of deep networks. The depth estimation
sub-network is trained in a self-supervised fashion guided by view synthesis;
the lumen segmentation sub-network is supervised. The two sub-networks are
interconnected with pathways that enable information exchange and thereby
mutual learning. As the lumen is in the image's deepest visual space, lumen
segmentation helps with the depth estimation at the farthest location. In turn,
the estimated depth guides the lumen segmentation network as the lumen location
defines the farthest scene location. Unlike other environments, view synthesis
often fails in the colon because of the deformable wall, textureless surface,
specularities, and wide field of view image distortions, all challenges that
our pipeline addresses. We conducted qualitative analysis on a synthetic
dataset and quantitative analysis on a colon training model and real
colonoscopy videos. The experiments show that our model predicts accurate
scale-invariant depth maps and lumen segmentation from colonoscopy images in
near real-time.
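The abstract describes two interconnected sub-networks that exchange features so depth and lumen segmentation inform each other. A minimal schematic of that information flow (not the authors' implementation; NumPy stands in for a deep-learning framework, and all layer shapes and weights are illustrative):

```python
# Schematic of the symbiotic information flow: a depth branch and a
# segmentation branch each compute their own features, then each prediction
# head also receives the other branch's features through a learned 1x1
# "pathway". All shapes and weights are illustrative stand-ins.
import numpy as np

rng = np.random.default_rng(0)
C, H, W = 4, 8, 8                      # feature channels, spatial size

def conv1x1(x, w):
    # x: (C_in, H, W), w: (C_out, C_in) -> (C_out, H, W)
    return np.einsum('oc,chw->ohw', w, x)

image = rng.standard_normal((3, H, W))

# Branch-specific encoders (stand-ins for deep sub-networks)
w_depth_enc = rng.standard_normal((C, 3))
w_seg_enc = rng.standard_normal((C, 3))
f_depth = np.maximum(conv1x1(image, w_depth_enc), 0)   # ReLU features
f_seg = np.maximum(conv1x1(image, w_seg_enc), 0)

# Interconnecting pathways: each head sees both branches' features,
# so lumen cues aid far-depth estimation and depth cues localise the lumen
w_seg_to_depth = rng.standard_normal((C, C))
w_depth_to_seg = rng.standard_normal((C, C))
depth_in = np.concatenate([f_depth, conv1x1(f_seg, w_seg_to_depth)])
seg_in = np.concatenate([f_seg, conv1x1(f_depth, w_depth_to_seg)])

# Prediction heads: dense depth map and lumen-mask logits
w_depth_head = rng.standard_normal((1, 2 * C))
w_seg_head = rng.standard_normal((1, 2 * C))
depth = conv1x1(depth_in, w_depth_head)[0]             # (H, W)
lumen_logits = conv1x1(seg_in, w_seg_head)[0]          # (H, W)
print(depth.shape, lumen_logits.shape)                 # (8, 8) (8, 8)
```

In the paper, the depth branch is additionally trained self-supervised via view synthesis and the segmentation branch with ground-truth masks; this sketch only illustrates the cross-branch feature exchange.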
Related papers
- Frontiers in Intelligent Colonoscopy [96.57251132744446]
This study investigates the frontiers of intelligent colonoscopy techniques and their prospective implications for multimodal medical applications.
We assess the current data-centric and model-centric landscapes through four tasks for colonoscopic scene perception.
To embrace the coming multimodal era, we establish three foundational initiatives: a large-scale multimodal instruction tuning dataset ColonINST, a colonoscopy-designed multimodal language model ColonGPT, and a multimodal benchmark.
arXiv Detail & Related papers (2024-10-22T17:57:12Z)
- Structure-preserving Image Translation for Depth Estimation in Colonoscopy Video [1.0485739694839669]
We propose a pipeline of structure-preserving synthetic-to-real (sim2real) image translation.
This allows us to generate large quantities of realistic-looking synthetic images for supervised depth estimation.
We also propose a dataset of hand-picked sequences from clinical colonoscopies to improve the image translation process.
arXiv Detail & Related papers (2024-08-19T17:02:16Z)
- ToDER: Towards Colonoscopy Depth Estimation and Reconstruction with Geometry Constraint Adaptation [67.22294293695255]
We propose a novel reconstruction pipeline with a bi-directional adaptation architecture named ToDER to get precise depth estimations.
Experimental results demonstrate that our approach can precisely predict depth maps in both realistic and synthetic colonoscopy videos.
arXiv Detail & Related papers (2024-07-23T14:24:26Z)
- Real-time guidewire tracking and segmentation in intraoperative x-ray [52.51797358201872]
We propose a two-stage deep learning framework for real-time guidewire segmentation and tracking.
In the first stage, a Yolov5 detector is trained, using the original X-ray images as well as synthetic ones, to output the bounding boxes of possible target guidewires.
In the second stage, a novel and efficient network is proposed to segment the guidewire in each detected bounding box.
arXiv Detail & Related papers (2024-04-12T20:39:19Z)
- CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers [66.15847237150909]
We introduce a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images.
The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism.
We validated our model on a test dataset, consisting of unseen synthetic data and images collected from silicon aorta phantoms.
arXiv Detail & Related papers (2024-03-21T15:13:36Z)
- Multi-task learning with cross-task consistency for improved depth estimation in colonoscopy [0.2995885872626565]
We develop a novel multi-task learning (MTL) approach with a shared encoder and two decoders, namely a surface normal decoder and a depth estimator.
We demonstrate an improvement of 14.17% on relative error and 10.4% on $\delta_1$ accuracy over the most accurate baseline state-of-the-art BTS approach.
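The relative error and $\delta_1$ accuracy cited here are standard monocular-depth evaluation metrics. A minimal sketch of how they are computed (illustrative values, not the paper's code):

```python
# Absolute relative error and delta_1 threshold accuracy: the fraction of
# pixels where max(pred/gt, gt/pred) < 1.25. Example depths are made up.
import numpy as np

def depth_metrics(pred, gt):
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    abs_rel = np.mean(np.abs(pred - gt) / gt)
    ratio = np.maximum(pred / gt, gt / pred)
    delta1 = np.mean(ratio < 1.25)
    return abs_rel, delta1

gt = np.array([1.0, 2.0, 4.0])          # ground-truth depths (illustrative)
pred = np.array([1.1, 2.0, 5.5])        # predicted depths (illustrative)
abs_rel, delta1 = depth_metrics(pred, gt)
print(round(abs_rel, 3), round(delta1, 3))  # 0.158 0.667
```

For scale-invariant evaluation, such as the scale-invariant depth maps in SoftEnNet, predictions are typically rescaled by the median ratio of ground truth to prediction before applying these metrics.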
arXiv Detail & Related papers (2023-11-30T16:13:17Z)
- On the Uncertain Single-View Depths in Endoscopies [12.779570691818753]
Estimating depth from endoscopic images is a pre-requisite for a wide set of AI-assisted technologies.
In this paper, we explore for the first time Bayesian deep networks for single-view depth estimation in colonoscopies.
Our specific contribution is two-fold: 1) an exhaustive analysis of Bayesian deep networks for depth estimation in three different datasets, highlighting challenges and conclusions regarding synthetic-to-real domain changes and supervised vs. self-supervised methods; and 2) a novel teacher-student approach to deep depth learning that takes into account the teacher uncertainty.
arXiv Detail & Related papers (2021-12-16T14:24:17Z)
- Deep Learning-based Biological Anatomical Landmark Detection in Colonoscopy Videos [21.384094148149003]
We propose a novel deep learning-based approach to detect biological anatomical landmarks in colonoscopy videos.
Average detection accuracy reaches 99.75%, while the average IoU of 0.91 shows a high degree of similarity between our predicted landmark periods and ground truth.
arXiv Detail & Related papers (2021-08-06T05:52:32Z)
- Generalize Ultrasound Image Segmentation via Instant and Plug & Play Style Transfer [65.71330448991166]
Deep segmentation models often fail to generalize to images with unknown appearance shifts.
Retraining such models leads to high latency and complex pipelines.
We propose a novel method for robust segmentation under unknown appearance shifts.
arXiv Detail & Related papers (2021-01-11T05:45:30Z)
- OmniSLAM: Omnidirectional Localization and Dense Mapping for Wide-baseline Multi-camera Systems [88.41004332322788]
We present an omnidirectional localization and dense mapping system for a wide-baseline multiview stereo setup with ultra-wide field-of-view (FOV) fisheye cameras.
For more practical and accurate reconstruction, we first introduce improved and light-weighted deep neural networks for the omnidirectional depth estimation.
We integrate our omnidirectional depth estimates into the visual odometry (VO) and add a loop closing module for global consistency.
arXiv Detail & Related papers (2020-03-18T05:52:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.