Leveraging Near-Field Lighting for Monocular Depth Estimation from Endoscopy Videos
- URL: http://arxiv.org/abs/2403.17915v4
- Date: Tue, 20 Aug 2024 18:17:30 GMT
- Title: Leveraging Near-Field Lighting for Monocular Depth Estimation from Endoscopy Videos
- Authors: Akshay Paruchuri, Samuel Ehrenstein, Shuxian Wang, Inbar Fried, Stephen M. Pizer, Marc Niethammer, Roni Sengupta
- Abstract summary: Monocular depth estimation in endoscopy videos can enable assistive and robotic surgery to obtain better coverage of the organ and detection of various health issues.
Despite promising progress on mainstream natural-image depth estimation, these techniques perform poorly on endoscopy images.
In this paper, we utilize the photometric cues, i.e., the light emitted from an endoscope and reflected by the surface, to improve monocular depth estimation.
- Score: 12.497782583094281
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Monocular depth estimation in endoscopy videos can enable assistive and robotic surgery to obtain better coverage of the organ and detection of various health issues. Despite promising progress on mainstream, natural image depth estimation, techniques perform poorly on endoscopy images due to a lack of strong geometric features and challenging illumination effects. In this paper, we utilize the photometric cues, i.e., the light emitted from an endoscope and reflected by the surface, to improve monocular depth estimation. We first create two novel loss functions with supervised and self-supervised variants that utilize a per-pixel shading representation. We then propose a novel depth refinement network (PPSNet) that leverages the same per-pixel shading representation. Finally, we introduce teacher-student transfer learning to produce better depth maps from both synthetic data with supervision and clinical data with self-supervision. We achieve state-of-the-art results on the C3VD dataset while estimating high-quality depth maps from clinical data. Our code, pre-trained models, and supplementary materials can be found on our project page: https://ppsnet.github.io/
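The abstract does not spell out the per-pixel shading representation; the following is a minimal sketch of one natural reading, assuming a point light co-located with the camera, Lambertian reflectance, and inverse-square falloff. The function name and parameters (pinhole intrinsics fx, fy, cx, cy) are illustrative, not the paper's exact formulation:

```python
import numpy as np

def per_pixel_shading(depth, fx, fy, cx, cy, light_intensity=1.0):
    """Per-pixel shading for a point light co-located with the camera.

    Assumes Lambertian reflectance and inverse-square falloff; a simplified
    stand-in for the paper's PPS representation.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Back-project each pixel to a 3D point with the pinhole model.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    points = np.stack([x, y, depth], axis=-1)            # (h, w, 3)

    # Surface normals from finite differences of the point map,
    # oriented toward the camera (at the origin, looking down +z).
    dp_du = np.gradient(points, axis=1)
    dp_dv = np.gradient(points, axis=0)
    normals = np.cross(dp_dv, dp_du)
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True) + 1e-8

    # Direction and distance from each surface point to the light (origin).
    dist = np.linalg.norm(points, axis=-1)               # (h, w)
    to_light = -points / (dist[..., None] + 1e-8)

    # Inverse-square falloff times the Lambertian cosine term.
    cos_term = np.clip(np.sum(normals * to_light, axis=-1), 0.0, None)
    return light_intensity * cos_term / (dist ** 2 + 1e-8)
```

Under these assumptions, brighter shading implies nearer, more camera-facing surfaces; this is the photometric cue the abstract describes feeding into the supervised/self-supervised losses and the PPSNet refinement network.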
Related papers
- Uncertainty and Self-Supervision in Single-View Depth [0.8158530638728501]
Single-view depth estimation is an ill-posed problem because multiple 3D geometries can explain the same single view.
Deep neural networks have been shown to be effective at capturing depth from a single view, but the majority of current methodologies are deterministic in nature.
We address this problem by quantifying the uncertainty of supervised single-view depth with Bayesian deep neural networks.
arXiv Detail & Related papers (2024-06-20T11:46:17Z)
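The entry above describes the Bayesian treatment only at a high level; below is a minimal Monte Carlo dropout sketch, one common way to quantify predictive uncertainty in a depth network. `model` is assumed to be any depth network containing dropout layers; the paper's exact Bayesian formulation may differ:

```python
import torch

def mc_dropout_depth(model, image, n_samples=20):
    """Predictive mean and per-pixel uncertainty via Monte Carlo dropout."""
    model.train()  # keep dropout stochastic at test time
                   # (in practice, only dropout layers should stay in train mode)
    with torch.no_grad():
        samples = torch.stack([model(image) for _ in range(n_samples)])
    # Mean depth and per-pixel variance across the stochastic forward passes.
    return samples.mean(0), samples.var(0)
```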
- Robust Depth Enhancement via Polarization Prompt Fusion Tuning [112.88371907047396]
We present a framework that leverages polarization imaging to improve inaccurate depth measurements from various depth sensors.
Our method first adopts a learning-based strategy in which a neural network is trained to estimate a dense, complete depth map from polarization data and the sensor's depth map.
To further improve the performance, we propose a Polarization Prompt Fusion Tuning (PPFT) strategy to effectively utilize RGB-based models pre-trained on large-scale datasets.
arXiv Detail & Related papers (2024-04-05T17:55:33Z)
- Multi-task learning with cross-task consistency for improved depth estimation in colonoscopy [0.2995885872626565]
We develop a novel multi-task learning (MTL) approach with a shared encoder and two decoders, namely a surface normal decoder and a depth estimator.
We demonstrate an improvement of 14.17% in relative error and 10.4% in $\delta_1$ accuracy over the most accurate state-of-the-art baseline, BTS.
arXiv Detail & Related papers (2023-11-30T16:13:17Z)
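The shared encoder with depth and surface-normal decoders described in the entry above can be sketched minimally as follows; layer sizes are illustrative, and the paper's backbone, skip connections, and cross-task consistency loss are omitted:

```python
import torch.nn as nn

class SharedEncoderMTL(nn.Module):
    """Shared encoder feeding a depth decoder and a surface-normal decoder."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        def decoder(out_ch):
            return nn.Sequential(
                nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(32, out_ch, 4, stride=2, padding=1),
            )
        self.depth_head = decoder(1)    # per-pixel depth
        self.normal_head = decoder(3)   # per-pixel surface normal

    def forward(self, x):
        feat = self.encoder(x)          # features shared across both tasks
        return self.depth_head(feat), self.normal_head(feat)
```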
- LightNeuS: Neural Surface Reconstruction in Endoscopy using Illumination Decline [45.49984459497878]
We propose a new approach to 3D reconstruction from sequences of images acquired by monocular endoscopes.
It is based on two key insights. First, endoluminal cavities are watertight, a property naturally enforced by modeling them in terms of a signed distance function.
Second, the scene illumination is variable. It comes from the endoscope's light sources and decays with the inverse of the squared distance to the surface.
arXiv Detail & Related papers (2023-09-06T06:41:40Z)
- A Novel Hybrid Endoscopic Dataset for Evaluating Machine Learning-based Photometric Image Enhancement Models [0.9236074230806579]
This work introduces a new synthetic dataset generated with generative adversarial techniques.
It also explores both shallow and deep learning-based image enhancement methods under overexposed and underexposed lighting conditions.
arXiv Detail & Related papers (2022-07-06T01:47:17Z)
- SelfTune: Metrically Scaled Monocular Depth Estimation through Self-Supervised Learning [53.78813049373321]
We propose a self-supervised learning method that adapts pre-trained supervised monocular depth networks to produce metrically scaled depth estimates.
Our approach is useful for various applications such as mobile robot navigation and is applicable to diverse environments.
arXiv Detail & Related papers (2022-03-10T12:28:42Z)
- Adversarial Domain Feature Adaptation for Bronchoscopic Depth Estimation [111.89519571205778]
In this work, we propose an alternative domain-adaptive approach to depth estimation.
Our novel two-step structure first trains a depth estimation network on labeled synthetic images in a supervised manner, then adversarially adapts its features to unlabeled real images.
The results of our experiments show that the proposed method improves the network's performance on real images by a considerable margin.
arXiv Detail & Related papers (2021-09-24T08:11:34Z)
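A minimal sketch of the adversarial adaptation step implied by the entry above: a domain discriminator learns to separate synthetic-image features from real-image features while the feature extractor learns to fool it. All module names and shapes are illustrative, not the paper's architecture:

```python
import torch
import torch.nn as nn

feature_net = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                            nn.AdaptiveAvgPool2d(8), nn.Flatten())
depth_head = nn.Linear(32 * 64, 1)        # trained in step 1 (omitted here)
domain_disc = nn.Sequential(nn.Linear(32 * 64, 64), nn.ReLU(),
                            nn.Linear(64, 1))  # synthetic vs. real

bce = nn.BCEWithLogitsLoss()
opt_d = torch.optim.Adam(domain_disc.parameters(), lr=1e-4)
opt_f = torch.optim.Adam(feature_net.parameters(), lr=1e-4)

def adaptation_step(synth_img, real_img):
    # Step 1 (supervised depth training on synthetic data) is assumed done.
    f_s, f_r = feature_net(synth_img), feature_net(real_img)
    # Train the discriminator to tell the two domains apart...
    d_loss = bce(domain_disc(f_s.detach()), torch.ones(len(f_s), 1)) + \
             bce(domain_disc(f_r.detach()), torch.zeros(len(f_r), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # ...and the feature extractor to fool it on real images (align domains).
    g_loss = bce(domain_disc(feature_net(real_img)),
                 torch.ones(len(real_img), 1))
    opt_f.zero_grad(); g_loss.backward(); opt_f.step()
```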
- Self-Supervised Generative Adversarial Network for Depth Estimation in Laparoscopic Images [13.996932179049978]
We propose SADepth, a new self-supervised depth estimation method based on Generative Adversarial Networks.
It consists of an encoder-decoder generator and a discriminator to incorporate geometry constraints during training.
Experiments on two public datasets show that SADepth outperforms recent state-of-the-art unsupervised methods by a large margin.
arXiv Detail & Related papers (2021-07-09T19:40:20Z)
- CAMERAS: Enhanced Resolution And Sanity preserving Class Activation Mapping for image saliency [61.40511574314069]
Backpropagation image saliency aims at explaining model predictions by estimating model-centric importance of individual pixels in the input.
We propose CAMERAS, a technique to compute high-fidelity backpropagation saliency maps without requiring any external priors.
arXiv Detail & Related papers (2021-06-20T08:20:56Z)
- Calibrating Self-supervised Monocular Depth Estimation [77.77696851397539]
In recent years, many methods have demonstrated the ability of neural networks to learn depth and pose changes in a sequence of images, using only self-supervision as the training signal.
We show that by incorporating prior information about the camera configuration and the environment, we can remove the scale ambiguity and predict depth directly, still using the self-supervised formulation and without relying on any additional sensors.
arXiv Detail & Related papers (2020-09-16T14:35:45Z)
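One concrete instance of the calibration prior described above is a known camera height above a flat ground plane; the sketch below recovers a metric scale from it. This specific prior and all names are assumptions for illustration and may differ from the paper's formulation:

```python
import numpy as np

def metric_scale_from_camera_height(pred_depth, ground_mask, cam_height_m,
                                    fx, fy, cx, cy):
    """Rescale a relative depth map using a known camera height (meters)."""
    h, w = pred_depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Back-project assumed-ground pixels into 3D (arbitrary, un-scaled units).
    z = pred_depth[ground_mask]
    x = ((u - cx) / fx)[ground_mask] * z
    y = ((v - cy) / fy)[ground_mask] * z
    pts = np.stack([x, y, z], axis=-1)

    # Fit a plane to the ground points via SVD; the last right-singular
    # vector of the centered points is the plane normal.
    centroid = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - centroid)
    n = vt[-1]
    est_height = abs(np.dot(n, centroid))  # camera (origin) to plane distance

    scale = cam_height_m / est_height      # metric units per predicted unit
    return scale * pred_depth
```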
- Depth Completion Using a View-constrained Deep Prior [73.21559000917554]
Recent work has shown that the structure of convolutional neural networks (CNNs) induces a strong prior that favors natural images.
This prior, known as a deep image prior (DIP), is an effective regularizer in inverse problems such as image denoising and inpainting.
We extend the concept of the DIP to depth images. Given color images and noisy, incomplete target depth maps, we reconstruct a restored depth map by using the CNN structure itself as the prior.
arXiv Detail & Related papers (2020-01-21T21:56:01Z)
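The deep-image-prior recipe in the last entry can be sketched in a few lines: a randomly initialized CNN with a fixed noise input is fitted only to the observed depth values, and the network's structural bias fills in the rest. This toy stand-in omits the paper's color images and view constraints:

```python
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 1, 3, padding=1),
)
noise = torch.randn(1, 32, 128, 160)             # fixed random input
target = torch.rand(1, 1, 128, 160)              # noisy/incomplete depth (stand-in)
mask = (torch.rand_like(target) > 0.5).float()   # observed-pixel mask

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for step in range(2000):
    pred = net(noise)
    # Fit only the observed depths; the CNN structure regularizes the rest.
    loss = ((pred - target) ** 2 * mask).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

restored = net(noise).detach()                   # completed depth map
```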
This list is automatically generated from the titles and abstracts of the papers on this site.