Multi-task learning with cross-task consistency for improved depth
estimation in colonoscopy
- URL: http://arxiv.org/abs/2311.18664v1
- Date: Thu, 30 Nov 2023 16:13:17 GMT
- Title: Multi-task learning with cross-task consistency for improved depth
estimation in colonoscopy
- Authors: Pedro Esteban Chavarrias Solano, Andrew Bulpitt, Venkataraman
Subramanian, Sharib Ali
- Abstract summary: We develop a novel multi-task learning (MTL) approach with a shared encoder and two decoders, namely a surface normal decoder and a depth estimator.
We demonstrate an improvement of 14.17% on relative error and 10.4% on $\delta_{1}$ accuracy over the most accurate baseline state-of-the-art BTS approach.
- Score: 0.2995885872626565
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Colonoscopy screening is the gold standard procedure for assessing
abnormalities in the colon and rectum, such as ulcers and cancerous polyps.
Measuring the abnormal mucosal area and its 3D reconstruction can help quantify
the surveyed area and objectively evaluate disease burden. However, due to the
complex topology of these organs and variable physical conditions, for example,
lighting, large homogeneous texture, and image modality, estimating distance
from the camera (a.k.a. depth) is highly challenging. Moreover, most colonoscopic
video acquisition is monocular, making the depth estimation a non-trivial
problem. While methods in computer vision for depth estimation have been
proposed and advanced on natural scene datasets, the efficacy of these
techniques has not been widely quantified on colonoscopy datasets. As the
colonic mucosa has several low-texture regions that are not well pronounced,
learning representations from an auxiliary task can improve salient feature
extraction, allowing estimation of accurate camera depths. In this work, we
propose to develop a novel multi-task learning (MTL) approach with a shared
encoder and two decoders, namely a surface normal decoder and a depth estimator
decoder. Our depth estimator incorporates attention mechanisms to enhance
global context awareness. We leverage the surface normal prediction to improve
geometric feature extraction. Also, we apply a cross-task consistency loss
among the two geometrically related tasks, surface normal and camera depth. We
demonstrate an improvement of 14.17% in relative error and 10.4% in
$\delta_{1}$ accuracy over the most accurate baseline state-of-the-art BTS
approach. All experiments are conducted on a recently released C3VD dataset;
thus, we provide a first benchmark of state-of-the-art methods.
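The cross-task consistency idea above couples the two decoder outputs through their geometric relationship: surface normals can be derived from a depth map, so the normals implied by the predicted depth should agree with the normals the normal decoder predicts. The sketch below illustrates that coupling in plain Python with toy 2D lists; the function names, the forward finite-difference normal derivation, and the cosine-based penalty are illustrative assumptions, not the paper's exact formulation.

```python
import math

def normals_from_depth(depth):
    """Approximate per-pixel surface normals from a depth map (2D list of
    floats) using forward finite differences. A simplified stand-in for the
    depth-to-normal relation that a cross-task consistency loss relies on."""
    h, w = len(depth), len(depth[0])
    normals = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dzdx = depth[y][min(x + 1, w - 1)] - depth[y][x]
            dzdy = depth[min(y + 1, h - 1)][x] - depth[y][x]
            # A normal of the surface z = depth(x, y) is (-dz/dx, -dz/dy, 1),
            # then normalized to unit length.
            n = (-dzdx, -dzdy, 1.0)
            norm = math.sqrt(sum(c * c for c in n))
            normals[y][x] = tuple(c / norm for c in n)
    return normals

def cross_task_consistency_loss(pred_depth, pred_normals):
    """Mean (1 - cosine similarity) between normals derived from the
    predicted depth and normals predicted by the normal decoder."""
    derived = normals_from_depth(pred_depth)
    total, count = 0.0, 0
    for row_d, row_p in zip(derived, pred_normals):
        for nd, np_ in zip(row_d, row_p):
            cos = sum(a * b for a, b in zip(nd, np_))
            total += 1.0 - cos
            count += 1
    return total / count

# Toy check: for a planar depth ramp, normals derived from the depth agree
# with themselves, so the consistency penalty is effectively zero, while a
# mismatched flat-normal prediction is penalized.
depth = [[0.1 * x for x in range(4)] for _ in range(4)]
self_consistent = cross_task_consistency_loss(depth, normals_from_depth(depth))
flat_normals = [[(0.0, 0.0, 1.0)] * 4 for _ in range(4)]
mismatched = cross_task_consistency_loss(depth, flat_normals)
```

In the paper's actual setting this term would be computed on decoder tensors and added to the supervised depth and normal losses; the point of the sketch is only the geometric tie between the two tasks.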
Related papers
- ToDER: Towards Colonoscopy Depth Estimation and Reconstruction with Geometry Constraint Adaptation [67.22294293695255]
We propose a novel reconstruction pipeline with a bi-directional adaptation architecture named ToDER to get precise depth estimations.
Experimental results demonstrate that our approach can precisely predict depth maps in both realistic and synthetic colonoscopy videos.
arXiv Detail & Related papers (2024-07-23T14:24:26Z)
- High-fidelity Endoscopic Image Synthesis by Utilizing Depth-guided Neural Surfaces [18.948630080040576]
We introduce a novel method for colon section reconstruction by leveraging NeuS applied to endoscopic images, supplemented by a single frame of depth map.
Our approach demonstrates exceptional accuracy in completely rendering colon sections, even capturing unseen portions of the surface.
This breakthrough opens avenues for achieving stable and consistently scaled reconstructions, promising enhanced quality in cancer screening procedures and treatment interventions.
arXiv Detail & Related papers (2024-04-20T18:06:26Z)
- Unsupervised deep learning techniques for powdery mildew recognition based on multispectral imaging [63.62764375279861]
This paper presents a deep learning approach to automatically recognize powdery mildew on cucumber leaves.
We focus on unsupervised deep learning techniques applied to multispectral imaging data.
We propose the use of autoencoder architectures to investigate two strategies for disease detection.
arXiv Detail & Related papers (2021-12-20T13:29:13Z)
- On the Uncertain Single-View Depths in Endoscopies [12.779570691818753]
Estimating depth from endoscopic images is a pre-requisite for a wide set of AI-assisted technologies.
In this paper, we explore for the first time Bayesian deep networks for single-view depth estimation in colonoscopies.
Our specific contribution is two-fold: 1) an exhaustive analysis of Bayesian deep networks for depth estimation in three different datasets, highlighting challenges and conclusions regarding synthetic-to-real domain changes and supervised vs. self-supervised methods; and 2) a novel teacher-student approach to deep depth learning that takes into account the teacher uncertainty.
arXiv Detail & Related papers (2021-12-16T14:24:17Z)
- ColDE: A Depth Estimation Framework for Colonoscopy Reconstruction [27.793186578742088]
In this work we have designed a set of training losses to deal with the special challenges of colonoscopy data.
With sufficiently powerful training losses, our self-supervised framework named ColDE is able to produce better depth maps of colonoscopy data.
arXiv Detail & Related papers (2021-11-19T04:44:27Z)
- Probabilistic and Geometric Depth: Detecting Objects in Perspective [78.00922683083776]
3D object detection is an important capability needed in various practical applications such as driver assistance systems.
Monocular 3D detection, as an economical solution compared to conventional settings relying on binocular vision or LiDAR, has drawn increasing attention recently but still yields unsatisfactory results.
This paper first presents a systematic study on this problem and observes that the current monocular 3D detection problem can be simplified as an instance depth estimation problem.
arXiv Detail & Related papers (2021-07-29T16:30:33Z)
- Self-Supervised Generative Adversarial Network for Depth Estimation in Laparoscopic Images [13.996932179049978]
We propose SADepth, a new self-supervised depth estimation method based on Generative Adversarial Networks.
It consists of an encoder-decoder generator and a discriminator to incorporate geometry constraints during training.
Experiments on two public datasets show that SADepth outperforms recent state-of-the-art unsupervised methods by a large margin.
arXiv Detail & Related papers (2021-07-09T19:40:20Z)
- Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust Depth Prediction [87.08227378010874]
We show the importance of the high-order 3D geometric constraints for depth prediction.
By designing a loss term that enforces a simple geometric constraint, we significantly improve the accuracy and robustness of monocular depth estimation.
We show state-of-the-art results of learning metric depth on NYU Depth-V2 and KITTI.
arXiv Detail & Related papers (2021-03-07T00:08:21Z)
- Robust Consistent Video Depth Estimation [65.53308117778361]
We present an algorithm for estimating consistent dense depth maps and camera poses from a monocular video.
Our algorithm combines two complementary techniques: (1) flexible deformation-splines for low-frequency large-scale alignment and (2) geometry-aware depth filtering for high-frequency alignment of fine depth details.
In contrast to prior approaches, our method does not require camera poses as input and achieves robust reconstruction for challenging hand-held cell phone captures containing a significant amount of noise, shake, motion blur, and rolling shutter deformations.
arXiv Detail & Related papers (2020-12-10T18:59:48Z)
- Occlusion-Aware Depth Estimation with Adaptive Normal Constraints [85.44842683936471]
We present a new learning-based method for multi-frame depth estimation from a color video.
Our method outperforms the state-of-the-art in terms of depth estimation accuracy.
arXiv Detail & Related papers (2020-04-02T07:10:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.