EndoDDC: Learning Sparse to Dense Reconstruction for Endoscopic Robotic Navigation via Diffusion Depth Completion
- URL: http://arxiv.org/abs/2602.21893v2
- Date: Thu, 26 Feb 2026 11:19:43 GMT
- Title: EndoDDC: Learning Sparse to Dense Reconstruction for Endoscopic Robotic Navigation via Diffusion Depth Completion
- Authors: Yinheng Lin, Yiming Huang, Beilei Cui, Long Bai, Huxin Gao, Hongliang Ren, Jiewen Lai,
- Abstract summary: We propose EndoDDC, an endoscopy depth completion method that integrates images and sparse depth information with depth gradient features. Our approach outperforms state-of-the-art models in both depth accuracy and robustness.
- Score: 15.100363020538852
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Accurate depth estimation plays a critical role in the navigation of endoscopic surgical robots, forming the foundation for 3D reconstruction and safe instrument guidance. Fine-tuning pretrained models relies heavily on endoscopic surgical datasets with precise depth annotations. While existing self-supervised depth estimation techniques eliminate the need for accurate depth annotations, their performance degrades in environments with weak textures and variable lighting, leading to sparse reconstruction with invalid depth estimates. Depth completion from sparse depth maps can mitigate these issues and improve accuracy. Despite advances in depth completion techniques in general domains, their application in endoscopy remains limited. To overcome these limitations, we propose EndoDDC, an endoscopy depth completion method that integrates images and sparse depth information with depth gradient features, and optimizes depth maps through a diffusion model, addressing the issues of weak texture and light reflection in endoscopic environments. Extensive experiments on two publicly available endoscopy datasets show that our approach outperforms state-of-the-art models in both depth accuracy and robustness. This demonstrates the potential of our method to reduce visual errors in complex endoscopic environments. Our code will be released at https://github.com/yinheng-lin/EndoDDC.
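As a concrete (unofficial) illustration of the inputs such a depth-completion model consumes, the sketch below builds a conditioning stack from an RGB frame, a rasterized sparse depth map with its validity mask, and finite-difference depth gradients. All function names and the channel layout are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def sparse_depth_from_points(h, w, points):
    """Rasterize sparse (u, v, depth) samples into a depth map plus validity mask."""
    depth = np.zeros((h, w), dtype=np.float32)
    mask = np.zeros((h, w), dtype=np.float32)
    for u, v, d in points:
        depth[v, u] = d
        mask[v, u] = 1.0
    return depth, mask

def depth_gradients(depth):
    """Finite-difference depth gradients, used here as extra conditioning features."""
    gy, gx = np.gradient(depth)          # axis 0 = rows, axis 1 = columns
    return np.stack([gx, gy], axis=0)

def build_conditioning(rgb, sparse_depth, mask):
    """Stack image (3), sparse depth (1), mask (1), and gradients (2) as (C, H, W)."""
    grads = depth_gradients(sparse_depth)
    return np.concatenate([
        rgb.transpose(2, 0, 1),
        sparse_depth[None],
        mask[None],
        grads,
    ], axis=0)
```

A diffusion-based completion network would then denoise a dense depth map conditioned on such a stack; the seven-channel layout here is only one plausible choice.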
Related papers
- Monocular absolute depth estimation from endoscopy via domain-invariant feature learning and latent consistency [5.257607423828341]
Current image-level unsupervised domain adaptation methods translate synthetic images with known depth maps into the style of real endoscopic frames. We present a latent feature alignment method to improve absolute depth estimation by reducing this domain gap in the context of endoscopic videos of the central airway. Compared to state-of-the-art MDE methods, our approach achieves superior performance on both absolute and relative depth metrics.
arXiv Detail & Related papers (2025-11-04T04:25:15Z)
- Occlusion-Aware Self-Supervised Monocular Depth Estimation for Weak-Texture Endoscopic Images [1.1084686909647639]
We propose a self-supervised monocular depth estimation network tailored for endoscopic scenes. Existing methods, though accurate, typically assume consistent illumination; illumination variations lead to incorrect geometric interpretations and unreliable self-supervised signals.
arXiv Detail & Related papers (2025-04-24T14:12:57Z)
- Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation [108.04354143020886]
We introduce prompting into depth foundation models, creating a new paradigm for metric depth estimation termed Prompt Depth Anything. We use a low-cost LiDAR as the prompt to guide the Depth Anything model for accurate metric depth output, achieving up to 4K resolution.
arXiv Detail & Related papers (2024-12-18T16:32:12Z)
- Marigold-DC: Zero-Shot Monocular Depth Completion with Guided Diffusion [57.08169927189237]
Existing methods for depth completion operate in tightly constrained settings. Inspired by advances in monocular depth estimation, we reframe depth completion as image-conditional depth map generation. Marigold-DC builds on a pretrained latent diffusion model for monocular depth estimation and injects the depth observations as test-time guidance.
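The test-time-guidance idea — steering an iterative prediction toward sparse observations at valid pixels while leaving the rest to the generative prior — can be caricatured in a few lines. This is a hedged sketch of the general mechanism only; Marigold-DC's actual update operates on the latents of a diffusion model, and `guidance_step` is an illustrative name.

```python
import numpy as np

def guidance_step(depth_pred, sparse_depth, mask, weight=0.5):
    """Nudge the current dense prediction toward the sparse observations
    at valid pixels; unobserved pixels are left unchanged."""
    return depth_pred + weight * mask * (sparse_depth - depth_pred)

# Toy loop: one sparse measurement pulls its pixel toward the observed value.
depth = np.ones((4, 4), dtype=np.float32)   # initial dense guess
obs = np.zeros((4, 4), dtype=np.float32)
mask = np.zeros((4, 4), dtype=np.float32)
obs[1, 1], mask[1, 1] = 3.0, 1.0            # a single sparse observation
for _ in range(20):
    depth = guidance_step(depth, obs, mask)
```

After a few iterations the observed pixel converges to its measurement while unobserved pixels keep their prior value; in a real diffusion sampler this correction would be interleaved with denoising steps.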
arXiv Detail & Related papers (2024-12-18T00:06:41Z)
- Advancing Depth Anything Model for Unsupervised Monocular Depth Estimation in Endoscopy [2.906891207990726]
We introduce a novel fine-tuning strategy for the Depth Anything Model and integrate it with an intrinsic-based unsupervised monocular depth estimation framework. Our results show that our method achieves state-of-the-art performance while minimizing the number of trainable parameters.
arXiv Detail & Related papers (2024-09-12T03:04:43Z)
- ToDER: Towards Colonoscopy Depth Estimation and Reconstruction with Geometry Constraint Adaptation [67.22294293695255]
We propose a novel reconstruction pipeline with a bi-directional adaptation architecture named ToDER to get precise depth estimations.
Experimental results demonstrate that our approach can precisely predict depth maps in both realistic and synthetic colonoscopy videos.
arXiv Detail & Related papers (2024-07-23T14:24:26Z)
- Endo-4DGS: Endoscopic Monocular Scene Reconstruction with 4D Gaussian Splatting [12.333523732756163]
Dynamic scene reconstruction can significantly enhance downstream tasks and improve surgical outcomes.
NeRF-based methods have recently risen to prominence for their exceptional ability to reconstruct scenes.
We present Endo-4DGS, a real-time endoscopic dynamic reconstruction approach.
arXiv Detail & Related papers (2024-01-29T18:55:29Z)
- AugUndo: Scaling Up Augmentations for Monocular Depth Completion and Estimation [51.143540967290114]
We propose a method that unlocks a wide range of previously-infeasible geometric augmentations for unsupervised depth computation and estimation.
This is achieved by reversing, or "undo"-ing, geometric transformations applied to the coordinates of the output depth, warping the depth map back to the original reference frame.
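The "undo" trick can be illustrated with the simplest geometric augmentation, a horizontal flip: augment the input, run the depth predictor, then invert the same transform on the predicted depth so the result is aligned with the original frame. Function names here are illustrative, not from the AugUndo codebase.

```python
import numpy as np

def augment_undo_flip(image, predict_depth):
    """Predict depth on a horizontally flipped input, then flip the
    output back so it aligns with the original (unaugmented) frame."""
    depth_aug = predict_depth(image[:, ::-1])   # augment, then predict
    return depth_aug[:, ::-1]                    # undo the flip on the output
```

With a predictor that is equivariant to the flip, the undone output matches a prediction on the original frame exactly; the point of the trick is that photometric losses can then be computed in the original reference frame regardless of the augmentation applied.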
arXiv Detail & Related papers (2023-10-15T05:15:45Z) - Mind The Edge: Refining Depth Edges in Sparsely-Supervised Monocular Depth Estimation [42.19770683222846]
Monocular Depth Estimation (MDE) is a fundamental problem in computer vision with numerous applications.
In this paper we propose to learn to detect the location of depth edges from densely-supervised synthetic data.
We demonstrate significant gains in the accuracy of the depth edges with comparable per-pixel depth accuracy on several challenging datasets.
arXiv Detail & Related papers (2022-12-10T14:49:24Z) - Self-Supervised Generative Adversarial Network for Depth Estimation in
Laparoscopic Images [13.996932179049978]
We propose SADepth, a new self-supervised depth estimation method based on Generative Adversarial Networks.
It consists of an encoder-decoder generator and a discriminator to incorporate geometry constraints during training.
Experiments on two public datasets show that SADepth outperforms recent state-of-the-art unsupervised methods by a large margin.
arXiv Detail & Related papers (2021-07-09T19:40:20Z) - SelfDeco: Self-Supervised Monocular Depth Completion in Challenging
Indoor Environments [50.761917113239996]
We present a novel algorithm for self-supervised monocular depth completion.
Our approach is based on training a neural network that requires only sparse depth measurements and corresponding monocular video sequences without dense depth labels.
Our self-supervised algorithm is designed for challenging indoor environments with textureless regions, glossy and transparent surfaces, non-Lambertian surfaces, moving people, long and diverse depth ranges, and scenes captured with complex ego-motions.
arXiv Detail & Related papers (2020-11-10T08:55:07Z) - Occlusion-Aware Depth Estimation with Adaptive Normal Constraints [85.44842683936471]
We present a new learning-based method for multi-frame depth estimation from a color video.
Our method outperforms the state-of-the-art in terms of depth estimation accuracy.
arXiv Detail & Related papers (2020-04-02T07:10:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.