Advancing Depth Anything Model for Unsupervised Monocular Depth Estimation in Endoscopy
- URL: http://arxiv.org/abs/2409.07723v1
- Date: Thu, 12 Sep 2024 03:04:43 GMT
- Title: Advancing Depth Anything Model for Unsupervised Monocular Depth Estimation in Endoscopy
- Authors: Bojian Li, Bo Liu, Jinghua Yue, Fugen Zhou
- Abstract summary: We introduce a novel fine-tuning strategy for the Depth Anything Model.
We integrate it with an intrinsic-based unsupervised monocular depth estimation framework.
Our results on the SCARED dataset show that our method achieves state-of-the-art performance.
- Score: 3.1186464715409983
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Depth estimation is a cornerstone of 3D reconstruction and plays a vital role in minimally invasive endoscopic surgeries. However, most current depth estimation networks rely on traditional convolutional neural networks, which are limited in their ability to capture global information. Foundation models offer a promising avenue for enhancing depth estimation, but those currently available are primarily trained on natural images, leading to suboptimal performance when applied to endoscopic images. In this work, we introduce a novel fine-tuning strategy for the Depth Anything Model and integrate it with an intrinsic-based unsupervised monocular depth estimation framework. Our approach includes a low-rank adaptation technique based on random vectors, which improves the model's adaptability to different scales. Additionally, we propose a residual block built on depthwise separable convolution to compensate for the transformer's limited ability to capture high-frequency details, such as edges and textures. Our experimental results on the SCARED dataset show that our method achieves state-of-the-art performance while minimizing the number of trainable parameters. Applying this method in minimally invasive endoscopic surgery could significantly enhance both the precision and safety of these procedures.
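To make the abstract's claim about "minimizing the number of trainable parameters" concrete, the following is a minimal counting sketch, not the authors' implementation. It compares a standard convolution with a depthwise separable one, and plain low-rank adaptation with a vector-based variant. The channel width (64), kernel size (3), hidden size (768), and rank (4) are illustrative assumptions that do not come from the paper. The vector-based count assumes a scheme in which the low-rank matrices are random and frozen and only two scaling vectors are trained; the abstract does not confirm this exact design.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """One k x k filter per input channel, then a 1 x 1 pointwise mix."""
    return k * k * c_in + c_in * c_out


def lora_params(d_in, d_out, rank):
    """Trainable weights in plain LoRA: A is rank x d_in, B is d_out x rank."""
    return rank * d_in + d_out * rank


def vector_lora_params(d_out, rank):
    """ASSUMED vector-based variant: random A and B stay frozen, and only a
    length-rank vector and a length-d_out vector are trained."""
    return rank + d_out


if __name__ == "__main__":
    # Illustrative sizes only; none of these values are taken from the paper.
    print(conv_params(64, 64, 3))                 # 36864
    print(depthwise_separable_params(64, 64, 3))  # 4672
    print(768 * 768)                              # 589824 (full fine-tuning of one layer)
    print(lora_params(768, 768, 4))               # 6144
    print(vector_lora_params(768, 4))             # 772
```

Under these assumed sizes, the depthwise separable block needs roughly one eighth of the weights of a standard convolution, and the vector-based adapter trains far fewer values per layer than plain LoRA.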
Related papers
- Surgical Depth Anything: Depth Estimation for Surgical Scenes using Foundation Models [4.740415113160021]
The current state-of-the-art foundation model for depth estimation, Depth Anything, struggles with issues such as blurring, bleeding, and reflections.
This paper presents a fine-tuning of the Depth Anything model specifically for the surgical domain, aiming to deliver more accurate pixel-wise depth maps.
arXiv Detail & Related papers (2024-10-09T21:06:14Z)
- Intraoperative Registration by Cross-Modal Inverse Neural Rendering [61.687068931599846]
We present a novel approach for 3D/2D intraoperative registration during neurosurgery via cross-modal inverse neural rendering.
Our approach separates implicit neural representation into two components, handling anatomical structure preoperatively and appearance intraoperatively.
We tested our method on retrospective patients' data from clinical cases, showing that our method outperforms state-of-the-art while meeting current clinical standards for registration.
arXiv Detail & Related papers (2024-09-18T13:40:59Z)
- DARES: Depth Anything in Robotic Endoscopic Surgery with Self-supervised Vector-LoRA of the Foundation Model [17.41557655783514]
We introduce Depth Anything in Robotic Endoscopic Surgery (DARES).
It applies a new adaptation technique, Vector Low-Rank Adaptation (Vector-LoRA), to DAM V2 to perform self-supervised monocular depth estimation.
The method is validated as superior to recent state-of-the-art self-supervised monocular depth estimation techniques.
arXiv Detail & Related papers (2024-08-30T17:35:06Z)
- ToDER: Towards Colonoscopy Depth Estimation and Reconstruction with Geometry Constraint Adaptation [67.22294293695255]
We propose a novel reconstruction pipeline with a bi-directional adaptation architecture named ToDER to get precise depth estimations.
Experimental results demonstrate that our approach can precisely predict depth maps in both realistic and synthetic colonoscopy videos.
arXiv Detail & Related papers (2024-07-23T14:24:26Z)
- Uncertainty-guided Optimal Transport in Depth Supervised Sparse-View 3D Gaussian [49.21866794516328]
3D Gaussian splatting has demonstrated impressive performance in real-time novel view synthesis.
Previous approaches have incorporated depth supervision into the training of 3D Gaussians to mitigate overfitting.
We introduce a novel method to supervise the depth distribution of 3D Gaussians, utilizing depth priors with integrated uncertainty estimates.
arXiv Detail & Related papers (2024-05-30T03:18:30Z)
- EndoDAC: Efficient Adapting Foundation Model for Self-Supervised Depth Estimation from Any Endoscopic Camera [12.152362025172915]
We propose Endoscopic Depth Any Camera (EndoDAC) to adapt foundation models to endoscopic scenes.
Specifically, we develop the Dynamic Vector-Based Low-Rank Adaptation (DV-LoRA) and employ Convolutional Neck blocks.
Our framework is capable of being trained solely on monocular surgical videos from any camera, ensuring minimal training costs.
arXiv Detail & Related papers (2024-05-14T14:55:15Z)
- High-fidelity Endoscopic Image Synthesis by Utilizing Depth-guided Neural Surfaces [18.948630080040576]
We introduce a novel method for colon section reconstruction by leveraging NeuS applied to endoscopic images, supplemented by a single frame of depth map.
Our approach demonstrates exceptional accuracy in completely rendering colon sections, even capturing unseen portions of the surface.
This breakthrough opens avenues for achieving stable and consistently scaled reconstructions, promising enhanced quality in cancer screening procedures and treatment interventions.
arXiv Detail & Related papers (2024-04-20T18:06:26Z)
- AugUndo: Scaling Up Augmentations for Monocular Depth Completion and Estimation [51.143540967290114]
We propose a method that unlocks a wide range of previously infeasible geometric augmentations for unsupervised depth completion and estimation.
This is achieved by reversing, or "undo"-ing, geometric transformations to the coordinates of the output depth, warping the depth map back to the original reference frame.
arXiv Detail & Related papers (2023-10-15T05:15:45Z)
- Adversarial Domain Feature Adaptation for Bronchoscopic Depth Estimation [111.89519571205778]
In this work, we propose an alternative domain-adaptive approach to depth estimation.
Our novel two-step structure first trains a depth estimation network with labeled synthetic images in a supervised manner.
The results of our experiments show that the proposed method improves the network's performance on real images by a considerable margin.
arXiv Detail & Related papers (2021-09-24T08:11:34Z)
- NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo [97.07453889070574]
We present a new multi-view depth estimation method that utilizes both conventional SfM reconstruction and learning-based priors.
We show that our proposed framework significantly outperforms state-of-the-art methods on indoor scenes.
arXiv Detail & Related papers (2021-09-02T17:54:31Z)
- Self-Supervised Generative Adversarial Network for Depth Estimation in Laparoscopic Images [13.996932179049978]
We propose SADepth, a new self-supervised depth estimation method based on Generative Adversarial Networks.
It consists of an encoder-decoder generator and a discriminator to incorporate geometry constraints during training.
Experiments on two public datasets show that SADepth outperforms recent state-of-the-art unsupervised methods by a large margin.
arXiv Detail & Related papers (2021-07-09T19:40:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.