ColonAdapter: Geometry Estimation Through Foundation Model Adaptation for Colonoscopy
- URL: http://arxiv.org/abs/2511.22250v1
- Date: Thu, 27 Nov 2025 09:21:11 GMT
- Title: ColonAdapter: Geometry Estimation Through Foundation Model Adaptation for Colonoscopy
- Authors: Zhiyi Jiang, Yifu Wang, Xuelian Cheng, Zongyuan Ge
- Abstract summary: Estimating 3D geometry from monocular colonoscopy images is challenging due to non-Lambertian surfaces, moving light sources, and large textureless regions. We present ColonAdapter, a self-supervised fine-tuning framework that adapts geometric foundation models for colonoscopy geometry estimation.
- Score: 18.844097623387974
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Estimating 3D geometry from monocular colonoscopy images is challenging due to non-Lambertian surfaces, moving light sources, and large textureless regions. While recent 3D geometric foundation models eliminate the need for multi-stage pipelines, their performance deteriorates in clinical scenes. These models are primarily trained on natural scene datasets and struggle with specularity and homogeneous textures typical in colonoscopy, leading to inaccurate geometry estimation. In this paper, we present ColonAdapter, a self-supervised fine-tuning framework that adapts geometric foundation models for colonoscopy geometry estimation. Our method leverages pretrained geometric priors while tailoring them to clinical data. To improve performance in low-texture regions and ensure scale consistency, we introduce a Detail Restoration Module (DRM) and a geometry consistency loss. Furthermore, a confidence-weighted photometric loss enhances training stability in clinical environments. Experiments on both synthetic and real datasets demonstrate that our approach achieves state-of-the-art performance in camera pose estimation, monocular depth prediction, and dense 3D point map reconstruction, without requiring ground-truth intrinsic parameters.
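The abstract names a confidence-weighted photometric loss but does not give its form. As a rough illustration only, the sketch below shows one generic way such a weighting can work: a per-pixel L1 error is scaled by a confidence value, and a log term discourages the confidence from collapsing to zero. The function name, the `alpha` weight, and the toy grayscale images are all assumptions made for this example and are not taken from the ColonAdapter paper.

```python
import math


def confidence_weighted_photometric_loss(target, warped, confidence,
                                         alpha=0.2, eps=1e-6):
    """Mean over pixels of c * |target - warped| - alpha * log(c).

    Hypothetical helper for illustration; not the paper's formulation.
    target, warped, confidence: equally sized 2D lists of floats.
    """
    total = 0.0
    count = 0
    for t_row, w_row, c_row in zip(target, warped, confidence):
        for t, w, c in zip(t_row, w_row, c_row):
            c = max(c, eps)  # keep log(c) finite
            # Low confidence shrinks the error term; -log(c) penalises c -> 0.
            total += c * abs(t - w) - alpha * math.log(c)
            count += 1
    return total / count


# Toy 2x2 grayscale frames: one warped pixel is corrupted, e.g. by a highlight.
target = [[0.2, 0.2], [0.2, 0.2]]
warped = [[0.2, 0.2], [0.2, 1.0]]

uniform_conf = [[1.0, 1.0], [1.0, 1.0]]
lowered_conf = [[1.0, 1.0], [1.0, 0.5]]  # less trust in the corrupted pixel

loss_uniform = confidence_weighted_photometric_loss(target, warped, uniform_conf)
loss_lowered = confidence_weighted_photometric_loss(target, warped, lowered_conf)
print(loss_uniform, loss_lowered)
```

In this toy case, lowering the confidence on the corrupted pixel reduces the loss, which is the intuition behind using confidence to stabilise training when specular highlights violate photometric consistency.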
Related papers
- Moving Light Adaptive Colonoscopy Reconstruction via Illumination-Attenuation-Aware 3D Gaussian Splatting [35.37461816543526]
3D Gaussian Splatting (3DGS) has emerged as a pivotal technique for real-time view synthesis in colonoscopy. However, the vanilla 3DGS assumes static illumination and that observed appearance depends solely on viewing angle. This mismatch forces most 3DGS methods to introduce structure-violating vaporous Gaussian blobs between the camera and tissues. We propose ColIAGS, an improved 3DGS framework tailored for colonoscopy.
arXiv Detail & Related papers (2025-10-21T15:44:23Z) - G4Splat: Geometry-Guided Gaussian Splatting with Generative Prior [53.762256749551284]
We identify accurate geometry as the fundamental prerequisite for effectively exploiting generative models to enhance 3D scene reconstruction. We incorporate this geometry guidance throughout the generative pipeline to improve visibility mask estimation, guide novel view selection, and enhance multi-view consistency when inpainting with video diffusion models. Our method naturally supports single-view inputs and unposed videos, with strong generalizability in both indoor and outdoor scenarios.
arXiv Detail & Related papers (2025-10-14T03:06:28Z) - ColonCrafter: A Depth Estimation Model for Colonoscopy Videos Using Diffusion Priors [1.9437590375121516]
ColonCrafter is a diffusion-based depth estimation model that generates temporally consistent depth maps from monocular colonoscopy videos. Our approach learns robust geometric priors from synthetic colonoscopy sequences to generate temporally consistent depth maps.
arXiv Detail & Related papers (2025-09-16T20:40:22Z) - DreamPolish: Domain Score Distillation With Progressive Geometry Generation [66.94803919328815]
We introduce DreamPolish, a text-to-3D generation model that excels in producing refined geometry and high-quality textures.
In the geometry construction phase, our approach leverages multiple neural representations to enhance the stability of the synthesis process.
In the texture generation phase, we introduce a novel score distillation objective, namely domain score distillation (DSD), to guide neural representations toward such a domain.
arXiv Detail & Related papers (2024-11-03T15:15:01Z) - ND-SDF: Learning Normal Deflection Fields for High-Fidelity Indoor Reconstruction [50.07671826433922]
It is non-trivial to simultaneously recover meticulous geometry and preserve smoothness across regions with differing characteristics. We propose ND-SDF, which learns a Normal Deflection field to represent the angular deviation between the scene normal and the prior normal. Our method not only obtains smooth weakly textured regions such as walls and floors but also preserves the geometric details of complex structures.
arXiv Detail & Related papers (2024-08-22T17:59:01Z) - ToDER: Towards Colonoscopy Depth Estimation and Reconstruction with Geometry Constraint Adaptation [67.22294293695255]
We propose a novel reconstruction pipeline with a bi-directional adaptation architecture named ToDER to get precise depth estimations.
Experimental results demonstrate that our approach can precisely predict depth maps in both realistic and synthetic colonoscopy videos.
arXiv Detail & Related papers (2024-07-23T14:24:26Z) - FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models [67.96827539201071]
We propose a novel test-time optimization approach for 3D scene reconstruction.
Our method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
arXiv Detail & Related papers (2023-08-10T17:55:02Z) - A geometry-aware deep network for depth estimation in monocular endoscopy [17.425158094539462]
The proposed method is extensively validated across different datasets and clinical images.
The generalizability of the proposed method achieves mean RMSE values of 12.604 (T1-L1), 9.930 (T2-L2), and 13.893 (colon) on the ColonDepth dataset.
arXiv Detail & Related papers (2023-04-20T11:59:32Z) - C$^3$Fusion: Consistent Contrastive Colon Fusion, Towards Deep SLAM in Colonoscopy [0.0]
3D colon reconstruction from Optical Colonoscopy (OC) to detect non-examined surfaces remains an unsolved problem.
Recent methods demonstrate compelling results, but suffer from either (1) frangible frame-to-frame (or frame-to-model) pose estimation resulting in many tracking failures, or (2) reliance on point-based representations at the cost of scan quality.
We propose a novel reconstruction framework that addresses these issues end to end, resulting in quantitatively and qualitatively accurate and robust 3D colon reconstruction.
arXiv Detail & Related papers (2022-06-04T10:38:19Z) - ColDE: A Depth Estimation Framework for Colonoscopy Reconstruction [27.793186578742088]
In this work we have designed a set of training losses to deal with the special challenges of colonoscopy data.
With the training losses powerful enough, our self-supervised framework named ColDE is able to produce better depth maps of colonoscopy data.
arXiv Detail & Related papers (2021-11-19T04:44:27Z) - SIDER: Single-Image Neural Optimization for Facial Geometric Detail Recovery [54.64663713249079]
SIDER is a novel photometric optimization method that recovers detailed facial geometry from a single image in an unsupervised manner.
In contrast to prior work, SIDER does not rely on any dataset priors and does not require additional supervision from multiple views, lighting changes or ground truth 3D shape.
arXiv Detail & Related papers (2021-08-11T22:34:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.