Bimodal Camera Pose Prediction for Endoscopy
- URL: http://arxiv.org/abs/2204.04968v2
- Date: Fri, 15 Dec 2023 16:08:46 GMT
- Title: Bimodal Camera Pose Prediction for Endoscopy
- Authors: Anita Rau, Binod Bhattarai, Lourdes Agapito, Danail Stoyanov
- Abstract summary: We propose SimCol, a synthetic dataset for camera pose estimation in colonoscopy.
Our dataset replicates real colonoscope motion and highlights the drawbacks of existing methods.
We publish 18k RGB images from simulated colonoscopy with corresponding depth and camera poses and make our data generation environment in Unity publicly available.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deducing the 3D structure of endoscopic scenes from images is exceedingly
challenging. In addition to deformation and view-dependent lighting, tubular
structures like the colon present problems stemming from their self-occluding
and repetitive anatomical structure. In this paper, we propose SimCol, a
synthetic dataset for camera pose estimation in colonoscopy, and a novel method
that explicitly learns a bimodal distribution to predict the endoscope pose.
Our dataset replicates real colonoscope motion and highlights the drawbacks of
existing methods. We publish 18k RGB images from simulated colonoscopy with
corresponding depth and camera poses and make our data generation environment
in Unity publicly available. We evaluate different camera pose prediction
methods and demonstrate that, when trained on our data, they generalize to real
colonoscopy sequences, and our bimodal approach outperforms prior unimodal
work.
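To illustrate what learning a bimodal distribution can mean in practice, here is a minimal sketch of the negative log-likelihood of a scalar target under a two-component Gaussian mixture. This is not the authors' implementation: the function names, the scalar (one pose parameter) setting, and the choice of Gaussian components are all assumptions made for illustration.

```python
import math


def gaussian_logpdf(x, mean, std):
    """Log-density of a 1D Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def bimodal_nll(target, weight, mean_a, std_a, mean_b, std_b):
    """Negative log-likelihood of `target` under a two-component Gaussian mixture.

    `weight` is the mixing probability of component A and must lie strictly
    between 0 and 1; component B receives 1 - weight. All names here are
    hypothetical and chosen for this sketch only.
    """
    log_a = math.log(weight) + gaussian_logpdf(target, mean_a, std_a)
    log_b = math.log(1.0 - weight) + gaussian_logpdf(target, mean_b, std_b)
    # Log-sum-exp keeps the computation stable when one component dominates.
    m = max(log_a, log_b)
    return -(m + math.log(math.exp(log_a - m) + math.exp(log_b - m)))


# A target near one of two well-separated modes is scored better (lower NLL)
# by the mixture than by a single broad Gaussian centred between the modes.
mixture_loss = bimodal_nll(1.0, 0.5, -1.0, 0.1, 1.0, 0.1)
unimodal_loss = -gaussian_logpdf(1.0, 0.0, 1.0)
```

In a learned model, the mixing weight, means, and standard deviations would be network outputs, and this loss would be averaged over training samples; a single-Gaussian loss is recovered when both components coincide.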
Related papers
- ToDER: Towards Colonoscopy Depth Estimation and Reconstruction with Geometry Constraint Adaptation [67.22294293695255]
We propose a novel reconstruction pipeline with a bi-directional adaptation architecture named ToDER to get precise depth estimations.
Experimental results demonstrate that our approach can precisely predict depth maps in both realistic and synthetic colonoscopy videos.
arXiv Detail & Related papers (2024-07-23T14:24:26Z)
- EndoSparse: Real-Time Sparse View Synthesis of Endoscopic Scenes using Gaussian Splatting [39.60431471170721]
3D reconstruction of biological tissues from a collection of endoscopic images is a key to unlock various important downstream surgical applications with 3D capabilities.
Existing methods employ various advanced neural rendering techniques for view synthesis, but they often struggle to recover accurate 3D representations when only sparse observations are available.
We propose a framework, dubbed EndoSparse, that leverages prior knowledge from multiple foundation models during the reconstruction process.
arXiv Detail & Related papers (2024-07-01T07:24:09Z)
- FLex: Joint Pose and Dynamic Radiance Fields Optimization for Stereo Endoscopic Videos [79.50191812646125]
Reconstruction of endoscopic scenes is an important asset for various medical applications, from post-surgery analysis to educational training.
We address the challenging setup of a moving endoscope within a highly dynamic environment of deforming tissue.
We propose an implicit scene separation into multiple overlapping 4D neural radiance fields (NeRFs) and a progressive optimization scheme that jointly optimizes for reconstruction and camera poses from scratch.
This improves ease of use and allows reconstruction to scale in time to surgical videos of 5,000 frames and more, an improvement of more than ten times over the state of the art, while remaining agnostic to external tracking information.
arXiv Detail & Related papers (2024-03-18T19:13:02Z)
- Endora: Video Generation Models as Endoscopy Simulators [53.72175969751398]
This paper introduces Endora, an innovative approach to generate medical videos that simulate clinical endoscopy scenes.
We also pioneer the first public benchmark for endoscopy simulation with video generation models.
Endora marks a notable breakthrough in the deployment of generative AI for clinical endoscopy research.
arXiv Detail & Related papers (2024-03-17T00:51:59Z)
- Cameras as Rays: Pose Estimation via Ray Diffusion [54.098613859015856]
Estimating camera poses is a fundamental task for 3D reconstruction and remains challenging given sparsely sampled views.
We propose a distributed representation of camera pose that treats a camera as a bundle of rays.
Our proposed methods, both regression- and diffusion-based, demonstrate state-of-the-art performance on camera pose estimation on CO3D.
arXiv Detail & Related papers (2024-02-22T18:59:56Z)
- LightNeuS: Neural Surface Reconstruction in Endoscopy using Illumination Decline [45.49984459497878]
We propose a new approach to 3D reconstruction from sequences of images acquired by monocular endoscopes.
It is based on two key insights. First, endoluminal cavities are watertight, a property naturally enforced by modeling them in terms of a signed distance function.
Second, the scene illumination is variable. It comes from the endoscope's light sources and decays with the inverse of the squared distance to the surface.
arXiv Detail & Related papers (2023-09-06T06:41:40Z)
- SimCol3D -- 3D Reconstruction during Colonoscopy Challenge [31.01817462784811]
The 2022 EndoVis sub-challenge SimCol3D aimed to facilitate data-driven depth and pose prediction during colonoscopy.
We show that depth prediction from synthetic colonoscopy images is robustly solvable, while pose estimation remains an open research question.
arXiv Detail & Related papers (2023-07-20T22:41:23Z)
- SoftEnNet: Symbiotic Monocular Depth Estimation and Lumen Segmentation for Colonoscopy Endorobots [2.9696400288366127]
Colorectal cancer is the third most common cause of cancer death worldwide.
A vision-based autonomous endorobot can improve colonoscopy procedures significantly.
arXiv Detail & Related papers (2023-01-19T16:22:17Z)
- Photometric single-view dense 3D reconstruction in endoscopy [2.094821665776961]
We exploit the controlled lighting in colonoscopy to achieve the first in-vivo 3D reconstruction of the human colon using photometric stereo on a calibrated monocular endoscope.
Our method works in a real medical environment, providing both a suitable in-place calibration procedure and a depth estimation technique adapted to the colon's tubular geometry.
arXiv Detail & Related papers (2022-04-19T18:23:31Z)
- Tracking monocular camera pose and deformation for SLAM inside the human body [2.094821665776961]
We propose a novel method to simultaneously track the camera pose and the 3D scene deformation.
The method uses an illumination-invariant photometric method to track image features and estimates camera motion and deformation.
Our results in simulated colonoscopies show the method's accuracy and robustness in complex scenes under increasing levels of deformation.
arXiv Detail & Related papers (2022-04-18T13:25:23Z)
- Colonoscopy Polyp Detection: Domain Adaptation From Medical Report Images to Real-time Videos [76.37907640271806]
We propose an Image-video-joint polyp detection network (Ivy-Net) to address the domain gap between colonoscopy images from historical medical reports and real-time videos.
Experiments on the collected dataset demonstrate that Ivy-Net achieves state-of-the-art results on colonoscopy video.
arXiv Detail & Related papers (2020-12-31T10:33:09Z)