Tracking monocular camera pose and deformation for SLAM inside the human
body
- URL: http://arxiv.org/abs/2204.08309v1
- Date: Mon, 18 Apr 2022 13:25:23 GMT
- Title: Tracking monocular camera pose and deformation for SLAM inside the human
body
- Authors: Juan J. Gomez Rodriguez, J.M.M. Montiel and Juan D. Tardos
- Abstract summary: We propose a novel method to simultaneously track the camera pose and the 3D scene deformation.
The method uses an illumination-invariant photometric method to track image features and estimates camera motion and deformation.
Our results in simulated colonoscopies show the method's accuracy and robustness in complex scenes under increasing levels of deformation.
- Score: 2.094821665776961
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Monocular SLAM in deformable scenes will open the way to multiple medical
applications like computer-assisted navigation in endoscopy, automatic drug
delivery or autonomous robotic surgery. In this paper we propose a novel method
to simultaneously track the camera pose and the 3D scene deformation, without
any assumption about environment topology or shape. The method uses an
illumination-invariant photometric method to track image features and estimates
camera motion and deformation combining reprojection error with spatial and
temporal regularization of deformations. Our results in simulated colonoscopies
show the method's accuracy and robustness in complex scenes under increasing
levels of deformation. Our qualitative results in human colonoscopies from
Endomapper dataset show that the method is able to successfully cope with the
challenges of real endoscopies: deformations, low texture and strong
illumination changes. We also compare with previous tracking methods in simpler
scenarios from Hamlyn dataset where we obtain competitive performance, without
needing any topological assumption.
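The abstract describes the tracking objective only in words: reprojection error combined with spatial and temporal regularization of deformations. A minimal sketch of how such a cost could be assembled is given below. It is not the authors' formulation: the function names, the pinhole intrinsics, the squared-norm penalties, the neighbor edge list and the weights `w_spatial` and `w_temporal` are all assumptions made for illustration.

```python
import numpy as np

def project(points_cam, focal, center):
    # Pinhole projection of Nx3 camera-frame points to Nx2 pixel coordinates.
    return focal * points_cam[:, :2] / points_cam[:, 2:3] + center

def tracking_cost(R, t, points, deform, prev_deform, observed, edges,
                  focal=500.0, center=(320.0, 240.0),
                  w_spatial=1.0, w_temporal=0.1):
    # Hypothetical cost: all terms and weights are illustrative assumptions.
    center = np.asarray(center)
    deformed = points + deform            # per-point 3D displacement
    cam = deformed @ R.T + t              # world frame -> camera frame
    # Reprojection error against the tracked 2D feature positions.
    reproj = np.sum((project(cam, focal, center) - observed) ** 2)
    # Spatial term: neighboring points should deform similarly.
    i, j = edges[:, 0], edges[:, 1]
    spatial = np.sum((deform[i] - deform[j]) ** 2)
    # Temporal term: deformation should change slowly between frames.
    temporal = np.sum((deform - prev_deform) ** 2)
    return reproj + w_spatial * spatial + w_temporal * temporal
```

In a tracker of this kind, a cost like this would be minimized over `R`, `t` and `deform` for each new frame, with `observed` supplied by the photometric feature tracker. The edge list here is only a set of point pairs used for regularization, so the sketch does not rely on a surface mesh or any topological model.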
Related papers
- Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z)
- FLex: Joint Pose and Dynamic Radiance Fields Optimization for Stereo Endoscopic Videos [79.50191812646125]
Reconstruction of endoscopic scenes is an important asset for various medical applications, from post-surgery analysis to educational training.
We address the challenging setup of a moving endoscope within a highly dynamic environment of deforming tissue.
We propose an implicit scene separation into multiple overlapping 4D neural radiance fields (NeRFs) and a progressive optimization scheme jointly optimizing for reconstruction and camera poses from scratch.
This improves ease of use and scales reconstruction to surgical videos of 5,000 frames and more, an improvement of more than ten times over the state of the art, while remaining agnostic to external tracking information.
arXiv Detail & Related papers (2024-03-18T19:13:02Z)
- Decaf: Monocular Deformation Capture for Face and Hand Interactions [77.75726740605748]
This paper introduces the first method that allows tracking human hands interacting with human faces in 3D from single monocular RGB videos.
We model hands as articulated objects inducing non-rigid face deformations during an active interaction.
Our method relies on a new hand-face motion and interaction capture dataset with realistic face deformations acquired with a markerless multi-view camera system.
arXiv Detail & Related papers (2023-09-28T17:59:51Z)
- The Drunkard's Odometry: Estimating Camera Motion in Deforming Scenes [79.00228778543553]
This dataset is the first large set of exploratory camera trajectories with ground truth inside 3D scenes.
Simulations in realistic 3D buildings let us obtain a vast amount of data and ground truth labels.
We present a novel deformable odometry method, dubbed the Drunkard's Odometry, which decomposes optical flow estimates into rigid-body camera motion.
arXiv Detail & Related papers (2023-06-29T13:09:31Z)
- 3D shape reconstruction of semi-transparent worms [0.950214811819847]
3D shape reconstruction typically requires identifying object features or textures in multiple images of a subject.
Here we overcome these challenges by rendering a candidate shape with adaptive blurring and transparency for comparison with the images.
We model the slender Caenorhabditis elegans as a 3D curve using an intrinsic parametrisation that naturally admits biologically-informed constraints and regularisation.
arXiv Detail & Related papers (2023-04-28T13:29:36Z)
- Learning How To Robustly Estimate Camera Pose in Endoscopic Videos [5.073761189475753]
We propose a solution for stereo endoscopes that estimates depth and optical flow to minimize two geometric losses for camera pose estimation.
Most importantly, we introduce two learned adaptive per-pixel weight mappings that balance contributions according to the input image content.
We validate our approach on the publicly available SCARED dataset and introduce a new in-vivo dataset, StereoMIS.
arXiv Detail & Related papers (2023-04-17T07:05:01Z)
- Neural 3D Reconstruction in the Wild [86.6264706256377]
We introduce a new method that enables efficient and accurate surface reconstruction from Internet photo collections.
We present a new benchmark and protocol for evaluating reconstruction performance on such in-the-wild scenes.
arXiv Detail & Related papers (2022-05-25T17:59:53Z)
- Bimodal Camera Pose Prediction for Endoscopy [23.12495584329767]
We propose SimCol, a synthetic dataset for camera pose estimation in colonoscopy.
Our dataset replicates real colonoscope motion and highlights the drawbacks of existing methods.
We publish 18k RGB images from simulated colonoscopy with corresponding depth and camera poses and make our data generation environment in Unity publicly available.
arXiv Detail & Related papers (2022-04-11T09:34:34Z)
- A Temporal Learning Approach to Inpainting Endoscopic Specularities and Its effect on Image Correspondence [13.25903945009516]
We propose using a temporal generative adversarial network (GAN) to inpaint the hidden anatomy under specularities.
This is achieved using in-vivo data of gastric endoscopy (Hyper-Kvasir) in a fully unsupervised manner.
We also assess the effect of our method in computer vision tasks that underpin 3D reconstruction and camera motion estimation.
arXiv Detail & Related papers (2022-03-31T13:14:00Z)
- Direct and Sparse Deformable Tracking [4.874780144224057]
We introduce a novel deformable camera tracking method with a local deformation model for each point.
Thanks to a direct photometric error cost function, we can track the position and orientation of the surfel without an explicit global deformation model.
arXiv Detail & Related papers (2021-09-15T15:28:10Z)
- A parameter refinement method for Ptychography based on Deep Learning concepts [55.41644538483948]
Coarse parametrisation in propagation distance, position errors and partial coherence frequently threatens the viability of the experiment.
A modern Deep Learning framework is used to autonomously correct the setup incoherences, thus improving the quality of a ptychography reconstruction.
We tested our system on both synthetic datasets and real data acquired at the TwinMic beamline of the Elettra synchrotron facility.
arXiv Detail & Related papers (2021-05-18T10:15:17Z)