SAGE: SLAM with Appearance and Geometry Prior for Endoscopy
- URL: http://arxiv.org/abs/2202.09487v2
- Date: Tue, 22 Feb 2022 18:24:03 GMT
- Title: SAGE: SLAM with Appearance and Geometry Prior for Endoscopy
- Authors: Xingtong Liu, Zhaoshuo Li, Masaru Ishii, Gregory D. Hager, Russell H. Taylor, Mathias Unberath
- Abstract summary: In endoscopy, many applications would benefit from a real-time method that can simultaneously track the endoscope and reconstruct the dense 3D geometry of the observed anatomy from a monocular endoscopic video.
We develop a Simultaneous Localization and Mapping system by combining learning-based appearance and optimizable geometry priors with factor graph optimization.
The proposed SLAM system is shown to robustly handle the challenges of texture scarceness and illumination variation that are commonly seen in endoscopy.
- Score: 24.94746710994156
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In endoscopy, many applications (e.g., surgical navigation) would benefit
from a real-time method that can simultaneously track the endoscope and
reconstruct the dense 3D geometry of the observed anatomy from a monocular
endoscopic video. To this end, we develop a Simultaneous Localization and
Mapping system by combining learning-based appearance and optimizable
geometry priors with factor graph optimization. The appearance and geometry
priors are explicitly learned in an end-to-end differentiable training pipeline
to master the task of pair-wise image alignment, one of the core components of
the SLAM system. In our experiments, the proposed SLAM system is shown to
robustly handle the challenges of texture scarceness and illumination variation
that are commonly seen in endoscopy. The system generalizes well to unseen
endoscopes and subjects and performs favorably compared with a state-of-the-art
feature-based SLAM system. The code repository is available at
https://github.com/lppllppl920/SAGE-SLAM.git.
Related papers
- BodySLAM: A Generalized Monocular Visual SLAM Framework for Surgical Applications [0.0]
This study presents BodySLAM, a robust deep learning-based MVSLAM approach that addresses these challenges through three key components.
The three components are CycleVO, a novel unsupervised monocular pose estimation module; the state-of-the-art Zoe architecture for monocular depth estimation; and a 3D reconstruction module that creates a coherent surgical map.
Results demonstrate that CycleVO exhibited competitive performance with the lowest inference time among pose estimation methods, while maintaining robust generalization capabilities.
arXiv Detail & Related papers (2024-08-06T10:13:57Z)
- MGS-SLAM: Monocular Sparse Tracking and Gaussian Mapping with Depth Smooth Regularization [29.713650915551632]
This letter introduces a novel framework for dense Visual Simultaneous Localization and Mapping based on Gaussian Splatting.
We jointly optimize sparse visual odometry tracking and 3D Gaussian Splatting scene representation for the first time.
The accuracy of our pose estimation surpasses that of existing methods, including the state of the art.
arXiv Detail & Related papers (2024-05-10T04:42:21Z)
- EndoGSLAM: Real-Time Dense Reconstruction and Tracking in Endoscopic Surgeries using Gaussian Splatting [53.38166294158047]
EndoGSLAM is an efficient approach for endoscopic surgeries, which integrates streamlined representation and differentiable Gaussianization.
Experiments show that EndoGSLAM achieves a better trade-off between intraoperative availability and reconstruction quality than traditional or neural SLAM approaches.
arXiv Detail & Related papers (2024-03-22T11:27:43Z)
- DNS SLAM: Dense Neural Semantic-Informed SLAM [92.39687553022605]
DNS SLAM is a novel neural RGB-D semantic SLAM approach featuring a hybrid representation.
Our method integrates multi-view geometry constraints with image-based feature extraction to improve appearance details.
In our experiments, the method achieves state-of-the-art tracking performance on both synthetic and real-world data.
arXiv Detail & Related papers (2023-11-30T21:34:44Z)
- A geometry-aware deep network for depth estimation in monocular endoscopy [17.425158094539462]
The proposed method is extensively validated across different datasets and clinical images.
In generalizability tests, the proposed method achieves mean RMSE values of 12.604 (T1-L1), 9.930 (T2-L2), and 13.893 (colon) on the ColonDepth dataset.
arXiv Detail & Related papers (2023-04-20T11:59:32Z)
- Multimodal Semantic Scene Graphs for Holistic Modeling of Surgical Procedures [70.69948035469467]
We take advantage of the latest computer vision methodologies for generating 3D graphs from camera views.
We then introduce the Multimodal Semantic Scene Graph (MSSG), which aims to provide a unified symbolic and semantic representation of surgical procedures.
arXiv Detail & Related papers (2021-06-09T14:35:44Z)
- Single-shot Hyperspectral-Depth Imaging with Learned Diffractive Optics [72.9038524082252]
We propose a compact single-shot monocular hyperspectral-depth (HS-D) imaging method.
Our method uses a diffractive optical element (DOE), the point spread function of which changes with respect to both depth and spectrum.
To facilitate learning the DOE, we present a first HS-D dataset by building a benchtop HS-D imager.
arXiv Detail & Related papers (2020-09-01T14:19:35Z)
- Pathological Retinal Region Segmentation From OCT Images Using Geometric Relation Based Augmentation [84.7571086566595]
We propose improvements over previous GAN-based medical image synthesis methods by jointly encoding the intrinsic relationship of geometry and shape.
The proposed method outperforms state-of-the-art segmentation methods on the public RETOUCH dataset, which contains images captured with different acquisition procedures.
arXiv Detail & Related papers (2020-03-31T11:50:43Z)
- Redesigning SLAM for Arbitrary Multi-Camera Systems [51.81798192085111]
Adding more cameras to SLAM systems improves robustness and accuracy but complicates the design of the visual front-end significantly.
In this work, we aim at an adaptive SLAM system that works for arbitrary multi-camera setups.
We adapt a state-of-the-art visual-inertial odometry with these modifications, and experimental results show that the modified pipeline can adapt to a wide range of camera setups.
arXiv Detail & Related papers (2020-03-04T11:44:42Z)
- DeepFactors: Real-Time Probabilistic Dense Monocular SLAM [29.033778410908877]
We present a SLAM system that unifies methods in a probabilistic framework while still maintaining real-time performance.
This is achieved through the use of a learned compact depth map representation and reformulating three different types of errors.
We evaluate our system on trajectory estimation and depth reconstruction on real-world sequences and present various examples of estimated dense geometry.
arXiv Detail & Related papers (2020-01-14T21:08:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.