SimpleMapping: Real-Time Visual-Inertial Dense Mapping with Deep
Multi-View Stereo
- URL: http://arxiv.org/abs/2306.08648v3
- Date: Sun, 27 Aug 2023 11:52:50 GMT
- Authors: Yingye Xin, Xingxing Zuo, Dongyue Lu, Stefan Leutenegger
- Abstract summary: We present a real-time visual-inertial dense mapping method that produces high-quality reconstructions using only monocular images and IMU readings.
We propose a sparse point aided multi-view stereo neural network (SPA-MVSNet) that can effectively leverage the informative but noisy sparse points from the VIO system.
Our proposed dense mapping system achieves a 39.7% improvement in F-score over existing systems when evaluated on the challenging scenarios of the EuRoC dataset.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a real-time visual-inertial dense mapping method capable of
performing incremental 3D mesh reconstruction with high quality using only
sequential monocular images and inertial measurement unit (IMU) readings. 6-DoF
camera poses are estimated by a robust feature-based visual-inertial odometry
(VIO), which also generates noisy sparse 3D map points as a by-product. We
propose a sparse point aided multi-view stereo neural network (SPA-MVSNet) that
can effectively leverage the informative but noisy sparse points from the VIO
system. The sparse depth from VIO is firstly completed by a single-view depth
completion network. This dense depth map, although naturally limited in
accuracy, is then used as a prior to guide our MVS network in the cost volume
generation and regularization for accurate dense depth prediction. Predicted
depth maps of keyframe images by the MVS network are incrementally fused into a
global map using TSDF-Fusion. We extensively evaluate both the proposed
SPA-MVSNet and the entire visual-inertial dense mapping system on several
public datasets as well as our own dataset, demonstrating the system's
impressive generalization capabilities and its ability to deliver high-quality
3D mesh reconstruction online. Our proposed dense mapping system achieves a
39.7% improvement in F-score over existing systems when evaluated on the
challenging scenarios of the EuRoC dataset.
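As a concrete illustration of the prior-guided cost volume step described above, here is a minimal sketch of how a completed dense depth prior could re-weight an MVS cost volume toward the prior. This is not the authors' published code; the tensor shapes, the Gaussian weighting, and all names are assumptions:

```python
import torch

def modulate_cost_volume(cost_volume: torch.Tensor,
                         depth_hypotheses: torch.Tensor,
                         prior_depth: torch.Tensor,
                         sigma: float = 0.1) -> torch.Tensor:
    """Re-weight matching costs toward a dense depth prior (illustrative).

    cost_volume:      (B, D, H, W) matching cost per depth hypothesis
    depth_hypotheses: (D,) sampled depth planes
    prior_depth:      (B, H, W) dense prior from single-view depth completion
    """
    d = depth_hypotheses.view(1, -1, 1, 1)    # (1, D, 1, 1)
    prior = prior_depth.unsqueeze(1)          # (B, 1, H, W)
    # Gaussian weight peaks at the hypothesis closest to the prior depth,
    # so hypotheses far from the prior are suppressed before regularization.
    weight = torch.exp(-(d - prior) ** 2 / (2.0 * sigma ** 2))
    return cost_volume * weight
```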
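The incremental TSDF-Fusion step likewise admits a compact sketch. The following is a standard TSDF integration routine, not the system's actual implementation; the flat voxel layout, truncation distance, and variable names are assumptions:

```python
import numpy as np

def integrate_depth(tsdf, weights, voxel_centers, depth, K, T_wc, trunc=0.08):
    """Fuse one predicted keyframe depth map into a global TSDF volume.

    tsdf, weights: (N,) running TSDF values and integration weights
    voxel_centers: (N, 3) voxel centers in world coordinates
    depth:         (H, W) depth map predicted for the keyframe
    K:             (3, 3) camera intrinsics
    T_wc:          (4, 4) world-from-camera pose from the VIO
    """
    T_cw = np.linalg.inv(T_wc)
    pts_c = (T_cw[:3, :3] @ voxel_centers.T + T_cw[:3, 3:4]).T  # camera frame
    z = pts_c[:, 2]
    z_safe = np.where(z > 1e-6, z, 1.0)          # avoid division by zero
    uv = (K @ pts_c.T).T
    u = np.round(uv[:, 0] / z_safe).astype(int)
    v = np.round(uv[:, 1] / z_safe).astype(int)
    H, W = depth.shape
    valid = (z > 1e-6) & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    d = np.zeros_like(z)
    d[valid] = depth[v[valid], u[valid]]
    valid &= d > 0
    # Projective signed distance along the ray, truncated to [-1, 1].
    sdf = np.clip((d - z) / trunc, -1.0, 1.0)
    update = valid & (sdf > -1.0)    # skip voxels far behind the surface
    w_new = weights[update] + 1.0    # running weighted average per voxel
    tsdf[update] = (tsdf[update] * weights[update] + sdf[update]) / w_new
    weights[update] = w_new
    return tsdf, weights
```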
Related papers
- SeMLaPS: Real-time Semantic Mapping with Latent Prior Networks and Quasi-Planar Segmentation (2023-06-28)
We present a new methodology for real-time semantic mapping from RGB-D sequences.
It combines a 2D neural network and a 3D network based on a SLAM system with 3D occupancy mapping.
Our system achieves state-of-the-art semantic mapping quality among 2D-3D network-based systems.
- Multi-View Guided Multi-View Stereo (2022-10-20)
This paper introduces a novel deep framework for dense 3D reconstruction from multiple image frames.
Given a deep multi-view stereo network, our framework uses sparse depth hints to guide the neural network.
We evaluate our Multi-View Guided framework within a variety of state-of-the-art deep multi-view stereo networks.
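Where the dense-prior sketch above re-weights every pixel, a sparse-hint variant in the spirit of this guided framework would modulate the cost volume only at pixels that carry a hint. A minimal sketch, with shapes and the Gaussian form as assumptions rather than the paper's formulation:

```python
import torch

def guide_with_sparse_hints(cost_volume: torch.Tensor,
                            depth_hypotheses: torch.Tensor,
                            hints: torch.Tensor,
                            sigma: float = 0.1) -> torch.Tensor:
    """cost_volume: (B, D, H, W); hints: (B, H, W), zero where no hint exists."""
    d = depth_hypotheses.view(1, -1, 1, 1)    # (1, D, 1, 1)
    h = hints.unsqueeze(1)                    # (B, 1, H, W)
    gauss = torch.exp(-(d - h) ** 2 / (2.0 * sigma ** 2))
    has_hint = (h > 0).float()
    # Pixels without a hint keep their original costs; hinted pixels are
    # re-weighted toward hypotheses near the hint depth.
    return cost_volume * (1.0 - has_hint) + cost_volume * gauss * has_hint
```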
- DeepFusion: Real-Time Dense 3D Reconstruction for Monocular SLAM using Single-View Depth and Gradient Predictions (2022-07-25)
DeepFusion is capable of producing real-time dense reconstructions on a GPU.
It fuses the output of a semi-dense multi-view stereo algorithm with the depth and gradient predictions of a CNN in a probabilistic fashion.
Based on its performance on synthetic and real-world datasets, we demonstrate that DeepFusion is capable of performing at least as well as other comparable systems.
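The "probabilistic fashion" here can be read as per-pixel uncertainty-weighted averaging. The following is one simple instance (inverse-variance Gaussian fusion), not DeepFusion's actual formulation; variable names are illustrative:

```python
import numpy as np

def fuse_depths(d_mvs, var_mvs, d_cnn, var_cnn):
    """Per-pixel Gaussian fusion of semi-dense MVS depth with CNN depth.

    d_mvs contains NaN where the semi-dense method has no estimate;
    there the CNN prediction is used on its own.
    """
    w_mvs = 1.0 / var_mvs
    w_cnn = 1.0 / var_cnn
    fused = (w_mvs * d_mvs + w_cnn * d_cnn) / (w_mvs + w_cnn)
    return np.where(np.isnan(d_mvs), d_cnn, fused)
```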
- 3DVNet: Multi-View Depth Prediction and Volumetric Refinement (2021-12-01)
3DVNet is a novel multi-view stereo (MVS) depth-prediction method.
Our key idea is the use of a 3D scene-modeling network that iteratively updates a set of coarse depth predictions.
We show that our method exceeds state-of-the-art accuracy in both depth prediction and 3D reconstruction metrics.
- TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo (2021-11-14)
We present TANDEM, a real-time monocular tracking and dense mapping framework.
For pose estimation, TANDEM performs photometric bundle adjustment on a sliding window of keyframes.
TANDEM shows state-of-the-art real-time 3D reconstruction performance.
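Photometric bundle adjustment minimizes intensity differences rather than feature reprojection errors. A minimal sketch of the per-pixel residual; the pinhole model, nearest-neighbour lookup, and all names are assumptions, not TANDEM's code:

```python
import numpy as np

def photometric_residual(I_ref, I_tgt, u, v, inv_depth, K, T_tr):
    """Photometric residual for pixel (u, v) of the reference keyframe.

    inv_depth: estimated inverse depth of the pixel
    T_tr:      (4, 4) target-from-reference relative pose
    """
    # Back-project the reference pixel into 3D using its inverse depth.
    p_ref = np.linalg.inv(K) @ np.array([u, v, 1.0]) / inv_depth
    # Transform into the target frame and project with the pinhole model.
    p_tgt = T_tr[:3, :3] @ p_ref + T_tr[:3, 3]
    proj = K @ p_tgt
    ut, vt = proj[0] / proj[2], proj[1] / proj[2]
    # Intensity difference; a real system interpolates sub-pixel values and
    # models exposure changes, occlusion, and a robust loss.
    return float(I_tgt[int(round(vt)), int(round(ut))]) - float(I_ref[v, u])
```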
- VolumeFusion: Deep Depth Fusion for 3D Scene Reconstruction (2021-08-19)
In this paper, we advocate that replicating the traditional two-stage framework with deep neural networks improves both the interpretability and the accuracy of the results.
Our network operates in two steps: 1) the computation of local depth maps with a deep MVS technique, and 2) the fusion of the depth maps and image features to build a single TSDF volume.
In order to improve the matching performance between images acquired from very different viewpoints, we introduce a rotation-invariant 3D convolution kernel called PosedConv.
- PLADE-Net: Towards Pixel-Level Accuracy for Self-Supervised Single-View Depth Estimation with Neural Positional Encoding and Distilled Matting Loss (2021-03-12)
We propose a self-supervised single-view pixel-level accurate depth estimation network, called PLADE-Net.
Our method shows unprecedented accuracy levels, exceeding 95% in terms of the $\delta_1$ metric on the KITTI dataset.
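For reference, the $\delta_1$ accuracy cited here is the standard depth metric: the fraction of pixels whose predicted depth is within a factor of 1.25 of ground truth. A small helper implementing this common definition (not PLADE-Net code):

```python
import numpy as np

def delta1(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    """Fraction of pixels with max(pred/gt, gt/pred) < 1.25."""
    valid = gt > eps
    ratio = np.maximum(pred[valid] / gt[valid], gt[valid] / pred[valid])
    return float((ratio < 1.25).mean())
```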
- OmniSLAM: Omnidirectional Localization and Dense Mapping for Wide-baseline Multi-camera Systems (2020-03-18)
We present an omnidirectional localization and dense mapping system for a wide-baseline multiview stereo setup with ultra-wide field-of-view (FOV) fisheye cameras.
For more practical and accurate reconstruction, we first introduce improved and lightweight deep neural networks for omnidirectional depth estimation.
We integrate our omnidirectional depth estimates into the visual odometry (VO) and add a loop closing module for global consistency.