A Hybrid Sparse-Dense Monocular SLAM System for Autonomous Driving
- URL: http://arxiv.org/abs/2108.07736v1
- Date: Tue, 17 Aug 2021 16:13:01 GMT
- Title: A Hybrid Sparse-Dense Monocular SLAM System for Autonomous Driving
- Authors: Louis Gallagher, Varun Ravi Kumar, Senthil Yogamani and John B.
McDonald
- Abstract summary: We reconstruct a dense 3D model of the geometry of an outdoor environment using a single monocular camera attached to a moving vehicle.
Our system employs dense depth prediction with a hybrid mapping architecture combining state-of-the-art sparse features and dense fusion-based visual SLAM algorithms.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present a system for incrementally reconstructing a dense
3D model of the geometry of an outdoor environment using a single monocular
camera attached to a moving vehicle. Dense models provide a rich representation
of the environment facilitating higher-level scene understanding, perception,
and planning. Our system employs dense depth prediction with a hybrid mapping
architecture combining state-of-the-art sparse features and dense fusion-based
visual SLAM algorithms within an integrated framework. Our novel contributions
include design of hybrid sparse-dense camera tracking and loop closure, and
scale estimation improvements in dense depth prediction. We use the motion
estimates from the sparse method to overcome the large and variable inter-frame
displacement typical of outdoor vehicle scenarios. Our system then registers
the live image with the dense model using whole-image alignment. This enables
the fusion of the live frame and dense depth prediction into the model. Global
consistency and alignment between the sparse and dense models are achieved by
applying pose constraints from the sparse method directly within the
deformation of the dense model. We provide qualitative and quantitative results
for both trajectory estimation and surface reconstruction accuracy,
demonstrating competitive performance on the KITTI dataset. Qualitative results
of the proposed approach are illustrated in https://youtu.be/Pn2uaVqjskY.
Source code for the project is publicly available at the following repository
https://github.com/robotvisionmu/DenseMonoSLAM.
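The tracking pipeline the abstract describes, a sparse motion estimate to bridge the large inter-frame displacement, followed by whole-image alignment of the live frame against the dense model, can be sketched roughly as follows. This is an illustrative toy using plain SE(3) pose composition; the function names, signatures, and the stand-in trackers are assumptions for exposition, not the authors' actual API.

```python
import numpy as np

def se3(R=np.eye(3), t=np.zeros(3)):
    """Build a 4x4 SE(3) pose from a rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def sparse_track(prev_pose, odometry):
    """Stand-in for the sparse feature-based tracker: supplies a coarse
    inter-frame motion estimate that is robust to large displacements."""
    return prev_pose @ odometry

def dense_refine(coarse_pose, correction):
    """Stand-in for whole-image alignment against the dense model: applies
    a small correction that registers the live frame with the dense map."""
    return coarse_pose @ correction

def hybrid_track(prev_pose, odometry, correction):
    # 1) Sparse method bridges the large, variable inter-frame motion.
    coarse = sparse_track(prev_pose, odometry)
    # 2) Dense whole-image alignment refines the pose onto the dense model,
    #    enabling fusion of the live frame and its depth prediction.
    return dense_refine(coarse, correction)
```

The point of the structure is the initialization order: direct whole-image alignment alone diverges under the large baselines typical of vehicle motion, so the sparse estimate seeds the dense refinement.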
Related papers
- Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata [70.9375320609781]
We aim to generate fine-grained 3D geometry from large-scale sparse LiDAR scans, which are abundantly captured by autonomous vehicles (AVs).
We propose hierarchical Generative Cellular Automata (hGCA), a spatially scalable 3D generative model that grows geometry in a coarse-to-fine manner using local kernels, equipped with a light-weight planner to induce global consistency.
arXiv Detail & Related papers (2024-06-12T14:56:56Z)
- MV-DeepSDF: Implicit Modeling with Multi-Sweep Point Clouds for 3D Vehicle Reconstruction in Autonomous Driving [25.088617195439344]
We propose a novel framework, dubbed MV-DeepSDF, which estimates the optimal Signed Distance Function (SDF) shape representation from multi-sweep point clouds.
We conduct thorough experiments on two real-world autonomous driving datasets.
arXiv Detail & Related papers (2023-08-21T15:48:15Z)
- Hierarchical Latent Structure for Multi-Modal Vehicle Trajectory Forecasting [0.0]
We introduce a hierarchical latent structure into a VAE-based trajectory forecasting model.
Our model is capable of generating clear multi-modal trajectory distributions and outperforms the state-of-the-art (SOTA) models in terms of prediction accuracy.
arXiv Detail & Related papers (2022-07-11T04:52:28Z)
- TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo [55.30992853477754]
We present TANDEM, a real-time monocular tracking and dense mapping framework.
For pose estimation, TANDEM performs photometric bundle adjustment based on a sliding window of keyframes.
TANDEM shows state-of-the-art real-time 3D reconstruction performance.
arXiv Detail & Related papers (2021-11-14T19:01:02Z)
- Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection [86.25022248968908]
We learn context- and depth-aware feature representation to solve the problem of monocular 3D object detection.
We show state-of-the-art results among the monocular-based approaches on the KITTI benchmark dataset.
arXiv Detail & Related papers (2021-03-30T16:20:24Z)
- Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency [114.02182755620784]
We present an end-to-end joint training framework that explicitly models 6-DoF motion of multiple dynamic objects, ego-motion and depth in a monocular camera setup without supervision.
Our framework is shown to outperform the state-of-the-art depth and motion estimation methods.
arXiv Detail & Related papers (2021-02-04T14:26:42Z)
- A Compact Deep Architecture for Real-time Saliency Prediction [42.58396452892243]
Saliency models aim to imitate the attention mechanism in the human visual system.
Deep models have a high number of parameters which makes them less suitable for real-time applications.
Here we propose a compact yet fast model for real-time saliency prediction.
arXiv Detail & Related papers (2020-08-30T17:47:16Z)
- PerMO: Perceiving More at Once from a Single Image for Autonomous Driving [76.35684439949094]
We present a novel approach to detect, segment, and reconstruct complete textured 3D models of vehicles from a single image.
Our approach combines the strengths of deep learning and the elegance of traditional techniques.
We have integrated these algorithms with an autonomous driving system.
arXiv Detail & Related papers (2020-07-16T05:02:45Z)
- Variational State-Space Models for Localisation and Dense 3D Mapping in 6 DoF [17.698319441265223]
We solve the problem of 6-DoF localisation and 3D dense reconstruction in spatial environments as approximate Bayesian inference in a deep state-space model.
This results in an expressive predictive model of the world, often missing in current state-of-the-art visual SLAM solutions.
We evaluate our approach on realistic unmanned aerial vehicle flight data, nearing the performance of state-of-the-art visual-inertial odometry systems.
arXiv Detail & Related papers (2020-06-17T22:06:35Z)
- 6D Camera Relocalization in Ambiguous Scenes via Continuous Multimodal Inference [67.70859730448473]
We present a multimodal camera relocalization framework that captures ambiguities and uncertainties.
We predict multiple camera pose hypotheses as well as the respective uncertainty for each prediction.
We introduce a new dataset specifically designed to foster camera localization research in ambiguous environments.
arXiv Detail & Related papers (2020-04-09T20:55:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.