A Hybrid Sparse-Dense Monocular SLAM System for Autonomous Driving
- URL: http://arxiv.org/abs/2108.07736v1
- Date: Tue, 17 Aug 2021 16:13:01 GMT
- Title: A Hybrid Sparse-Dense Monocular SLAM System for Autonomous Driving
- Authors: Louis Gallagher, Varun Ravi Kumar, Senthil Yogamani and John B.
McDonald
- Abstract summary: We reconstruct a dense 3D model of the geometry of an outdoor environment using a single monocular camera attached to a moving vehicle.
Our system employs dense depth prediction with a hybrid mapping architecture combining state-of-the-art sparse features and dense fusion-based visual SLAM algorithms.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present a system for incrementally reconstructing a dense
3D model of the geometry of an outdoor environment using a single monocular
camera attached to a moving vehicle. Dense models provide a rich representation
of the environment facilitating higher-level scene understanding, perception,
and planning. Our system employs dense depth prediction with a hybrid mapping
architecture combining state-of-the-art sparse features and dense fusion-based
visual SLAM algorithms within an integrated framework. Our novel contributions
include design of hybrid sparse-dense camera tracking and loop closure, and
scale estimation improvements in dense depth prediction. We use the motion
estimates from the sparse method to overcome the large and variable inter-frame
displacement typical of outdoor vehicle scenarios. Our system then registers
the live image with the dense model using whole-image alignment. This enables
the fusion of the live frame and dense depth prediction into the model. Global
consistency and alignment between the sparse and dense models are achieved by
applying pose constraints from the sparse method directly within the
deformation of the dense model. We provide qualitative and quantitative results
for both trajectory estimation and surface reconstruction accuracy,
demonstrating competitive performance on the KITTI dataset. Qualitative results
of the proposed approach are illustrated in https://youtu.be/Pn2uaVqjskY.
Source code for the project is publicly available at the following repository
https://github.com/robotvisionmu/DenseMonoSLAM.
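The tracking pipeline the abstract describes, a sparse motion estimate to bridge the large inter-frame displacement, followed by whole-image alignment of the live frame against the dense model, can be sketched roughly as follows. This is an illustrative toy using plain SE(3) pose composition; the function names, signatures, and the stand-in trackers are assumptions for exposition, not the authors' actual API.

```python
import numpy as np

def se3(R=np.eye(3), t=np.zeros(3)):
    """Build a 4x4 SE(3) pose from a rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def sparse_track(prev_pose, odometry):
    """Stand-in for the sparse feature-based tracker: supplies a coarse
    inter-frame motion estimate that is robust to large displacements."""
    return prev_pose @ odometry

def dense_refine(coarse_pose, correction):
    """Stand-in for whole-image alignment against the dense model: applies
    a small correction that registers the live frame with the dense map."""
    return coarse_pose @ correction

def hybrid_track(prev_pose, odometry, correction):
    # 1) Sparse method bridges the large, variable inter-frame motion.
    coarse = sparse_track(prev_pose, odometry)
    # 2) Dense whole-image alignment refines the pose onto the dense model,
    #    enabling fusion of the live frame and its depth prediction.
    return dense_refine(coarse, correction)
```

The point of the structure is the initialization order: direct whole-image alignment alone diverges under the large baselines typical of vehicle motion, so the sparse estimate seeds the dense refinement.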
Related papers
- Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata [70.9375320609781]
We aim to generate fine-grained 3D geometry from large-scale sparse LiDAR scans, which are abundantly captured by autonomous vehicles (AVs).
We propose hierarchical Generative Cellular Automata (hGCA), a spatially scalable 3D generative model that grows geometry in a coarse-to-fine manner using local kernels, equipped with a light-weight planner to induce global consistency.
arXiv Detail & Related papers (2024-06-12T14:56:56Z)
- MV-DeepSDF: Implicit Modeling with Multi-Sweep Point Clouds for 3D Vehicle Reconstruction in Autonomous Driving [25.088617195439344]
We propose a novel framework, dubbed MV-DeepSDF, which estimates the optimal Signed Distance Function (SDF) shape representation from multi-sweep point clouds.
We conduct thorough experiments on two real-world autonomous driving datasets.
arXiv Detail & Related papers (2023-08-21T15:48:15Z)
- Hierarchical Latent Structure for Multi-Modal Vehicle Trajectory Forecasting [0.0]
We introduce a hierarchical latent structure into a VAE-based trajectory forecasting model.
Our model is capable of generating clear multi-modal trajectory distributions and outperforms the state-of-the-art (SOTA) models in terms of prediction accuracy.
arXiv Detail & Related papers (2022-07-11T04:52:28Z)
- TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo [55.30992853477754]
We present TANDEM, a real-time monocular tracking and dense mapping framework.
For pose estimation, TANDEM performs photometric bundle adjustment based on a sliding window of keyframes.
TANDEM shows state-of-the-art real-time 3D reconstruction performance.
arXiv Detail & Related papers (2021-11-14T19:01:02Z)
- Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection [86.25022248968908]
We learn context- and depth-aware feature representation to solve the problem of monocular 3D object detection.
We show state-of-the-art results among the monocular-based approaches on the KITTI benchmark dataset.
arXiv Detail & Related papers (2021-03-30T16:20:24Z)
- Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency [114.02182755620784]
We present an end-to-end joint training framework that explicitly models 6-DoF motion of multiple dynamic objects, ego-motion and depth in a monocular camera setup without supervision.
Our framework is shown to outperform the state-of-the-art depth and motion estimation methods.
arXiv Detail & Related papers (2021-02-04T14:26:42Z)
- A Compact Deep Architecture for Real-time Saliency Prediction [42.58396452892243]
Saliency models aim to imitate the attention mechanism in the human visual system.
Deep models have a high number of parameters which makes them less suitable for real-time applications.
Here we propose a compact yet fast model for real-time saliency prediction.
arXiv Detail & Related papers (2020-08-30T17:47:16Z)
- PerMO: Perceiving More at Once from a Single Image for Autonomous Driving [76.35684439949094]
We present a novel approach to detect, segment, and reconstruct complete textured 3D models of vehicles from a single image.
Our approach combines the strengths of deep learning and the elegance of traditional techniques.
We have integrated these algorithms with an autonomous driving system.
arXiv Detail & Related papers (2020-07-16T05:02:45Z)
- Variational State-Space Models for Localisation and Dense 3D Mapping in 6 DoF [17.698319441265223]
We solve the problem of 6-DoF localisation and 3D dense reconstruction in spatial environments as approximate Bayesian inference in a deep state-space model.
This results in an expressive predictive model of the world, often missing in current state-of-the-art visual SLAM solutions.
We evaluate our approach on realistic unmanned aerial vehicle flight data, nearing the performance of state-of-the-art visual-inertial odometry systems.
arXiv Detail & Related papers (2020-06-17T22:06:35Z)
- 6D Camera Relocalization in Ambiguous Scenes via Continuous Multimodal Inference [67.70859730448473]
We present a multimodal camera relocalization framework that captures ambiguities and uncertainties.
We predict multiple camera pose hypotheses as well as the respective uncertainty for each prediction.
We introduce a new dataset specifically designed to foster camera localization research in ambiguous environments.
arXiv Detail & Related papers (2020-04-09T20:55:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.