Voxel Map for Visual SLAM
- URL: http://arxiv.org/abs/2003.02247v1
- Date: Wed, 4 Mar 2020 18:39:14 GMT
- Title: Voxel Map for Visual SLAM
- Authors: Manasi Muglikar, Zichao Zhang and Davide Scaramuzza
- Abstract summary: We propose a voxel-map representation to efficiently map points for visual SLAM.
Points retrieved with our method are geometrically guaranteed to fall in the camera field-of-view, and occluded points can be identified and removed to a certain extent.
Experimental results show that our voxel map representation is as efficient as a keyframe map with 5 keyframes and provides significantly higher localization accuracy (average 46% improvement in RMSE) on the EuRoC dataset.
- Score: 57.07800982410967
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In modern visual SLAM systems, it is a standard practice to retrieve
potential candidate map points from overlapping keyframes for further feature
matching or direct tracking. In this work, we argue that keyframes are not the
optimal choice for this task, due to several inherent limitations, such as weak
geometric reasoning and poor scalability. We propose a voxel-map representation
to efficiently retrieve map points for visual SLAM. In particular, we organize
the map points in a regular voxel grid. Visible points from a camera pose are
queried by sampling the camera frustum in a raycasting manner, which can be
done in constant time using an efficient voxel hashing method. Compared with
keyframes, the points retrieved with our method are geometrically guaranteed
to fall in the camera field-of-view, and occluded points can be identified and
removed to a certain extent. This method also naturally scales up to large
scenes and complicated multi-camera configurations. Experimental results show
that our voxel map representation is as efficient as a keyframe map with 5
keyframes and provides significantly higher localization accuracy (average 46%
improvement in RMSE) on the EuRoC dataset. The proposed voxel-map
representation is a general approach to a fundamental functionality in visual
SLAM and widely applicable.
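As a concrete illustration of the retrieval pipeline described in the abstract, below is a minimal Python sketch of a hashed voxel map with a frustum raycasting query. The class name, voxel size, pixel stride, and the simple depth-marching loop are illustrative assumptions rather than the authors' implementation; only the overall idea (a hash from integer voxel coordinates to map points, queried by casting rays through the camera frustum) follows the abstract.

```python
# Minimal sketch of a hashed voxel map with a frustum raycasting query.
# All names and parameters here (VoxelMap, voxel_size, stride, the simple
# depth-marching loop) are illustrative assumptions, not the paper's code.
import numpy as np
from collections import defaultdict


class VoxelMap:
    def __init__(self, voxel_size=0.2):
        self.voxel_size = voxel_size
        # Spatial hash: integer voxel coordinate -> list of 3D map points.
        # Dictionary lookup gives the constant-time access the abstract
        # attributes to voxel hashing.
        self.voxels = defaultdict(list)

    def _key(self, p):
        return tuple(np.floor(p / self.voxel_size).astype(int))

    def insert(self, point):
        point = np.asarray(point, dtype=float)
        self.voxels[self._key(point)].append(point)

    def query_frustum(self, T_wc, K, width, height, max_depth=10.0, stride=32):
        """Return map points visible from camera pose T_wc (4x4, camera-to-world).

        The frustum is sampled by casting a ray through every `stride`-th
        pixel; each ray is marched in voxel-size steps and stops at the first
        occupied voxel, so points hidden behind it are (approximately) culled.
        """
        K_inv = np.linalg.inv(K)
        R, t = T_wc[:3, :3], T_wc[:3, 3]
        visible, seen = [], set()
        for v in range(0, height, stride):
            for u in range(0, width, stride):
                d = R @ (K_inv @ np.array([u, v, 1.0]))  # ray in world frame
                d /= np.linalg.norm(d)
                depth = self.voxel_size
                while depth < max_depth:
                    key = self._key(t + depth * d)
                    if key in self.voxels:
                        if key not in seen:
                            seen.add(key)
                            visible.extend(self.voxels[key])
                        break  # occupied voxel occludes everything behind it
                    depth += self.voxel_size
        return visible


# Toy usage: one landmark about 2 m in front of a camera at the origin.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
vmap = VoxelMap(voxel_size=0.2)
vmap.insert([0.05, 0.05, 2.05])
print(vmap.query_frustum(np.eye(4), K, 640, 480))  # -> [array([0.05, 0.05, 2.05])]
```

Stopping each ray at the first occupied voxel is what lets the query drop some occluded points, mirroring the abstract's claim that occlusion can be handled "to a certain extent".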
Related papers
- FaVoR: Features via Voxel Rendering for Camera Relocalization [23.7893950095252]
Camera relocalization methods range from dense image alignment to direct camera pose regression from a query image.
We propose a novel approach that leverages a globally sparse yet locally dense 3D representation of 2D features.
By tracking and triangulating landmarks over a sequence of frames, we construct a sparse voxel map optimized to render image patch descriptors observed during tracking.
arXiv Detail & Related papers (2024-09-11T18:58:16Z)
- Representing 3D sparse map points and lines for camera relocalization [1.2974519529978974]
We show how a lightweight neural network can learn to represent both 3D point and line features.
In tests, our method achieves the largest improvement over state-of-the-art learning-based methods.
arXiv Detail & Related papers (2024-02-28T03:07:05Z)
- Neural Implicit Dense Semantic SLAM [83.04331351572277]
We propose a novel RGBD vSLAM algorithm that learns a memory-efficient, dense 3D geometry and semantic segmentation of an indoor scene in an online manner.
Our pipeline combines classical 3D vision-based tracking and loop closing with neural fields-based mapping.
Our proposed algorithm can greatly enhance scene perception and assist with a range of robot control problems.
arXiv Detail & Related papers (2023-04-27T23:03:52Z)
- Dense RGB SLAM with Neural Implicit Maps [34.37572307973734]
We present a dense RGB SLAM method with neural implicit map representation.
Our method simultaneously solves the camera motion and the neural implicit map by matching the rendered and input video frames.
Our method achieves more favorable results than previous methods and even surpasses some recent RGB-D SLAM methods.
arXiv Detail & Related papers (2023-01-21T09:54:07Z)
- HPointLoc: Point-based Indoor Place Recognition using Synthetic RGB-D Images [58.720142291102135]
We present a novel dataset named HPointLoc, specially designed for exploring the capabilities of visual place recognition in indoor environments.
The dataset is based on the popular Habitat simulator, in which indoor scenes can be generated using both its own sensor data and open datasets.
arXiv Detail & Related papers (2022-12-30T12:20:56Z)
- Pixel-Perfect Structure-from-Motion with Featuremetric Refinement [96.73365545609191]
We refine two key steps of structure-from-motion by a direct alignment of low-level image information from multiple views.
This significantly improves the accuracy of camera poses and scene geometry for a wide range of keypoint detectors.
Our system easily scales to large image collections, enabling pixel-perfect crowd-sourced localization at scale.
arXiv Detail & Related papers (2021-08-18T17:58:55Z)
- Object-Augmented RGB-D SLAM for Wide-Disparity Relocalisation [3.888848425698769]
We propose a novel object-augmented RGB-D SLAM system that is capable of constructing a consistent object map and performing relocalisation based on centroids of objects in the map.
arXiv Detail & Related papers (2021-08-05T11:02:25Z)
- Accurate Grid Keypoint Learning for Efficient Video Prediction [87.71109421608232]
Keypoint-based video prediction methods can consume substantial computing resources in training and deployment.
In this paper, we design a new grid keypoint learning framework, aiming at a robust and explainable intermediate keypoint representation for long-term efficient video prediction.
Our method outperforms state-of-the-art video prediction methods while saving more than 98% of computing resources.
arXiv Detail & Related papers (2021-07-28T05:04:30Z)
- RPVNet: A Deep and Efficient Range-Point-Voxel Fusion Network for LiDAR Point Cloud Segmentation [28.494690309193068]
We propose a novel range-point-voxel fusion network, namely RPVNet.
In this network, we devise a deep fusion framework with multiple and mutual information interactions among these three views.
By leveraging this efficient interaction and a relatively lower voxel resolution, our method is also shown to be more efficient.
arXiv Detail & Related papers (2021-03-24T04:24:12Z)
- Graph-PCNN: Two Stage Human Pose Estimation with Graph Pose Refinement [54.29252286561449]
We propose a two-stage graph-based and model-agnostic framework, called Graph-PCNN.
In the first stage, a heatmap regression network is applied to obtain a rough localization result, and a set of proposal keypoints, called guided points, is sampled.
In the second stage, a different visual feature is extracted for each guided point at its location.
The relationship between guided points is explored by the graph pose refinement module to get more accurate localization results.
arXiv Detail & Related papers (2020-07-21T04:59:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.