DXSLAM: A Robust and Efficient Visual SLAM System with Deep Features
- URL: http://arxiv.org/abs/2008.05416v1
- Date: Wed, 12 Aug 2020 16:14:46 GMT
- Title: DXSLAM: A Robust and Efficient Visual SLAM System with Deep Features
- Authors: Dongjiang Li, Xuesong Shi, Qiwei Long, Shenghui Liu, Wei Yang, Fangshi
Wang, Qi Wei, Fei Qiao
- Abstract summary: This paper shows that feature extraction with deep convolutional neural networks (CNNs) can be seamlessly incorporated into a modern SLAM framework.
The proposed SLAM system utilizes a state-of-the-art CNN to detect keypoints in each image frame, and to give not only keypoint descriptors, but also a global descriptor of the whole image.
- Score: 5.319556638040589
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A robust and efficient Simultaneous Localization and Mapping (SLAM) system is
essential for robot autonomy. For visual SLAM algorithms, though the
theoretical framework has been well established for most aspects, feature
extraction and association are still empirically designed in most cases and can
be vulnerable in complex environments. This paper shows that feature extraction
with deep convolutional neural networks (CNNs) can be seamlessly incorporated
into a modern SLAM framework. The proposed SLAM system utilizes a
state-of-the-art CNN to detect keypoints in each image frame, and to give not
only keypoint descriptors, but also a global descriptor of the whole image.
These local and global features are then used by different SLAM modules,
resulting in much more robustness against environmental changes and viewpoint
changes compared with using hand-crafted features. We also train a visual
vocabulary of local features with a Bag of Words (BoW) method. Based on the
local features, global features, and the vocabulary, a highly reliable loop
closure detection method is built. Experimental results show that all the
proposed modules significantly outperform the baseline, and the full system
achieves much lower trajectory errors and much higher correct rates on all
evaluated data. Furthermore, by optimizing the CNN with Intel OpenVINO toolkit
and utilizing the Fast BoW library, the system benefits greatly from the SIMD
(single-instruction-multiple-data) techniques in modern CPUs. The full system
can run in real-time without any GPU or other accelerators. The code is public
at https://github.com/ivipsourcecode/dxslam.
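The abstract describes training a visual vocabulary of local features with a Bag of Words (BoW) method and quantizing each frame's descriptors against it. The sketch below illustrates that general idea only: a toy k-means vocabulary plus histogram quantization in numpy. All function names, the vocabulary size, and the clustering details are illustrative assumptions, not the paper's actual pipeline (the real system uses a dedicated Fast BoW library).

```python
import numpy as np

def build_vocabulary(descriptors, k=8, iters=10, seed=0):
    """Toy k-means over local descriptors; a stand-in for offline
    BoW vocabulary training (illustrative only, not the paper's method)."""
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), k, replace=False)].copy()
    for _ in range(iters):
        # Assign each descriptor to its nearest cluster center.
        dists = np.linalg.norm(descriptors[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center as the mean of its assigned descriptors.
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers

def bow_histogram(descriptors, vocab):
    """Quantize one frame's local descriptors into a normalized BoW vector."""
    dists = np.linalg.norm(descriptors[:, None, :] - vocab[None, :, :], axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocab)).astype(np.float64)
    total = hist.sum()
    return hist / total if total > 0 else hist
```

Frames can then be compared by the distance between their BoW vectors, which is the basis for candidate retrieval in loop closure detection.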
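The abstract also states that a global descriptor of the whole image is used, alongside local features, for reliable loop closure detection. A minimal sketch of one common way to use such a descriptor is to rank keyframes by cosine similarity and keep those above a threshold as loop candidates; the function name and threshold below are assumptions for illustration, not details from the paper.

```python
import numpy as np

def loop_candidates(query_desc, keyframe_descs, sim_thresh=0.8):
    """Rank stored keyframes by cosine similarity of global descriptors.

    query_desc:     (d,) global descriptor of the current frame
    keyframe_descs: (n, d) global descriptors of stored keyframes
    Returns (index, similarity) pairs above sim_thresh, best first.
    """
    q = query_desc / np.linalg.norm(query_desc)
    K = keyframe_descs / np.linalg.norm(keyframe_descs, axis=1, keepdims=True)
    sims = K @ q  # cosine similarity of each keyframe to the query
    ranked = np.argsort(-sims)
    return [(int(i), float(sims[i])) for i in ranked if sims[i] >= sim_thresh]
```

In a full system, candidates retrieved this way would still be verified with local feature matching before a loop constraint is added to the pose graph.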
Related papers
- Loopy-SLAM: Dense Neural SLAM with Loop Closures [53.11936461015725]
We introduce Loopy-SLAM, which globally optimizes poses and the dense 3D model.
We perform frame-to-model tracking with a data-driven point-based submap generation method and trigger loop closures online via global place recognition.
Evaluation on the synthetic Replica and real-world TUM-RGBD and ScanNet datasets demonstrates competitive or superior performance in tracking, mapping, and rendering accuracy when compared to existing dense neural RGBD SLAM methods.
arXiv Detail & Related papers (2024-02-14T18:18:32Z)
- DK-SLAM: Monocular Visual SLAM with Deep Keypoint Learning, Tracking and Loop-Closing [13.50980509878613]
Our system employs a Model-Agnostic Meta-Learning (MAML) strategy to optimize the training of keypoint extraction networks.
To mitigate cumulative positioning errors, DK-SLAM incorporates a novel online learning module that utilizes binary features for loop closure detection.
Experimental evaluations on publicly available datasets demonstrate that DK-SLAM outperforms leading traditional and learning-based SLAM systems.
arXiv Detail & Related papers (2024-01-17T12:08:30Z)
- PIN-SLAM: LiDAR SLAM Using a Point-Based Implicit Neural Representation for Achieving Global Map Consistency [30.5868776990673]
PIN-SLAM is a system for building globally consistent maps based on an elastic and compact point-based implicit neural map representation.
Our implicit map is based on sparse optimizable neural points, which are inherently elastic and deformable with the global pose adjustment when closing a loop.
PIN-SLAM achieves pose estimation accuracy better or on par with the state-of-the-art LiDAR odometry or SLAM systems.
arXiv Detail & Related papers (2024-01-17T10:06:12Z)
- Real-time Local Feature with Global Visual Information Enhancement [6.640269424085467]
Current deep learning-based local feature algorithms typically adopt convolutional neural network (CNN) architectures with limited receptive fields.
The proposed method introduces a global enhancement module to fuse global visual clues in a light-weight network.
Experiments on public benchmarks demonstrate that the proposed method achieves considerable robustness against visual interference while running in real time.
arXiv Detail & Related papers (2022-11-20T13:44:20Z)
- Using Detection, Tracking and Prediction in Visual SLAM to Achieve Real-time Semantic Mapping of Dynamic Scenarios [70.70421502784598]
RDS-SLAM can build semantic maps at object level for dynamic scenarios in real time using only one commonly used Intel Core i7 CPU.
We evaluate RDS-SLAM in TUM RGB-D dataset, and experimental results show that RDS-SLAM can run with 30.3 ms per frame in dynamic scenarios.
arXiv Detail & Related papers (2022-10-10T11:03:32Z)
- ZippyPoint: Fast Interest Point Detection, Description, and Matching through Mixed Precision Discretization [71.91942002659795]
We investigate and adapt network quantization techniques to accelerate inference and enable its use on compute limited platforms.
ZippyPoint, our efficient quantized network with binary descriptors, improves the network runtime speed, the descriptor matching speed, and the 3D model size.
These improvements come at a minor performance degradation as evaluated on the tasks of homography estimation, visual localization, and map-free visual relocalization.
arXiv Detail & Related papers (2022-03-07T18:59:03Z)
- NICE-SLAM: Neural Implicit Scalable Encoding for SLAM [112.6093688226293]
NICE-SLAM is a dense SLAM system that incorporates multi-level local information by introducing a hierarchical scene representation.
Compared to recent neural implicit SLAM systems, our approach is more scalable, efficient, and robust.
arXiv Detail & Related papers (2021-12-22T18:45:44Z)
- Greedy-Based Feature Selection for Efficient LiDAR SLAM [12.257338124961622]
This paper demonstrates that actively selecting a subset of features significantly improves both the accuracy and efficiency of an L-SLAM system.
We show that our approach exhibits low localization error and speedup compared to the state-of-the-art L-SLAM systems.
arXiv Detail & Related papers (2021-03-24T11:03:16Z)
- OV$^{2}$SLAM: A Fully Online and Versatile Visual SLAM for Real-Time Applications [59.013743002557646]
We describe OV$^{2}$SLAM, a fully online algorithm handling both monocular and stereo camera setups, various map scales, and frame rates ranging from a few Hertz up to several hundred.
For the benefit of the community, we release the source code: https://github.com/ov2slam/ov2slam.
arXiv Detail & Related papers (2021-02-08T08:39:23Z)
- Dense Residual Network: Enhancing Global Dense Feature Flow for Character Recognition [75.4027660840568]
This paper explores how to enhance the local and global dense feature flow by exploiting hierarchical features fully from all the convolution layers.
Technically, we propose an efficient and effective CNN framework, i.e., Fast Dense Residual Network (FDRN) for text recognition.
arXiv Detail & Related papers (2020-01-23T06:55:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.