NGD-SLAM: Towards Real-Time Dynamic SLAM without GPU
- URL: http://arxiv.org/abs/2405.07392v2
- Date: Mon, 16 Sep 2024 09:33:17 GMT
- Title: NGD-SLAM: Towards Real-Time Dynamic SLAM without GPU
- Authors: Yuhao Zhang, Mihai Bujanca, Mikel Luján
- Abstract summary: This paper proposes an open-source real-time dynamic SLAM system that runs solely on CPU by incorporating a mask prediction mechanism.
Our system maintains high localization accuracy in dynamic environments while achieving a tracking frame rate of 56 FPS on a laptop CPU.
- Score: 4.959552873584984
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing SLAM (Simultaneous Localization and Mapping) algorithms have achieved remarkable localization accuracy in dynamic environments by using deep learning techniques to identify dynamic objects. However, they usually require GPUs to operate in real-time. Therefore, this paper proposes an open-source real-time dynamic SLAM system that runs solely on CPU by incorporating a mask prediction mechanism, which allows the deep learning method and the camera tracking to run entirely in parallel at different frequencies. Our SLAM system further introduces a dual-stage optical flow tracking approach and employs a hybrid usage of optical flow and ORB features, enhancing efficiency and robustness by selectively allocating computational resources to input frames. Compared with previous methods, our system maintains high localization accuracy in dynamic environments while achieving a tracking frame rate of 56 FPS on a laptop CPU, proving that deep learning methods are feasible for dynamic SLAM without GPU support. To the best of our knowledge, this is the first SLAM system to achieve this.
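The core mechanism described in the abstract, letting the segmentation network and the camera tracker run in parallel at different frequencies and bridging the gap by predicting the current dynamic mask from the last available one, can be illustrated with a minimal sketch. This is an assumption-laden illustration rather than the authors' implementation: the class and function names are invented, the segmentation callable is a placeholder, and dense Farneback optical flow stands in for whatever propagation the paper actually uses.
```python
import threading
import queue
import numpy as np
import cv2

class MaskPredictor:
    """Sketch of a slow segmentation thread plus a fast mask-prediction step,
    so camera tracking never blocks on the neural network (illustrative only)."""

    def __init__(self, segment_fn):
        self.segment_fn = segment_fn          # any per-frame segmentation callable (assumed)
        self.latest = None                    # (gray_frame, binary_mask) from the slow thread
        self.inbox = queue.Queue(maxsize=1)   # only the newest frame waits for segmentation
        self.lock = threading.Lock()
        threading.Thread(target=self._segment_loop, daemon=True).start()

    def _segment_loop(self):
        # Runs at whatever rate the network allows (e.g. a few Hz on CPU).
        while True:
            gray = self.inbox.get()
            mask = self.segment_fn(gray)      # slow: deep-learning dynamic-object mask
            with self.lock:
                self.latest = (gray, mask)

    def predict_mask(self, gray):
        # Called from the tracking loop at full frame rate.
        if not self.inbox.full():
            self.inbox.put_nowait(gray)       # hand the frame to the slow thread, no waiting
        with self.lock:
            latest = self.latest
        if latest is None:
            return np.zeros(gray.shape, np.uint8)
        seg_gray, seg_mask = latest
        # Warp the (possibly stale) mask to the current frame with dense optical flow.
        flow = cv2.calcOpticalFlowFarneback(gray, seg_gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        h, w = gray.shape
        gx, gy = np.meshgrid(np.arange(w), np.arange(h))
        map_x = (gx + flow[..., 0]).astype(np.float32)
        map_y = (gy + flow[..., 1]).astype(np.float32)
        return cv2.remap(seg_mask, map_x, map_y, cv2.INTER_NEAREST)
```
A tracking front end would call predict_mask on every frame and discard ORB features or optical-flow points falling inside the predicted dynamic mask before pose estimation, which is what allows tracking and segmentation to run at different frequencies.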
Related papers
- LEG-SLAM: Real-Time Language-Enhanced Gaussian Splatting for SLAM [0.0]
LEG-SLAM is a novel approach that fuses an optimized Gaussian Splatting implementation with visual-language feature extraction.
Our method simultaneously generates high-quality photorealistic images and semantically labeled scene maps.
With its potential applications in autonomous robotics, augmented reality, and other interactive domains, LEG-SLAM represents a significant step forward in real-time semantic 3D Gaussian-based SLAM.
arXiv Detail & Related papers (2025-06-03T16:51:59Z) - Towards Low-Latency Event Stream-based Visual Object Tracking: A Slow-Fast Approach [32.91982063297922]
We propose a novel Slow-Fast Tracking paradigm that flexibly adapts to different operational requirements, termed SFTrack.
The proposed framework supports two complementary modes, i.e., a high-precision slow tracker for scenarios with sufficient computational resources, and an efficient fast tracker tailored for latency-aware, resource-constrained environments.
Our framework first performs graph-based representation learning from high-temporal-resolution event streams, and then integrates the learned graph-structured information into two FlashAttention-based vision backbones.
arXiv Detail & Related papers (2025-05-19T09:37:23Z) - WildGS-SLAM: Monocular Gaussian Splatting SLAM in Dynamic Environments [48.51530726697405]
We present WildGS-SLAM, a robust and efficient monocular RGB SLAM system designed to handle dynamic environments.
We introduce an uncertainty map, predicted by a shallow multi-layer perceptron and DINOv2 features, to guide dynamic object removal during both tracking and mapping.
Results showcase WildGS-SLAM's superior performance in dynamic environments compared to state-of-the-art methods.
arXiv Detail & Related papers (2025-04-04T19:19:40Z) - GeoFlow-SLAM: A Robust Tightly-Coupled RGBD-Inertial Fusion SLAM for Dynamic Legged Robotics [12.041115472752594]
GeoFlow-SLAM is a robust and effective Tightly-Coupled RGBD-inertial SLAM for legged robots operating in highly dynamic environments.
The proposed algorithms achieve state-of-the-art performance on collected legged-robot datasets and open-source datasets.
arXiv Detail & Related papers (2025-03-18T13:35:49Z) - DK-SLAM: Monocular Visual SLAM with Deep Keypoint Learning, Tracking and Loop-Closing [13.50980509878613]
Experimental evaluations on publicly available datasets demonstrate that DK-SLAM outperforms leading traditional and learning-based SLAM systems.
Our system employs a Model-Agnostic Meta-Learning (MAML) strategy to optimize the training of keypoint extraction networks.
To mitigate cumulative positioning errors, DK-SLAM incorporates a novel online learning module that utilizes binary features for loop closure detection.
arXiv Detail & Related papers (2024-01-17T12:08:30Z) - NID-SLAM: Neural Implicit Representation-based RGB-D SLAM in dynamic environments [9.706447888754614]
We present NID-SLAM, which significantly improves the performance of neural SLAM in dynamic environments.
We propose a new approach to enhance inaccurate regions in semantic masks, particularly in marginal areas.
We also introduce a selection strategy for dynamic scenes, which enhances camera tracking robustness against large-scale objects.
arXiv Detail & Related papers (2024-01-02T12:35:03Z) - DNS SLAM: Dense Neural Semantic-Informed SLAM [92.39687553022605]
DNS SLAM is a novel neural RGB-D semantic SLAM approach featuring a hybrid representation.
Our method integrates multi-view geometry constraints with image-based feature extraction to improve appearance details.
Our experimental results achieve state-of-the-art performance on both synthetic data and real-world data tracking.
arXiv Detail & Related papers (2023-11-30T21:34:44Z) - DH-PTAM: A Deep Hybrid Stereo Events-Frames Parallel Tracking And Mapping System [1.443696537295348]
This paper presents a robust approach for a visual parallel tracking and mapping (PTAM) system that excels in challenging environments.
Our proposed method combines the strengths of heterogeneous multi-modal visual sensors, in a unified reference frame.
Our implementation's research-based Python API is publicly available on GitHub.
arXiv Detail & Related papers (2023-06-02T19:52:13Z) - NICER-SLAM: Neural Implicit Scene Encoding for RGB SLAM [111.83168930989503]
NICER-SLAM is a dense RGB SLAM system that simultaneously optimizes camera poses and a hierarchical neural implicit map representation.
We show strong performance in dense mapping, tracking, and novel view synthesis, even competitive with recent RGB-D SLAM systems.
arXiv Detail & Related papers (2023-02-07T17:06:34Z) - Using Detection, Tracking and Prediction in Visual SLAM to Achieve Real-time Semantic Mapping of Dynamic Scenarios [70.70421502784598]
RDS-SLAM can build semantic maps at object level for dynamic scenarios in real time using only one commonly used Intel Core i7 CPU.
We evaluate RDS-SLAM in TUM RGB-D dataset, and experimental results show that RDS-SLAM can run with 30.3 ms per frame in dynamic scenarios.
arXiv Detail & Related papers (2022-10-10T11:03:32Z) - D$^3$FlowSLAM: Self-Supervised Dynamic SLAM with Flow Motion Decomposition and DINO Guidance [61.14088096348959]
We introduce a self-supervised deep SLAM method that robustly operates in dynamic scenes while accurately identifying dynamic components.
We propose a dynamic update module based on this representation and develop a dense SLAM system that excels in dynamic scenarios.
arXiv Detail & Related papers (2022-07-18T17:47:39Z) - Hierarchical Neural Dynamic Policies [50.969565411919376]
We tackle the problem of generalization to unseen configurations for dynamic tasks in the real world while learning from high-dimensional image input.
We use a hierarchical deep policy learning framework called Hierarchical Neural Dynamic Policies (H-NDPs).
H-NDPs form a curriculum by learning local dynamical system-based policies on small regions in state-space.
We show that H-NDPs are easily integrated with both imitation as well as reinforcement learning setups and achieve state-of-the-art results.
arXiv Detail & Related papers (2021-07-12T17:59:58Z) - Learning to Continuously Optimize Wireless Resource in a Dynamic Environment: A Bilevel Optimization Perspective [52.497514255040514]
This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment.
We propose to build the notion of continual learning into wireless system design, so that the learning model can incrementally adapt to the new episodes.
Our design is based on a novel bilevel optimization formulation which ensures certain "fairness" across different data samples.
arXiv Detail & Related papers (2021-05-03T07:23:39Z) - OV$^{2}$SLAM: A Fully Online and Versatile Visual SLAM for Real-Time Applications [59.013743002557646]
We describe OV$^{2}$SLAM, a fully online algorithm handling both monocular and stereo camera setups, various map scales, and frame rates ranging from a few hertz up to several hundred.
For the benefit of the community, we release the source code: https://github.com/ov2slam/ov2slam.
arXiv Detail & Related papers (2021-02-08T08:39:23Z) - DynaSLAM II: Tightly-Coupled Multi-Object Tracking and SLAM [2.9822184411723645]
DynaSLAM II is a visual SLAM system for stereo and RGB-D configurations that tightly integrates the multi-object tracking capability.
We demonstrate that tracking dynamic objects not only provides rich clues for scene understanding but is also beneficial for camera tracking.
arXiv Detail & Related papers (2020-10-15T15:25:30Z) - DOT: Dynamic Object Tracking for Visual SLAM [83.69544718120167]
DOT combines instance segmentation and multi-view geometry to generate masks for dynamic objects.
To determine which objects are actually moving, DOT first segments instances of potentially dynamic objects and then, using the estimated camera motion, tracks them by minimizing the photometric reprojection error (a sketch of this check appears after this list).
Our results show that our approach significantly improves the accuracy and robustness of ORB-SLAM2, especially in highly dynamic scenes.
arXiv Detail & Related papers (2020-09-30T18:36:28Z) - DXSLAM: A Robust and Efficient Visual SLAM System with Deep Features [5.319556638040589]
This paper shows that feature extraction with deep convolutional neural networks (CNNs) can be seamlessly incorporated into a modern SLAM framework.
The proposed SLAM system utilizes a state-of-the-art CNN to detect keypoints in each image frame, and to give not only keypoint descriptors, but also a global descriptor of the whole image.
arXiv Detail & Related papers (2020-08-12T16:14:46Z)
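The DOT entry above decides whether a potentially dynamic object is actually moving by checking the photometric reprojection error of its pixels under the estimated camera motion. The minimal sketch below illustrates that kind of check for an RGB-D setting; the function name, the use of a dense depth map, and the nearest-neighbour sampling are assumptions made for illustration and do not reproduce DOT's implementation.
```python
import numpy as np

def photometric_residual(img1, img2, depth1, mask1, K, R, t):
    """Illustrative check: warp the pixels of a potentially dynamic object from
    frame 1 into frame 2 using the estimated camera motion (R, t) and a depth map,
    assuming the object is static. A large mean intensity residual suggests the
    object is really moving (sketch only, not DOT's actual code)."""
    ys, xs = np.nonzero(mask1)                       # object pixels in frame 1
    z = depth1[ys, xs]
    keep = z > 0
    ys, xs, z = ys[keep], xs[keep], z[keep]

    # Back-project to 3D in camera 1, move into camera 2, project again.
    pts1 = np.linalg.inv(K) @ np.vstack([xs, ys, np.ones_like(xs)]) * z
    pts2 = R @ pts1 + t.reshape(3, 1)
    front = pts2[2] > 1e-6                           # keep points in front of camera 2
    pts2, xs, ys = pts2[:, front], xs[front], ys[front]
    uvw = K @ pts2
    u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]

    # Nearest-neighbour sampling of the second image at the warped locations.
    ui, vi = np.round(u).astype(int), np.round(v).astype(int)
    inside = (ui >= 0) & (ui < img2.shape[1]) & (vi >= 0) & (vi < img2.shape[0])
    if not np.any(inside):
        return np.inf
    r = img1[ys[inside], xs[inside]].astype(np.float32) \
        - img2[vi[inside], ui[inside]].astype(np.float32)
    return float(np.mean(np.abs(r)))                 # photometric reprojection error
```
In practice such a residual would be thresholded (and refined over the object's own motion) to label each segmented instance static or moving before excluding its features from camera tracking.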
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.