UP-SLAM: Adaptively Structured Gaussian SLAM with Uncertainty Prediction in Dynamic Environments
- URL: http://arxiv.org/abs/2505.22335v1
- Date: Wed, 28 May 2025 13:23:16 GMT
- Title: UP-SLAM: Adaptively Structured Gaussian SLAM with Uncertainty Prediction in Dynamic Environments
- Authors: Wancai Zheng, Linlin Ou, Jiajie He, Libo Zhou, Xinyi Yu, Yan Wei
- Abstract summary: UP-SLAM is a real-time RGB-D SLAM system for dynamic environments. It decouples tracking and mapping through a parallelized framework. Experiments suggest it outperforms state-of-the-art methods in both localization accuracy and rendering quality.
- Score: 1.4117999181375773
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent 3D Gaussian Splatting (3DGS) techniques for Visual Simultaneous Localization and Mapping (SLAM) have significantly progressed in tracking and high-fidelity mapping. However, their sequential optimization framework and sensitivity to dynamic objects limit real-time performance and robustness in real-world scenarios. We present UP-SLAM, a real-time RGB-D SLAM system for dynamic environments that decouples tracking and mapping through a parallelized framework. A probabilistic octree is employed to manage Gaussian primitives adaptively, enabling efficient initialization and pruning without hand-crafted thresholds. To robustly filter dynamic regions during tracking, we propose a training-free uncertainty estimator that fuses multi-modal residuals to estimate per-pixel motion uncertainty, achieving open-set dynamic object handling without reliance on semantic labels. Furthermore, a temporal encoder is designed to enhance rendering quality. Concurrently, low-dimensional features are efficiently transformed via a shallow multilayer perceptron to construct DINO features, which are then employed to enrich the Gaussian field and improve the robustness of uncertainty prediction. Extensive experiments on multiple challenging datasets suggest that UP-SLAM outperforms state-of-the-art methods in both localization accuracy (by 59.8%) and rendering quality (by 4.57 dB PSNR), while maintaining real-time performance and producing reusable, artifact-free static maps in dynamic environments. Project page: https://aczheng-cai.github.io/up_slam.github.io/
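For intuition, the sketch below shows one way a training-free, per-pixel motion-uncertainty map could be fused from multi-modal residuals and then used to down-weight likely-dynamic pixels during tracking. The residual names, noise scales, and squashing function are illustrative assumptions, not UP-SLAM's actual formulation.

```python
import numpy as np

def motion_uncertainty(photometric_res, depth_res, feature_res,
                       sigmas=(0.1, 0.05, 0.2)):
    """Fuse per-pixel residuals from several modalities into a motion-
    uncertainty map in [0, 1); large, consistent residuals mark pixels
    that are likely dynamic.  The noise scales in `sigmas` are assumed."""
    fused = np.zeros_like(photometric_res, dtype=np.float64)
    for r, sigma in zip((photometric_res, depth_res, feature_res), sigmas):
        fused += (r / sigma) ** 2          # normalized squared residual
    fused /= 3.0                           # average over modalities
    return 1.0 - np.exp(-fused)            # squash to [0, 1)

def weight_residual(residual, uncertainty, cutoff=0.7):
    """Down-weight (and hard-mask beyond `cutoff`) likely-dynamic pixels
    before the residual enters the tracking objective."""
    w = np.where(uncertainty > cutoff, 0.0, 1.0 - uncertainty)
    return w * residual

# Usage with random stand-in residual maps of shape (H, W).
H, W = 4, 5
rng = np.random.default_rng(0)
unc = motion_uncertainty(rng.normal(0, 0.1, (H, W)),
                         rng.normal(0, 0.05, (H, W)),
                         rng.normal(0, 0.2, (H, W)))
robust = weight_residual(rng.normal(0, 0.1, (H, W)), unc)
```

A full system would presumably also use such a map to gate Gaussian initialization, so that dynamic regions do not pollute the static map.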
Related papers
- WildGS-SLAM: Monocular Gaussian Splatting SLAM in Dynamic Environments [48.51530726697405]
We present WildGS-SLAM, a robust and efficient monocular RGB SLAM system designed to handle dynamic environments. We introduce an uncertainty map, predicted by a shallow multi-layer perceptron and DINOv2 features, to guide dynamic object removal during both tracking and mapping. Results showcase WildGS-SLAM's superior performance in dynamic environments compared to state-of-the-art methods.
arXiv Detail & Related papers (2025-04-04T19:19:40Z)
- GARAD-SLAM: 3D GAussian splatting for Real-time Anti Dynamic SLAM [9.060527946525381]
We propose GARAD-SLAM, a real-time 3DGS-based SLAM system tailored for dynamic scenes. In terms of tracking, we directly perform dynamic segmentation on Gaussians and map them back to the front-end to obtain dynamic point labels. Our results on real-world datasets demonstrate that our method is competitive in tracking compared to baseline methods.
arXiv Detail & Related papers (2025-02-05T14:44:17Z)
- UrbanGS: Semantic-Guided Gaussian Splatting for Urban Scene Reconstruction [86.4386398262018]
UrbanGS uses 2D semantic maps and an existing dynamic Gaussian approach to distinguish static objects from the scene. For potentially dynamic objects, we aggregate temporal information using learnable time embeddings. Our approach outperforms state-of-the-art methods in reconstruction quality and efficiency.
arXiv Detail & Related papers (2024-12-04T16:59:49Z)
- DeSiRe-GS: 4D Street Gaussians for Static-Dynamic Decomposition and Surface Reconstruction for Urban Driving Scenes [71.61083731844282]
We present DeSiRe-GS, a self-supervised Gaussian splatting representation. It enables effective static-dynamic decomposition and high-fidelity surface reconstruction in complex driving scenarios.
arXiv Detail & Related papers (2024-11-18T05:49:16Z)
- KFD-NeRF: Rethinking Dynamic NeRF with Kalman Filter [49.85369344101118]
We introduce KFD-NeRF, a novel dynamic neural radiance field integrated with an efficient and high-quality motion reconstruction framework based on Kalman filtering.
Our key idea is to model the dynamic radiance field as a dynamic system whose temporally varying states are estimated based on two sources of knowledge: observations and predictions.
Our KFD-NeRF demonstrates similar or even superior performance within comparable computational time, and achieves state-of-the-art view synthesis quality with thorough training (a minimal Kalman predict/update sketch is given after this list).
arXiv Detail & Related papers (2024-07-18T05:48:24Z)
- CG-SLAM: Efficient Dense RGB-D SLAM in a Consistent Uncertainty-aware 3D Gaussian Field [46.8198987091734]
This paper presents CG-SLAM, an efficient dense RGB-D SLAM system based on a novel uncertainty-aware 3D Gaussian field.
Experiments on various datasets demonstrate that CG-SLAM achieves superior tracking and mapping performance with a notable tracking speed of up to 15 Hz.
arXiv Detail & Related papers (2024-03-24T11:19:59Z)
- DDN-SLAM: Real-time Dense Dynamic Neural Implicit SLAM [5.267859554944985]
We introduce DDN-SLAM, the first real-time dense dynamic neural implicit SLAM system integrating semantic features.
Compared to existing neural implicit SLAM systems, the tracking results on dynamic datasets indicate an average 90% improvement in Absolute Trajectory Error (ATE) accuracy.
arXiv Detail & Related papers (2024-01-03T05:42:17Z)
- NID-SLAM: Neural Implicit Representation-based RGB-D SLAM in dynamic environments [9.706447888754614]
We present NID-SLAM, which significantly improves the performance of neural SLAM in dynamic environments.
We propose a new approach to enhance inaccurate regions in semantic masks, particularly in marginal areas.
We also introduce a selection strategy for dynamic scenes, which enhances camera tracking robustness against large-scale objects.
arXiv Detail & Related papers (2024-01-02T12:35:03Z)
- EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision [85.17951804790515]
EmerNeRF is a simple yet powerful approach for learning spatial-temporal representations of dynamic driving scenes.
It simultaneously captures scene geometry, appearance, motion, and semantics via self-bootstrapping.
Our method achieves state-of-the-art performance in sensor simulation.
arXiv Detail & Related papers (2023-11-03T17:59:55Z)
- Using Detection, Tracking and Prediction in Visual SLAM to Achieve Real-time Semantic Mapping of Dynamic Scenarios [70.70421502784598]
RDS-SLAM can build semantic maps at object level for dynamic scenarios in real time using only one commonly used Intel Core i7 CPU.
We evaluate RDS-SLAM on the TUM RGB-D dataset, and experimental results show that RDS-SLAM can run at 30.3 ms per frame in dynamic scenarios.
arXiv Detail & Related papers (2022-10-10T11:03:32Z)
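KFD-NeRF's key idea above, estimating temporally varying states from both predictions and observations, follows the classic Kalman predict/update cycle. The minimal constant-velocity filter below is a generic illustration of that cycle only; the state layout, noise covariances, and observation model are assumptions and not KFD-NeRF's actual formulation.

```python
import numpy as np

# Minimal constant-velocity Kalman filter: the state x = [position, velocity]
# is predicted forward in time, then corrected by a noisy observation,
# mirroring the "observations + predictions" idea described for KFD-NeRF.
F = np.array([[1.0, 1.0],    # state transition (dt = 1)
              [0.0, 1.0]])
H = np.array([[1.0, 0.0]])   # only position is observed
Q = 1e-3 * np.eye(2)         # assumed process-noise covariance
R = np.array([[1e-2]])       # assumed observation-noise covariance

def predict(x, P):
    """Propagate the state estimate and its covariance one step forward."""
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    """Correct the prediction with an observation z."""
    y = z - H @ x                        # innovation
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    return x + K @ y, (np.eye(2) - K @ H) @ P

# Usage: filter a short, noisy 1-D trajectory.
x, P = np.zeros(2), np.eye(2)
for z in np.array([[0.1], [1.0], [2.1], [2.9]]):
    x, P = predict(x, P)
    x, P = update(x, P, z)
print(x)  # fused estimate of [position, velocity]
```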
This list is automatically generated from the titles and abstracts of the papers on this site.