MCN-SLAM: Multi-Agent Collaborative Neural SLAM with Hybrid Implicit Neural Scene Representation
- URL: http://arxiv.org/abs/2506.18678v1
- Date: Mon, 23 Jun 2025 14:22:29 GMT
- Title: MCN-SLAM: Multi-Agent Collaborative Neural SLAM with Hybrid Implicit Neural Scene Representation
- Authors: Tianchen Deng, Guole Shen, Xun Chen, Shenghai Yuan, Hongming Shen, Guohao Peng, Zhenyu Wu, Jingchuan Wang, Lihua Xie, Danwei Wang, Hesheng Wang, Weidong Chen
- Abstract summary: Existing NeRF-based multi-agent SLAM frameworks cannot meet the constraints of communication bandwidth. We propose the first distributed multi-agent collaborative neural SLAM framework with hybrid scene representation. A novel triplane-grid joint scene representation method is proposed to improve scene reconstruction. A novel intra-to-inter loop closure method is designed to achieve local (single-agent) and global (multi-agent) consistency.
- Score: 51.07118703442774
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Neural implicit scene representations have recently shown promising results in dense visual SLAM. However, existing implicit SLAM algorithms are constrained to single-agent scenarios and struggle in large-scale scenes and long sequences. Existing NeRF-based multi-agent SLAM frameworks cannot meet the constraints of communication bandwidth. To this end, we propose the first distributed multi-agent collaborative neural SLAM framework with hybrid scene representation, distributed camera tracking, intra-to-inter loop closure, and online distillation for multiple submap fusion. A novel triplane-grid joint scene representation method is proposed to improve scene reconstruction. A novel intra-to-inter loop closure method is designed to achieve local (single-agent) and global (multi-agent) consistency. We also design a novel online distillation method to fuse the information of different submaps to achieve global consistency. Furthermore, to the best of our knowledge, there is no real-world dataset for NeRF-based/GS-based SLAM that provides both continuous-time ground-truth trajectories and high-accuracy ground-truth 3D meshes. To address this, we propose the first real-world Dense SLAM (DES) dataset covering both single-agent and multi-agent scenarios, ranging from small rooms to large-scale outdoor scenes, with high-accuracy ground truth for both 3D mesh and continuous-time camera trajectory. This dataset can advance research in SLAM, 3D reconstruction, and visual foundation models. Experiments on various datasets demonstrate the superiority of the proposed method in mapping, tracking, and communication. The dataset and code will be open-sourced at https://github.com/dtc111111/mcnslam.
Related papers
- MAGiC-SLAM: Multi-Agent Gaussian Globally Consistent SLAM [23.318966306555915]
Simultaneous localization and mapping (SLAM) systems are widely used in computer vision, with applications in augmented reality, robotics, and autonomous driving.
Recent work has addressed this problem using a distributed neural scene representation.
We propose a rigidly deformable 3D Gaussian-based scene representation that dramatically speeds up the system.
We evaluate MAGiC-SLAM on synthetic and real-world datasets and find it more accurate and faster than the state of the art.
arXiv Detail & Related papers (2024-11-25T08:34:01Z)
- NIS-SLAM: Neural Implicit Semantic RGB-D SLAM for 3D Consistent Scene Understanding [31.56016043635702]
We introduce NIS-SLAM, an efficient neural implicit semantic RGB-D SLAM system.
For high-fidelity surface reconstruction and spatial consistent scene understanding, we combine high-frequency multi-resolution tetrahedron-based features.
We also show that our approach can be used in augmented reality applications.
arXiv Detail & Related papers (2024-07-30T14:27:59Z)
- Multiway Point Cloud Mosaicking with Diffusion and Global Optimization [74.3802812773891]
We introduce a novel framework for multiway point cloud mosaicking (named Wednesday).
At the core of our approach is ODIN, a learned pairwise registration algorithm that identifies overlaps and refines attention scores.
Tested on four diverse, large-scale datasets, our method achieves state-of-the-art pairwise and rotation registration results by a large margin on all benchmarks.
arXiv Detail & Related papers (2024-03-30T17:29:13Z)
- MUTE-SLAM: Real-Time Neural SLAM with Multiple Tri-Plane Hash Representations [6.266208986510979]
MUTE-SLAM is a real-time neural RGB-D SLAM system employing multiple tri-plane hash-encodings for efficient scene representation.
MUTE-SLAM effectively tracks camera positions and incrementally builds a scalable multi-map representation for both small and large indoor environments.
arXiv Detail & Related papers (2024-03-26T14:53:24Z)
- Q-SLAM: Quadric Representations for Monocular SLAM [85.82697759049388]
We reimagine volumetric representations through the lens of quadrics.
We use the quadric assumption to rectify noisy depth estimations from RGB inputs.
We introduce a novel quadric-decomposed transformer to aggregate information across quadrics.
arXiv Detail & Related papers (2024-03-12T23:27:30Z)
- DNS SLAM: Dense Neural Semantic-Informed SLAM [92.39687553022605]
DNS SLAM is a novel neural RGB-D semantic SLAM approach featuring a hybrid representation.
Our method integrates multi-view geometry constraints with image-based feature extraction to improve appearance details.
Our experimental results achieve state-of-the-art performance on both synthetic data and real-world data tracking.
arXiv Detail & Related papers (2023-11-30T21:34:44Z)
- ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic Reconstruction [62.599588577671796]
We propose an online 3D semantic segmentation method that incrementally reconstructs a 3D semantic map from a stream of RGB-D frames.
Unlike offline methods, ours is directly applicable to scenarios with real-time constraints, such as robotics or mixed reality.
arXiv Detail & Related papers (2023-11-29T20:30:18Z)
- CP-SLAM: Collaborative Neural Point-based SLAM System [54.916578456416204]
This paper presents a collaborative implicit neural simultaneous localization and mapping (SLAM) system with RGB-D image sequences.
In order to enable all these modules in a unified framework, we propose a novel neural point based 3D scene representation.
A distributed-to-centralized learning strategy is proposed for the collaborative implicit SLAM to improve consistency and cooperation.
arXiv Detail & Related papers (2023-11-14T09:17:15Z)
- ESLAM: Efficient Dense SLAM System Based on Hybrid Representation of Signed Distance Fields [2.0625936401496237]
ESLAM reads RGB-D frames with unknown camera poses in a sequential manner and incrementally reconstructs the scene representation.
ESLAM improves the accuracy of 3D reconstruction and camera localization of state-of-the-art dense visual SLAM methods by more than 50%.
arXiv Detail & Related papers (2022-11-21T18:25:14Z)
- NICE-SLAM: Neural Implicit Scalable Encoding for SLAM [112.6093688226293]
NICE-SLAM is a dense SLAM system that incorporates multi-level local information by introducing a hierarchical scene representation.
Compared to recent neural implicit SLAM systems, our approach is more scalable, efficient, and robust.
arXiv Detail & Related papers (2021-12-22T18:45:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.