Related papers: Multi-robot autonomous 3D reconstruction using Gaussian splatting with Semantic guidance

URL: http://arxiv.org/abs/2412.02249v1
Date: Tue, 03 Dec 2024 08:27:17 GMT
Title: Multi-robot autonomous 3D reconstruction using Gaussian splatting with Semantic guidance
Authors: Jing Zeng, Qi Ye, Tianle Liu, Yang Xu, Jin Li, Jinming Xu, Liang Li, Jiming Chen
Abstract summary: Implicit neural representations and 3D Gaussian splatting (3DGS) have shown great potential for scene reconstruction. Recent studies have expanded their applications in autonomous reconstruction through task assignment methods. We propose the first 3DGS-based centralized multi-robot autonomous 3D reconstruction framework.
Score: 18.631273098468384
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Implicit neural representations and 3D Gaussian splatting (3DGS) have shown great potential for scene reconstruction. Recent studies have expanded their applications in autonomous reconstruction through task assignment methods. However, these methods are mainly limited to a single robot, and rapid reconstruction of large-scale scenes remains challenging. Additionally, task-driven planning based on surface uncertainty is prone to being trapped in local optima. To this end, we propose the first 3DGS-based centralized multi-robot autonomous 3D reconstruction framework. To further reduce the time cost of task generation and improve reconstruction quality, we integrate online open-vocabulary semantic segmentation with the surface uncertainty of 3DGS, focusing view sampling on regions with high instance uncertainty. Finally, we develop a multi-robot collaboration strategy with mode and task assignments, improving reconstruction quality while ensuring planning efficiency. Our method demonstrates the highest reconstruction quality among all planning methods and superior planning efficiency compared to existing multi-robot methods. We deploy our method on multiple robots, and results show that it can effectively plan view paths and reconstruct scenes with high quality.

Related papers

SeqAffordSplat: Scene-level Sequential Affordance Reasoning on 3D Gaussian Splatting [85.87902260102652]
We introduce the novel task of Sequential 3D Gaussian Affordance Reasoning. We then propose SeqSplatNet, an end-to-end framework that directly maps an instruction to a sequence of 3D affordance masks. Our method sets a new state-of-the-art on our challenging benchmark, effectively advancing affordance reasoning from single-step interactions to complex, sequential tasks at the scene level.
arXiv Detail & Related papers (2025-07-31T17:56:55Z)
Regist3R: Incremental Registration with Stereo Foundation Model [11.220655907305515]
Multi-view 3D reconstruction has remained an essential yet challenging problem in the field of computer vision. We propose Regist3R, a novel stereo foundation model tailored for efficient and scalable incremental reconstruction. We evaluate Regist3R on public datasets for camera pose estimation and 3D reconstruction.
arXiv Detail & Related papers (2025-04-16T02:46:53Z)
FreeSplat++: Generalizable 3D Gaussian Splatting for Efficient Indoor Scene Reconstruction [50.534213038479926]
FreeSplat++ is an alternative approach to large-scale indoor whole-scene reconstruction. Our method with depth-regularized per-scene fine-tuning demonstrates substantial improvements in reconstruction accuracy and a notable reduction in training time.
arXiv Detail & Related papers (2025-03-29T06:22:08Z)
LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds [21.99354901986186]
We propose LHM (Large Animatable Human Reconstruction Model) to infer high-fidelity avatars represented as 3D Gaussian splatting in a feed-forward pass. Our model leverages a multimodal transformer architecture to effectively encode the human body positional features and image features with an attention mechanism. Our LHM generates plausible animatable humans in seconds without post-processing for face and hands, outperforming existing methods in both reconstruction accuracy and generalization ability.
arXiv Detail & Related papers (2025-03-13T17:59:21Z)
T-3DGS: Removing Transient Objects for 3D Scene Reconstruction [83.05271859398779]
Transient objects in video sequences can significantly degrade the quality of 3D scene reconstructions. We propose T-3DGS, a novel framework that robustly filters out transient distractors during 3D reconstruction using Gaussian Splatting.
arXiv Detail & Related papers (2024-11-29T07:45:24Z)
UW-SDF: Exploiting Hybrid Geometric Priors for Neural SDF Reconstruction from Underwater Multi-view Monocular Images [63.32490897641344]
We propose a framework for reconstructing target objects from multi-view underwater images based on neural SDF. We introduce hybrid geometric priors to optimize the reconstruction process, markedly enhancing the quality and efficiency of neural SDF reconstruction.
arXiv Detail & Related papers (2024-10-10T16:33:56Z)
Frequency-based View Selection in Gaussian Splatting Reconstruction [9.603843571051744]
We investigate the problem of active view selection to perform 3D Gaussian Splatting reconstructions with as few input images as possible. By ranking the potential views in the frequency domain, we are able to effectively estimate the potential information gain of new viewpoints. Our method achieves state-of-the-art results in view selection, demonstrating its potential for efficient image-based 3D reconstruction.
arXiv Detail & Related papers (2024-09-24T21:44:26Z)
Autonomous Implicit Indoor Scene Reconstruction with Frontier Exploration [10.975244524831696]
Implicit neural representations have demonstrated significant promise for 3D scene reconstruction. Recent works have extended their applications to autonomous implicit reconstruction through the Next Best View (NBV) based method. We propose to incorporate frontier-based exploration tasks for global coverage with implicit surface uncertainty-based reconstruction tasks.
arXiv Detail & Related papers (2024-04-16T01:59:03Z)
3D Reconstruction in Noisy Agricultural Environments: A Bayesian Optimization Perspective for View Planning [30.93026905477516]
View planning (VP) aims to optimally place a certain number of cameras in positions that maximize the visual information. Existing environmental noise can significantly affect the performance of 3D reconstruction. This work puts forth an adaptive Bayesian optimization algorithm for accurate 3D reconstruction in the presence of noise.
arXiv Detail & Related papers (2023-09-29T21:09:02Z)
Improving Neural Indoor Surface Reconstruction with Mask-Guided Adaptive Consistency Constraints [0.6749750044497732]
We propose a two-stage training process, decouple view-dependent and view-independent colors, and leverage two novel consistency constraints to enhance detail reconstruction performance without requiring extra priors. Experiments on synthetic and real-world datasets show the capability of reducing the interference from prior estimation errors.
arXiv Detail & Related papers (2023-09-18T13:05:23Z)
R3D3: Dense 3D Reconstruction of Dynamic Scenes from Multiple Cameras [106.52409577316389]
R3D3 is a multi-camera system for dense 3D reconstruction and ego-motion estimation. Our approach exploits spatial-temporal information from multiple cameras, and monocular depth refinement. We show that this design enables a dense, consistent 3D reconstruction of challenging, dynamic outdoor environments.
arXiv Detail & Related papers (2023-08-28T17:13:49Z)
Learning Reconstructability for Drone Aerial Path Planning [51.736344549907265]
We introduce the first learning-based reconstructability predictor to improve view and path planning for large-scale 3D urban scene acquisition using unmanned drones. In contrast to previous approaches, our method learns a model that explicitly predicts how well a 3D urban scene will be reconstructed from a set of viewpoints.
arXiv Detail & Related papers (2022-09-21T08:10:26Z)
NeurAR: Neural Uncertainty for Autonomous 3D Reconstruction [64.36535692191343]
Implicit neural representations have shown compelling results in offline 3D reconstruction and also recently demonstrated the potential for online SLAM systems. This paper addresses two key challenges: 1) seeking a criterion to measure the quality of the candidate viewpoints for the view planning based on the new representations, and 2) learning the criterion from data that can generalize to different scenes instead of hand-crafting one. Our method demonstrates significant improvements on various metrics for the rendered image quality and the geometry quality of the reconstructed 3D models when compared with variants using TSDF or reconstruction without view planning.
arXiv Detail & Related papers (2022-07-22T10:05:36Z)
Neural 3D Reconstruction in the Wild [86.6264706256377]
We introduce a new method that enables efficient and accurate surface reconstruction from Internet photo collections. We present a new benchmark and protocol for evaluating reconstruction performance on such in-the-wild scenes.
arXiv Detail & Related papers (2022-05-25T17:59:53Z)
Multi-initialization Optimization Network for Accurate 3D Human Pose and Shape Estimation [75.44912541912252]
We propose a three-stage framework named Multi-Initialization Optimization Network (MION). In the first stage, we strategically select different coarse 3D reconstruction candidates which are compatible with the 2D keypoints of the input sample. In the second stage, we design a mesh refinement transformer (MRT) to refine each coarse reconstruction result via a self-attention mechanism. Finally, a Consistency Estimation Network (CEN) is proposed to find the best result from multiple candidates by evaluating whether the visual evidence in the RGB image matches a given 3D reconstruction.
arXiv Detail & Related papers (2021-12-24T02:43:58Z)

This list is automatically generated from the titles and abstracts of the papers on this site.