SiLVR: Scalable Lidar-Visual Radiance Field Reconstruction with Uncertainty Quantification
- URL: http://arxiv.org/abs/2502.02657v3
- Date: Wed, 08 Oct 2025 20:23:59 GMT
- Title: SiLVR: Scalable Lidar-Visual Radiance Field Reconstruction with Uncertainty Quantification
- Authors: Yifu Tao, Maurice Fallon,
- Abstract summary: We present a neural radiance field (NeRF) based large-scale reconstruction system that fuses lidar and vision data.<n>Our system adopts the state-of-the-art NeRF representation to incorporate lidar.<n>Adding lidar data adds strong geometric constraints on the depth and surface normals.
- Score: 0.6445605125467574
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We present a neural radiance field (NeRF) based large-scale reconstruction system that fuses lidar and vision data to generate high-quality reconstructions that are geometrically accurate and capture photorealistic texture. Our system adopts the state-of-the-art NeRF representation to incorporate lidar. Adding lidar data adds strong geometric constraints on the depth and surface normals, which is particularly useful when modelling uniform texture surfaces which contain ambiguous visual reconstruction cues. A key contribution of this work is a novel method to quantify the epistemic uncertainty of the lidar-visual NeRF reconstruction by estimating the spatial variance of each point location in the radiance field given the sensor observations from the cameras and lidar. This provides a principled approach to evaluate the contribution of each sensor modality to the final reconstruction. In this way, reconstructions that are uncertain (due to e.g. uniform visual texture, limited observation viewpoints, or little lidar coverage) can be identified and removed. Our system is integrated with a real-time lidar SLAM system which is used to bootstrap a Structure-from-Motion (SfM) reconstruction procedure. It also helps to properly constrain the overall metric scale which is essential for the lidar depth loss. The refined SLAM trajectory can then be divided into submaps using Spectral Clustering to group sets of co-visible images together. This submapping approach is more suitable for visual reconstruction than distance-based partitioning. Our uncertainty estimation is particularly effective when merging submaps as their boundaries often contain artefacts due to limited observations. We demonstrate the reconstruction system using a multi-camera, lidar sensor suite in experiments involving both robot-mounted and handheld scanning. Our test datasets cover a total area of more than 20,000 square metres.
Related papers
- GCRayDiffusion: Pose-Free Surface Reconstruction via Geometric Consistent Ray Diffusion [30.773599974914415]
Previous approaches have achieved impressive pose-free surface reconstruction results in dense-view settings.
We propose a new technique for pose-free surface reconstruction, which regularizes the learning by explicit points sampled from ray-based diffusion of camera pose estimation.
Our GCRayDiffusion achieves more accurate camera pose estimation than previous approaches, with geometrically more consistent surface reconstruction results.
arXiv Detail & Related papers (2025-03-28T11:45:09Z) - Decompositional Neural Scene Reconstruction with Generative Diffusion Prior [64.71091831762214]
Decompositional reconstruction of 3D scenes, with complete shapes and detailed texture, is intriguing for downstream applications.
Recent approaches incorporate semantic or geometric regularization to address this issue, but they suffer significant degradation in underconstrained areas.
We propose DP-Recon, which employs diffusion priors in the form of Score Distillation Sampling (SDS) to optimize the neural representation of each individual object under novel views.
arXiv Detail & Related papers (2025-03-19T02:11:31Z) - PINGS: Gaussian Splatting Meets Distance Fields within a Point-Based Implicit Neural Map [30.06864329412246]
We propose a novel map representation that unifies a continuous signed distance field and a Gaussian splatting radiance field within an elastic and compact point-based implicit neural map.
We devise a LiDAR-visual SLAM system called PINGS using the proposed map representation and evaluate it on several challenging large-scale datasets.
arXiv Detail & Related papers (2025-02-09T03:06:19Z) - CoCPF: Coordinate-based Continuous Projection Field for Ill-Posed Inverse Problem in Imaging [78.734927709231]
Sparse-view computed tomography (SVCT) reconstruction aims to acquire CT images based on sparsely-sampled measurements.
Due to ill-posedness, implicit neural representation (INR) techniques may leave considerable holes'' (i.e., unmodeled spaces) in their fields, leading to sub-optimal results.
We propose the Coordinate-based Continuous Projection Field (CoCPF), which aims to build hole-free representation fields for SVCT reconstruction.
arXiv Detail & Related papers (2024-06-21T08:38:30Z) - SiLVR: Scalable Lidar-Visual Reconstruction with Neural Radiance Fields
for Robotic Inspection [4.6102302191645075]
We present a neural-field-based large-scale reconstruction system that fuses lidar and vision data to generate high-quality reconstructions.
We exploit the trajectory from a real-time lidar SLAM system to bootstrap a Structure-from-Motion (SfM) procedure.
We use submapping to scale the system to large-scale environments captured over long trajectories.
arXiv Detail & Related papers (2024-03-11T16:31:25Z) - NeRF-VO: Real-Time Sparse Visual Odometry with Neural Radiance Fields [13.178099653374945]
NeRF-VO integrates learning-based sparse visual odometry for low-latency camera tracking and a neural radiance scene representation.
We surpass SOTA methods in pose estimation accuracy, novel view fidelity, and dense reconstruction quality across a variety of synthetic and real-world datasets.
arXiv Detail & Related papers (2023-12-20T22:42:17Z) - DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Difusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z) - Fast Monocular Scene Reconstruction with Global-Sparse Local-Dense Grids [84.90863397388776]
We propose to directly use signed distance function (SDF) in sparse voxel block grids for fast and accurate scene reconstruction without distances.
Our globally sparse and locally dense data structure exploits surfaces' spatial sparsity, enables cache-friendly queries, and allows direct extensions to multi-modal data.
Experiments show that our approach is 10x faster in training and 100x faster in rendering while achieving comparable accuracy to state-of-the-art neural implicit methods.
arXiv Detail & Related papers (2023-05-22T16:50:19Z) - BS3D: Building-scale 3D Reconstruction from RGB-D Images [25.604775584883413]
We propose an easy-to-use framework for acquiring building-scale 3D reconstruction using a consumer depth camera.
Unlike complex and expensive acquisition setups, our system enables crowd-sourcing, which can greatly benefit data-hungry algorithms.
arXiv Detail & Related papers (2023-01-03T11:46:14Z) - Near-filed SAR Image Restoration with Deep Learning Inverse Technique: A
Preliminary Study [5.489791364472879]
Near-field synthetic aperture radar (SAR) provides a high-resolution image of a target's scattering distribution-hot spots.
Meanwhile, imaging result suffers inevitable degradation from sidelobes, clutters, and noises.
To restore the image, current methods make simplified assumptions; for example, the point spread function (PSF) is spatially consistent, the target consists of sparse point scatters, etc.
We reformulate the degradation model into a spatially variable complex-convolution model, where the near-field SAR's system response is considered.
A model-based deep learning network is designed to restore the
arXiv Detail & Related papers (2022-11-28T01:28:33Z) - Few-shot Non-line-of-sight Imaging with Signal-surface Collaborative
Regularization [18.466941045530408]
Non-line-of-sight imaging technique aims to reconstruct targets from multiply reflected light.
We propose a signal-surface collaborative regularization framework that provides noise-robust reconstructions with a minimal number of measurements.
Our approach has great potential in real-time non-line-of-sight imaging applications such as rescue operations and autonomous driving.
arXiv Detail & Related papers (2022-11-21T11:19:20Z) - Semi-signed neural fitting for surface reconstruction from unoriented
point clouds [53.379712818791894]
We propose SSN-Fitting to reconstruct a better signed distance field.
SSN-Fitting consists of a semi-signed supervision and a loss-based region sampling strategy.
We conduct experiments to demonstrate that SSN-Fitting achieves state-of-the-art performance under different settings.
arXiv Detail & Related papers (2022-06-14T09:40:17Z) - TerrainMesh: Metric-Semantic Terrain Reconstruction from Aerial Images
Using Joint 2D-3D Learning [20.81202315793742]
This paper develops a joint 2D-3D learning approach to reconstruct a local metric-semantic mesh at each camera maintained by a visual odometry algorithm.
The mesh can be assembled into a global environment model to capture the terrain topology and semantics during online operation.
arXiv Detail & Related papers (2022-04-23T05:18:39Z) - Light Field Reconstruction Using Convolutional Network on EPI and
Extended Applications [78.63280020581662]
A novel convolutional neural network (CNN)-based framework is developed for light field reconstruction from a sparse set of views.
We demonstrate the high performance and robustness of the proposed framework compared with state-of-the-art algorithms.
arXiv Detail & Related papers (2021-03-24T08:16:32Z) - 3D Surface Reconstruction From Multi-Date Satellite Images [11.84274417463238]
We propose an extension of Structure from Motion (SfM) based pipeline that allows us to reconstruct point clouds from multiple satellite images.
We provide a detailed description of several steps that are mandatory to exploit state-of-the-art mesh reconstruction algorithms in the context of satellite imagery.
We show that the proposed pipeline combined with current meshing algorithms outperforms state-of-the-art point cloud reconstruction algorithms in terms of completeness and median error.
arXiv Detail & Related papers (2021-02-04T09:23:21Z) - Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images [59.906948203578544]
We introduce a novel learning-based method to reconstruct the high-quality geometry and complex, spatially-varying BRDF of an arbitrary object.
We first estimate per-view depth maps using a deep multi-view stereo network.
These depth maps are used to coarsely align the different views.
We propose a novel multi-view reflectance estimation network architecture.
arXiv Detail & Related papers (2020-03-27T21:28:54Z) - OmniSLAM: Omnidirectional Localization and Dense Mapping for
Wide-baseline Multi-camera Systems [88.41004332322788]
We present an omnidirectional localization and dense mapping system for a wide-baseline multiview stereo setup with ultra-wide field-of-view (FOV) fisheye cameras.
For more practical and accurate reconstruction, we first introduce improved and light-weighted deep neural networks for the omnidirectional depth estimation.
We integrate our omnidirectional depth estimates into the visual odometry (VO) and add a loop closing module for global consistency.
arXiv Detail & Related papers (2020-03-18T05:52:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.