VoxNeRF: Bridging Voxel Representation and Neural Radiance Fields for Enhanced Indoor View Synthesis
- URL: http://arxiv.org/abs/2311.05289v2
- Date: Wed, 04 Dec 2024 18:32:57 GMT
- Title: VoxNeRF: Bridging Voxel Representation and Neural Radiance Fields for Enhanced Indoor View Synthesis
- Authors: Sen Wang, Qing Cheng, Stefano Gasperini, Wei Zhang, Shun-Cheng Wu, Niclas Zeller, Daniel Cremers, Nassir Navab
- Abstract summary: VoxNeRF is a novel approach to enhance the quality and efficiency of neural indoor reconstruction and novel view synthesis. We propose an efficient voxel-guided sampling technique that selectively allocates computational resources to the most relevant segments of rays. Our approach is validated with extensive experiments on ScanNet and ScanNet++.
- Score: 73.50359502037232
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The generation of high-fidelity view synthesis is essential for robotic navigation and interaction but remains challenging, particularly in indoor environments and real-time scenarios. Existing techniques often require significant computational resources for both training and rendering, and they frequently result in suboptimal 3D representations due to insufficient geometric structuring. To address these limitations, we introduce VoxNeRF, a novel approach that utilizes easy-to-obtain geometry priors to enhance both the quality and efficiency of neural indoor reconstruction and novel view synthesis. We propose an efficient voxel-guided sampling technique that allocates computational resources selectively to the most relevant segments of rays based on a voxel-encoded geometry prior, significantly reducing training and rendering time. Additionally, we incorporate a robust depth loss to improve reconstruction and rendering quality in sparse view settings. Our approach is validated with extensive experiments on ScanNet and ScanNet++ where VoxNeRF outperforms existing state-of-the-art methods and establishes a new benchmark for indoor immersive interpolation and extrapolation settings.
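The voxel-guided sampling idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it only assumes that the geometry prior is available as a set of occupied voxel indices on an axis-aligned grid, and the function names (`occupied_intervals`, `sample_in_intervals`), the half-voxel marching step, and the stratified midpoint placement are all hypothetical choices made for illustration. The sketch finds the ray segments that cross occupied voxels and spends the whole sample budget inside them.

```python
import math


def occupied_intervals(origin, direction, occupied, voxel_size, t_near, t_far):
    """Return merged [t_start, t_end] segments where the ray crosses occupied voxels.

    `occupied` is a set of integer (i, j, k) voxel indices (assumed prior).
    The ray is marched in steps of half a voxel (an illustrative choice).
    """
    step = 0.5 * voxel_size
    n_steps = int(math.ceil((t_far - t_near) / step))
    intervals = []
    for i in range(n_steps):
        t0 = t_near + i * step
        t1 = min(t0 + step, t_far)
        t_mid = 0.5 * (t0 + t1)
        point = [o + t_mid * d for o, d in zip(origin, direction)]
        voxel = tuple(math.floor(p / voxel_size) for p in point)
        if voxel in occupied:
            # Extend the previous segment if it is contiguous, else start a new one.
            if intervals and abs(intervals[-1][1] - t0) < 1e-9:
                intervals[-1][1] = t1
            else:
                intervals.append([t0, t1])
    return intervals


def sample_in_intervals(intervals, n_samples):
    """Place n_samples evenly (stratified midpoints) over the occupied segments only."""
    total = sum(t1 - t0 for t0, t1 in intervals)
    if total == 0:
        return []  # the ray misses the prior entirely; nothing to sample
    samples = []
    for k in range(n_samples):
        # Position along the concatenated length of all occupied segments.
        target = (k + 0.5) / n_samples * total
        for t0, t1 in intervals:
            length = t1 - t0
            if target <= length:
                samples.append(t0 + target)
                break
            target -= length
        else:
            # Guard against floating-point overshoot on the last segment.
            samples.append(intervals[-1][1])
    return samples
```

For a ray along the x-axis through a unit grid with voxels `(3, 0, 0)` and `(4, 0, 0)` occupied, `occupied_intervals((0.0, 0.5, 0.5), (1.0, 0.0, 0.0), {(3, 0, 0), (4, 0, 0)}, 1.0, 0.0, 8.0)` yields a single segment from 3.0 to 5.0, and all samples fall inside it rather than being spread over the full near-to-far range.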
Related papers
- A self-supervised cyclic neural-analytic approach for novel view synthesis and 3D reconstruction [11.558827428811385]
We propose a self-supervised cyclic neural-analytic pipeline that combines high-quality neural rendering outputs with precise geometric insights from analytical methods.
Our solution improves RGB and mesh reconstructions for novel view synthesis, especially in undersampled areas and regions that are completely different from the training dataset.
Our findings demonstrate substantial improvements in both novel view rendering and 3D reconstruction, which to the best of our knowledge is a first.
arXiv Detail & Related papers (2025-03-05T14:28:01Z)
- UniVoxel: Fast Inverse Rendering by Unified Voxelization of Scene Representation [66.95976870627064]
We design a Unified Voxelization framework for explicit learning of scene representations, dubbed UniVoxel.
We propose to encode a scene into a latent volumetric representation, based on which the geometry, materials and illumination can be readily learned via lightweight neural networks.
Experiments show that UniVoxel boosts the optimization efficiency significantly compared to other methods, reducing the per-scene training time from hours to 18 minutes, while achieving favorable reconstruction quality.
arXiv Detail & Related papers (2024-07-28T17:24:14Z)
- PNeRFLoc: Visual Localization with Point-based Neural Radiance Fields [54.8553158441296]
We propose a novel visual localization framework, i.e., PNeRFLoc, based on a unified point-based representation.
On the one hand, PNeRFLoc supports the initial pose estimation by matching 2D and 3D feature points.
On the other hand, it also enables pose refinement with novel view synthesis using rendering-based optimization.
arXiv Detail & Related papers (2023-12-17T08:30:00Z)
- VQ-NeRF: Vector Quantization Enhances Implicit Neural Representations [25.88881764546414]
VQ-NeRF is an efficient pipeline for enhancing implicit neural representations via vector quantization.
We present an innovative multi-scale NeRF sampling scheme that concurrently optimizes the NeRF model at both compressed and original scales.
We incorporate a semantic loss function to improve the geometric fidelity and semantic coherence of our 3D reconstructions.
arXiv Detail & Related papers (2023-10-23T01:41:38Z)
- Adaptive Multi-NeRF: Exploit Efficient Parallelism in Adaptive Multiple Scale Neural Radiance Field Rendering [3.8200916793910973]
Recent advances in Neural Radiance Fields (NeRF) have demonstrated significant potential for representing 3D scene appearances as implicit neural networks.
However, the lengthy training and rendering process hinders the widespread adoption of this promising technique for real-time rendering applications.
We present an effective adaptive multi-NeRF method designed to accelerate the neural rendering process for large scenes.
arXiv Detail & Related papers (2023-10-03T08:34:49Z)
- Learning Neural Duplex Radiance Fields for Real-Time View Synthesis [33.54507228895688]
We propose a novel approach to distill and bake NeRFs into highly efficient mesh-based neural representations.
We demonstrate the effectiveness and superiority of our approach via extensive experiments on a range of standard datasets.
arXiv Detail & Related papers (2023-04-20T17:59:52Z)
- SurfelNeRF: Neural Surfel Radiance Fields for Online Photorealistic Reconstruction of Indoor Scenes [17.711755550841385]
SLAM-based methods can reconstruct 3D scene geometry progressively in real time but cannot render photorealistic results.
Although NeRF-based methods produce promising novel view synthesis results, their long offline optimization time and lack of geometric constraints pose challenges to efficiently handling online input.
We introduce SurfelNeRF, a variant of neural radiance field which employs a flexible and scalable neural surfel representation to store geometric attributes and extracted appearance features from input images.
arXiv Detail & Related papers (2023-04-18T13:11:49Z)
- Grid-guided Neural Radiance Fields for Large Urban Scenes [146.06368329445857]
Recent approaches propose to geographically divide the scene and adopt multiple sub-NeRFs to model each region individually.
An alternative solution is to use a feature grid representation, which is computationally efficient and can naturally scale to a large scene.
We present a new framework that realizes high-fidelity rendering on large urban scenes while being computationally efficient.
arXiv Detail & Related papers (2023-03-24T13:56:45Z)
- Fast Dynamic Radiance Fields with Time-Aware Neural Voxels [106.69049089979433]
We propose a radiance field framework, named TiNeuVox, that represents scenes with time-aware voxel features.
Our framework accelerates the optimization of dynamic radiance fields while maintaining high rendering quality.
Our TiNeuVox completes training in only 8 minutes with an 8-MB storage cost while showing similar or even better rendering performance than previous dynamic NeRF methods.
arXiv Detail & Related papers (2022-05-30T17:47:31Z)
- Geometry-Guided Progressive NeRF for Generalizable and Efficient Neural Human Rendering [139.159534903657]
We develop a generalizable and efficient Neural Radiance Field (NeRF) pipeline for high-fidelity free-viewpoint human body details.
To better tackle self-occlusion, we devise a geometry-guided multi-view feature integration approach.
For achieving higher rendering efficiency, we introduce a geometry-guided progressive rendering pipeline.
arXiv Detail & Related papers (2021-12-08T14:42:10Z)
- Direct Voxel Grid Optimization: Super-fast Convergence for Radiance Fields Reconstruction [42.3230709881297]
We present a super-fast convergence approach to reconstructing the per-scene radiance field from a set of images.
Our approach achieves NeRF-comparable quality and converges rapidly from scratch in less than 15 minutes with a single GPU.
arXiv Detail & Related papers (2021-11-22T14:02:07Z)
- NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-view Stereo [97.07453889070574]
We present a new multi-view depth estimation method that utilizes both conventional SfM reconstruction and learning-based priors.
We show that our proposed framework significantly outperforms state-of-the-art methods on indoor scenes.
arXiv Detail & Related papers (2021-09-02T17:54:31Z)
- MVSNeRF: Fast Generalizable Radiance Field Reconstruction from Multi-View Stereo [52.329580781898116]
We present MVSNeRF, a novel neural rendering approach that can efficiently reconstruct neural radiance fields for view synthesis.
Unlike prior works on neural radiance fields that consider per-scene optimization on densely captured images, we propose a generic deep neural network that can reconstruct radiance fields from only three nearby input views via fast network inference.
arXiv Detail & Related papers (2021-03-29T13:15:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.