GoLF-NRT: Integrating Global Context and Local Geometry for Few-Shot View Synthesis
- URL: http://arxiv.org/abs/2505.19813v1
- Date: Mon, 26 May 2025 10:50:25 GMT
- Title: GoLF-NRT: Integrating Global Context and Local Geometry for Few-Shot View Synthesis
- Authors: You Wang, Li Fang, Hao Zhu, Fei Hu, Long Ye, Zhan Ma,
- Abstract summary: We propose GoLF-NRT: a Fusion-based Neural Rendering Transformer.<n>GoLF-NRT enhances generalizable neural rendering from few input views.<n>Experiments show that GoLF-NRT achieves state-of-the-art performance across varying numbers of input views.
- Score: 24.10068225852128
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural Radiance Fields (NeRF) have transformed novel view synthesis by modeling scene-specific volumetric representations directly from images. While generalizable NeRF models can generate novel views across unknown scenes by learning latent ray representations, their performance heavily depends on a large number of multi-view observations. However, with limited input views, these methods experience significant degradation in rendering quality. To address this limitation, we propose GoLF-NRT: a Global and Local feature Fusion-based Neural Rendering Transformer. GoLF-NRT enhances generalizable neural rendering from few input views by leveraging a 3D transformer with efficient sparse attention to capture global scene context. In parallel, it integrates local geometric features extracted along the epipolar line, enabling high-quality scene reconstruction from as few as 1 to 3 input views. Furthermore, we introduce an adaptive sampling strategy based on attention weights and kernel regression, improving the accuracy of transformer-based neural rendering. Extensive experiments on public datasets show that GoLF-NRT achieves state-of-the-art performance across varying numbers of input views, highlighting the effectiveness and superiority of our approach. Code is available at https://github.com/KLMAV-CUC/GoLF-NRT.
Related papers
- Learning Robust Generalizable Radiance Field with Visibility and Feature
Augmented Point Representation [7.203073346844801]
This paper introduces a novel paradigm for the generalizable neural radiance field (NeRF)
We propose the first paradigm that constructs the generalizable neural field based on point-based rather than image-based rendering.
Our approach explicitly models visibilities by geometric priors and augments them with neural features.
arXiv Detail & Related papers (2024-01-25T17:58:51Z) - Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object
Structure via HyperNetworks [53.67497327319569]
We introduce a novel neural rendering technique to solve image-to-3D from a single view.
Our approach employs the signed distance function as the surface representation and incorporates generalizable priors through geometry-encoding volumes and HyperNetworks.
Our experiments show the advantages of our proposed approach with consistent results and rapid generation.
arXiv Detail & Related papers (2023-12-24T08:42:37Z) - PNeRFLoc: Visual Localization with Point-based Neural Radiance Fields [54.8553158441296]
We propose a novel visual localization framework, ie, PNeRFLoc, based on a unified point-based representation.
On the one hand, PNeRFLoc supports the initial pose estimation by matching 2D and 3D feature points.
On the other hand, it also enables pose refinement with novel view synthesis using rendering-based optimization.
arXiv Detail & Related papers (2023-12-17T08:30:00Z) - Learning Neural Duplex Radiance Fields for Real-Time View Synthesis [33.54507228895688]
We propose a novel approach to distill and bake NeRFs into highly efficient mesh-based neural representations.
We demonstrate the effectiveness and superiority of our approach via extensive experiments on a range of standard datasets.
arXiv Detail & Related papers (2023-04-20T17:59:52Z) - VolRecon: Volume Rendering of Signed Ray Distance Functions for
Generalizable Multi-View Reconstruction [64.09702079593372]
VolRecon is a novel generalizable implicit reconstruction method with Signed Ray Distance Function (SRDF)
On DTU dataset, VolRecon outperforms SparseNeuS by about 30% in sparse view reconstruction and achieves comparable accuracy as MVSNet in full view reconstruction.
arXiv Detail & Related papers (2022-12-15T18:59:54Z) - Is Attention All NeRF Needs? [103.51023982774599]
Generalizable NeRF Transformer (GNT) is a pure, unified transformer-based architecture that efficiently reconstructs Neural Radiance Fields (NeRFs) on the fly from source views.
GNT achieves generalizable neural scene representation and rendering, by encapsulating two transformer-based stages.
arXiv Detail & Related papers (2022-07-27T05:09:54Z) - Vision Transformer for NeRF-Based View Synthesis from a Single Input
Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering.
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z) - Generalizable Neural Radiance Fields for Novel View Synthesis with
Transformer [23.228142134527292]
We propose a Transformer-based NeRF (TransNeRF) to learn a generic neural radiance field conditioned on observed-view images.
Experiments demonstrate that our TransNeRF, trained on a wide variety of scenes, can achieve better performance in comparison to state-of-the-art image-based neural rendering methods.
arXiv Detail & Related papers (2022-06-10T23:16:43Z) - MVSNeRF: Fast Generalizable Radiance Field Reconstruction from
Multi-View Stereo [52.329580781898116]
We present MVSNeRF, a novel neural rendering approach that can efficiently reconstruct neural radiance fields for view synthesis.
Unlike prior works on neural radiance fields that consider per-scene optimization on densely captured images, we propose a generic deep neural network that can reconstruct radiance fields from only three nearby input views via fast network inference.
arXiv Detail & Related papers (2021-03-29T13:15:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.