Generalizable Neural Radiance Fields for Novel View Synthesis with
Transformer
- URL: http://arxiv.org/abs/2206.05375v1
- Date: Fri, 10 Jun 2022 23:16:43 GMT
- Title: Generalizable Neural Radiance Fields for Novel View Synthesis with
Transformer
- Authors: Dan Wang, Xinrui Cui, Septimiu Salcudean, and Z. Jane Wang
- Abstract summary: We propose a Transformer-based NeRF (TransNeRF) to learn a generic neural radiance field conditioned on observed-view images.
Experiments demonstrate that our TransNeRF, trained on a wide variety of scenes, can achieve better performance in comparison to state-of-the-art image-based neural rendering methods.
- Score: 23.228142134527292
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a Transformer-based NeRF (TransNeRF) to learn a generic neural
radiance field conditioned on observed-view images for the novel view synthesis
task. In contrast, existing MLP-based NeRFs cannot directly take in an arbitrary
number of observed views and require an auxiliary pooling-based operation to fuse
source-view information, which discards complex relationships between the source
views and the target rendering view. Furthermore, current approaches process each
3D point individually and ignore the local consistency of a radiance field scene
representation. These limitations can reduce performance in challenging
real-world applications where large differences between the source views and a
novel rendering view may exist. To address these challenges, TransNeRF uses the
attention mechanism to naturally decode deep associations among an arbitrary
number of source views into a coordinate-based scene representation. Local
consistency of shape and appearance is considered in both the ray-cast space and
the surrounding-view space within a unified Transformer network. Experiments
demonstrate that TransNeRF, trained on a wide variety of scenes, outperforms
state-of-the-art image-based neural rendering methods in both scene-agnostic and
per-scene finetuning settings, especially when there is a considerable gap
between the source views and the rendering view.
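As a rough illustration of the view-aggregation idea described in the abstract, the sketch below lets each query 3D point cross-attend over features from an arbitrary number of source views. The module names, feature sizes, and output head are assumptions made for the example, not the authors' released code.

```python
# Minimal sketch (assumption, not the TransNeRF implementation): a query point
# cross-attends over per-view features, so any number of source views can be
# fused without a fixed pooling operation.
import torch
import torch.nn as nn

class ViewAttentionAggregator(nn.Module):
    def __init__(self, feat_dim=64, num_heads=4):
        super().__init__()
        self.point_enc = nn.Linear(3, feat_dim)            # embed the query 3D coordinate
        self.attn = nn.MultiheadAttention(feat_dim, num_heads, batch_first=True)
        self.head = nn.Linear(feat_dim, 4)                 # predict (R, G, B, sigma)

    def forward(self, points, view_feats):
        # points: (N, 3) query coordinates; view_feats: (N, V, feat_dim) with
        # V an arbitrary number of source views.
        q = self.point_enc(points).unsqueeze(1)            # (N, 1, feat_dim)
        fused, _ = self.attn(q, view_feats, view_feats)    # cross-attention over views
        return self.head(fused.squeeze(1))                 # (N, 4)

# Example: 1024 query points observed by 5 source views.
agg = ViewAttentionAggregator()
print(agg(torch.randn(1024, 3), torch.randn(1024, 5, 64)).shape)  # torch.Size([1024, 4])
```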
Related papers
- CMC: Few-shot Novel View Synthesis via Cross-view Multiplane Consistency [18.101763989542828]
We propose a simple yet effective method that explicitly builds depth-aware consistency across input views.
Our key insight is that by forcing the same spatial points to be sampled repeatedly in different input views, we are able to strengthen the interactions between views.
Although simple, extensive experiments demonstrate that our proposed method can achieve better synthesis quality than state-of-the-art methods.
arXiv Detail & Related papers (2024-02-26T09:04:04Z)
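A toy version of the repeated-sampling idea in the CMC entry above: project the same 3D points into every input view and penalize disagreement between the per-view predictions. The pinhole projection and the `predict_color` callback are illustrative assumptions; the actual method builds depth-aware multiplane consistency rather than this toy loss.

```python
# Toy cross-view consistency sketch (assumption, not the CMC implementation).
import torch

def project(points, K, w2c):
    """Project world points (N, 3) to pixels using intrinsics K (3, 3) and a
    world-to-camera matrix w2c (4, 4)."""
    homog = torch.cat([points, torch.ones_like(points[:, :1])], dim=-1)   # (N, 4)
    cam = (w2c @ homog.T).T[:, :3]                                        # camera coordinates
    pix = (K @ cam.T).T
    return pix[:, :2] / pix[:, 2:3]                                       # (N, 2) pixel positions

def cross_view_consistency(points, views, predict_color):
    """views: list of (K, w2c, image); predict_color: hypothetical callback
    returning (N, 3) colors for given pixel locations in one view."""
    preds = torch.stack([predict_color(img, project(points, K, w2c))
                         for K, w2c, img in views])          # (num_views, N, 3)
    mean = preds.mean(dim=0, keepdim=True)
    return ((preds - mean) ** 2).mean()                      # encourage agreement across views
```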
- Local Implicit Ray Function for Generalizable Radiance Field Representation [20.67358742158244]
We propose LIRF (Local Implicit Ray Function), a generalizable neural rendering approach for novel view rendering.
LIRF takes 3D coordinates within conical frustums and the features of those frustums as inputs and predicts a local volumetric radiance field.
Since the coordinates are continuous, LIRF renders high-quality novel views at a continuously-valued scale via volume rendering.
arXiv Detail & Related papers (2023-04-25T11:52:33Z)
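Reading the LIRF summary above literally, the interface might look like the sketch below: a small network maps a continuous 3D coordinate plus conical-frustum features to a local radiance-field sample. The layer sizes and feature dimension are made up for illustration.

```python
# Interface-level sketch (assumption, not the LIRF architecture).
import torch
import torch.nn as nn

class LocalImplicitRayFunction(nn.Module):
    def __init__(self, frustum_dim=32, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + frustum_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),                 # (R, G, B, sigma) at the query point
        )

    def forward(self, coords, frustum_feats):
        # coords: (N, 3) continuous positions inside conical frustums;
        # frustum_feats: (N, frustum_dim) features of those frustums.
        return self.mlp(torch.cat([coords, frustum_feats], dim=-1))
```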
- IntrinsicNeRF: Learning Intrinsic Neural Radiance Fields for Editable Novel View Synthesis [90.03590032170169]
We present intrinsic neural radiance fields, dubbed IntrinsicNeRF, which introduce intrinsic decomposition into the NeRF-based neural rendering method.
Our experiments and editing samples on both object-specific/room-scale scenes and synthetic/real-world data demonstrate that we can obtain consistent intrinsic decomposition results.
arXiv Detail & Related papers (2022-10-02T22:45:11Z)
- Cascaded and Generalizable Neural Radiance Fields for Fast View Synthesis [35.035125537722514]
We present CG-NeRF, a cascade and generalizable neural radiance fields method for view synthesis.
We first train CG-NeRF on multiple 3D scenes of the DTU dataset.
We show that CG-NeRF outperforms state-of-the-art generalizable neural rendering methods on various synthetic and real datasets.
arXiv Detail & Related papers (2022-08-09T12:23:48Z)
- Is Attention All NeRF Needs? [103.51023982774599]
Generalizable NeRF Transformer (GNT) is a pure, unified transformer-based architecture that efficiently reconstructs Neural Radiance Fields (NeRFs) on the fly from source views.
GNT achieves generalizable neural scene representation and rendering by encapsulating two transformer-based stages.
arXiv Detail & Related papers (2022-07-27T05:09:54Z)
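The GNT summary above names two transformer-based stages without spelling them out. Assuming, from the paper rather than this summary, that one stage fuses source views per sample point and the other fuses samples along each ray, their composition could be sketched as follows; the layer types and pooling are illustrative only.

```python
# Two-stage transformer sketch (assumption about GNT's structure).
import torch
import torch.nn as nn

class TwoStageRenderer(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.view_stage = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.ray_stage = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.to_rgb = nn.Linear(dim, 3)

    def forward(self, feats):
        # feats: (rays, samples, views, dim) features of each ray sample as seen
        # from each source view.
        R, S, V, D = feats.shape
        per_point = self.view_stage(feats.reshape(R * S, V, D)).mean(dim=1)   # fuse views
        per_ray = self.ray_stage(per_point.reshape(R, S, D)).mean(dim=1)      # fuse samples
        return self.to_rgb(per_ray)                                           # (rays, 3)
```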
- Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering.
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z)
- InfoNeRF: Ray Entropy Minimization for Few-Shot Neural Volume Rendering [55.70938412352287]
We present an information-theoretic regularization technique for few-shot novel view synthesis based on neural implicit representation.
The proposed approach minimizes potential reconstruction inconsistency that happens due to insufficient viewpoints.
We achieve consistently improved performance compared to existing neural view synthesis methods by large margins on multiple standard benchmarks.
arXiv Detail & Related papers (2021-12-31T11:56:01Z)
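The regularizer named in the InfoNeRF entry above can be sketched as the entropy of the normalized volume-rendering weights along each ray; the exact formulation in the paper may differ, so treat this as an assumed form.

```python
# Sketch of a ray-entropy regularizer (assumed form).
import torch

def ray_entropy(weights, eps=1e-10):
    """weights: (num_rays, num_samples) volume-rendering weights of the samples
    along each ray."""
    p = weights / (weights.sum(dim=-1, keepdim=True) + eps)   # normalize to a distribution
    return -(p * torch.log(p + eps)).sum(dim=-1).mean()       # mean Shannon entropy

# Usage: total_loss = reconstruction_loss + lambda_entropy * ray_entropy(weights)
```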
- MVSNeRF: Fast Generalizable Radiance Field Reconstruction from Multi-View Stereo [52.329580781898116]
We present MVSNeRF, a novel neural rendering approach that can efficiently reconstruct neural radiance fields for view synthesis.
Unlike prior works on neural radiance fields that consider per-scene optimization on densely captured images, we propose a generic deep neural network that can reconstruct radiance fields from only three nearby input views via fast network inference.
arXiv Detail & Related papers (2021-03-29T13:15:23Z)
- IBRNet: Learning Multi-View Image-Based Rendering [67.15887251196894]
We present a method that synthesizes novel views of complex scenes by interpolating a sparse set of nearby views.
By drawing on source views at render time, our method hearkens back to classic work on image-based rendering.
arXiv Detail & Related papers (2021-02-25T18:56:21Z)
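As a generic illustration of drawing on source views at render time, in the spirit of the IBRNet entry above but not its actual network, a sample point's color can be a learned blend of the colors it projects to in nearby source views. The feature dimension and weight network below are assumptions for the example.

```python
# Generic image-based-rendering blend sketch (assumption, not IBRNet's model).
import torch
import torch.nn as nn

class SourceViewBlender(nn.Module):
    def __init__(self, feat_dim=32):
        super().__init__()
        self.weight_net = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, src_colors, src_feats):
        # src_colors: (N, V, 3) colors fetched from V nearby source views;
        # src_feats:  (N, V, feat_dim) matching per-view features.
        w = torch.softmax(self.weight_net(src_feats), dim=1)   # (N, V, 1) blend weights
        return (w * src_colors).sum(dim=1)                     # (N, 3) blended color
```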