Multi-Plane Neural Radiance Fields for Novel View Synthesis
- URL: http://arxiv.org/abs/2303.01736v1
- Date: Fri, 3 Mar 2023 06:32:55 GMT
- Title: Multi-Plane Neural Radiance Fields for Novel View Synthesis
- Authors: Youssef Abdelkareem, Shady Shehata, Fakhri Karray
- Abstract summary: Novel view synthesis is a long-standing problem that revolves around rendering frames of scenes from novel camera viewpoints.
In this work, we examine the performance, generalization, and efficiency of single-view multi-plane neural radiance fields.
We propose a new multiplane NeRF architecture that accepts multiple views to improve the synthesis results and expand the viewing range.
- Score: 5.478764356647437
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Novel view synthesis is a long-standing problem that revolves around
rendering frames of scenes from novel camera viewpoints. Volumetric approaches
provide a solution for modeling occlusions through the explicit 3D
representation of the camera frustum. Multi-plane Images (MPI) are volumetric
methods that represent the scene using front-parallel planes at distinct depths
but suffer from depth discretization leading to a 2.5D scene representation.
Another line of work relies on implicit 3D scene representations. Neural
Radiance Fields (NeRF) utilize neural networks for encapsulating the continuous
3D scene structure within the network weights achieving photorealistic
synthesis results; however, these methods are constrained to per-scene optimization
settings which are inefficient in practice. Multi-plane Neural Radiance Fields
(MINE) open the door for combining implicit and explicit scene representations.
It enables continuous 3D scene representations, especially in the depth
dimension, while utilizing the input image features to avoid per-scene
optimization. The main drawback of existing work in this domain
is its restriction to single-view input, which limits the synthesis ability to
narrow viewpoint ranges. In this work, we thoroughly examine the performance,
generalization, and efficiency of single-view multi-plane neural radiance
fields. In addition, we propose a new multiplane NeRF architecture that accepts
multiple views to improve the synthesis results and expand the viewing range.
Features from the input source frames are effectively fused through a proposed
attention-aware fusion module to highlight important information from different
viewpoints. Experiments show the effectiveness of attention-based fusion and
the promising outcomes of our proposed method when compared to multi-view NeRF
and MPI techniques.
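The abstract describes planes placed at distinct depths combined with NeRF-style continuous rendering, but gives no formula. As a rough illustration only, the sketch below applies the standard NeRF volume-rendering quadrature to samples taken where one camera ray crosses each plane. The function name `composite_planes`, its arguments, and the choice of step size for the last plane are assumptions made for this example, not the authors' implementation.

```python
import math


def composite_planes(depths, sigmas, colors):
    """Composite per-plane samples along one ray, front to back.

    depths: increasing plane depths z_i along the ray
    sigmas: non-negative volume densities, one per plane
    colors: (r, g, b) tuples, one per plane
    Returns (rgb, expected_depth, accumulated_opacity).
    """
    n = len(depths)
    if not (n == len(sigmas) == len(colors)):
        raise ValueError("depths, sigmas and colors must have equal length")

    rgb = [0.0, 0.0, 0.0]
    expected_depth = 0.0
    transmittance = 1.0  # fraction of light that has not been absorbed yet

    for i in range(n):
        # Step to the next plane. For the last plane we reuse the previous
        # spacing (an assumption; other conventions use a very large step).
        if i + 1 < n:
            delta = depths[i + 1] - depths[i]
        elif n > 1:
            delta = depths[i] - depths[i - 1]
        else:
            delta = 1.0

        # Opacity of this plane from its density and the step length.
        alpha = 1.0 - math.exp(-sigmas[i] * delta)
        weight = transmittance * alpha

        for ch in range(3):
            rgb[ch] += weight * colors[i][ch]
        expected_depth += weight * depths[i]

        # Light that survives this plane continues to the planes behind it.
        transmittance *= 1.0 - alpha

    return tuple(rgb), expected_depth, 1.0 - transmittance
```

Calling `composite_planes([1.0, 2.0, 3.0], [0.0, 1e9, 5.0], [(1, 0, 0), (0, 1, 0), (0, 0, 1)])` returns the color of the second plane, because the first plane is empty and the second is effectively opaque, so it hides the third. Since the weights depend on continuous densities rather than fixed per-plane alpha values, planes could in principle be sampled at arbitrary depths.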
Related papers
- MSI-NeRF: Linking Omni-Depth with View Synthesis through Multi-Sphere Image aided Generalizable Neural Radiance Field [1.3162012586770577]
We introduce MSI-NeRF, which combines deep learning omnidirectional depth estimation and novel view synthesis.
We construct a multi-sphere image as a cost volume through feature extraction and warping of the input images.
Our network has the generalization ability to reconstruct unknown scenes efficiently using only four images.
arXiv Detail & Related papers (2024-03-16T07:26:50Z)
- GenLayNeRF: Generalizable Layered Representations with 3D Model Alignment for Multi-Human View Synthesis [1.6574413179773757]
GenLayNeRF is a generalizable layered scene representation for free-viewpoint rendering of multiple human subjects.
We divide the scene into multi-human layers anchored by the 3D body meshes.
We extract point-wise image-aligned and human-anchored features which are correlated and fused.
arXiv Detail & Related papers (2023-09-20T20:37:31Z)
- Learning to Render Novel Views from Wide-Baseline Stereo Pairs [26.528667940013598]
We introduce a method for novel view synthesis given only a single wide-baseline stereo image pair.
Existing approaches to novel view synthesis from sparse observations fail due to recovering incorrect 3D geometry.
We propose an efficient, image-space epipolar line sampling scheme to assemble image features for a target ray.
arXiv Detail & Related papers (2023-04-17T17:40:52Z)
- Efficient View Synthesis and 3D-based Multi-Frame Denoising with Multiplane Feature Representations [1.18885605647513]
We introduce the first 3D-based multi-frame denoising method that significantly outperforms its 2D-based counterparts with lower computational requirements.
Our method extends the multiplane image (MPI) framework for novel view synthesis by introducing a learnable encoder-renderer pair that manipulates multiplane representations in feature space.
arXiv Detail & Related papers (2023-03-31T15:23:35Z)
- Neural Volume Super-Resolution [49.879789224455436]
We propose a neural super-resolution network that operates directly on the volumetric representation of the scene.
To realize our method, we devise a novel 3D representation that hinges on multiple 2D feature planes.
We validate the proposed method by super-resolving multi-view consistent views on a diverse set of unseen 3D scenes.
arXiv Detail & Related papers (2022-12-09T04:54:13Z)
- CLONeR: Camera-Lidar Fusion for Occupancy Grid-aided Neural Representations [77.90883737693325]
This paper proposes CLONeR, which significantly improves upon NeRF by allowing it to model large outdoor driving scenes observed from sparse input sensor views.
This is achieved by decoupling occupancy and color learning within the NeRF framework into separate Multi-Layer Perceptrons (MLPs) trained using LiDAR and camera data, respectively.
In addition, this paper proposes a novel method to build differentiable 3D Occupancy Grid Maps (OGM) alongside the NeRF model, and leverage this occupancy grid for improved sampling of points along a ray for rendering in metric space.
arXiv Detail & Related papers (2022-09-02T17:44:50Z)
- Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering.
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z)
- Extracting Triangular 3D Models, Materials, and Lighting From Images [59.33666140713829]
We present an efficient method for joint optimization of materials and lighting from multi-view image observations.
We leverage meshes with spatially-varying materials and environment lighting that can be deployed in any traditional graphics engine.
arXiv Detail & Related papers (2021-11-24T13:58:20Z)
- MVSNeRF: Fast Generalizable Radiance Field Reconstruction from Multi-View Stereo [52.329580781898116]
We present MVSNeRF, a novel neural rendering approach that can efficiently reconstruct neural radiance fields for view synthesis.
Unlike prior works on neural radiance fields that consider per-scene optimization on densely captured images, we propose a generic deep neural network that can reconstruct radiance fields from only three nearby input views via fast network inference.
arXiv Detail & Related papers (2021-03-29T13:15:23Z)
- NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis [78.5281048849446]
We present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes.
Our algorithm represents a scene using a fully-connected (non-convolutional) deep network.
Because volume rendering is naturally differentiable, the only input required to optimize our representation is a set of images with known camera poses.
arXiv Detail & Related papers (2020-03-19T17:57:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.