IBRNet: Learning Multi-View Image-Based Rendering
- URL: http://arxiv.org/abs/2102.13090v1
- Date: Thu, 25 Feb 2021 18:56:21 GMT
- Title: IBRNet: Learning Multi-View Image-Based Rendering
- Authors: Qianqian Wang, Zhicheng Wang, Kyle Genova, Pratul Srinivasan, Howard
Zhou, Jonathan T. Barron, Ricardo Martin-Brualla, Noah Snavely, Thomas
Funkhouser
- Abstract summary: We present a method that synthesizes novel views of complex scenes by interpolating a sparse set of nearby views.
By drawing on source views at render time, our method hearkens back to classic work on image-based rendering.
- Score: 67.15887251196894
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a method that synthesizes novel views of complex scenes by
interpolating a sparse set of nearby views. The core of our method is a network
architecture that includes a multilayer perceptron and a ray transformer that
estimates radiance and volume density at continuous 5D locations (3D spatial
locations and 2D viewing directions), drawing appearance information on the fly
from multiple source views. By drawing on source views at render time, our
method hearkens back to classic work on image-based rendering (IBR), and allows
us to render high-resolution imagery. Unlike neural scene representation work
that optimizes per-scene functions for rendering, we learn a generic view
interpolation function that generalizes to novel scenes. We render images using
classic volume rendering, which is fully differentiable and allows us to train
using only multi-view posed images as supervision. Experiments show that our
method outperforms recent novel view synthesis methods that also seek to
generalize to novel scenes. Further, if fine-tuned on each scene, our method is
competitive with state-of-the-art single-scene neural rendering methods.
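To make the per-ray computation described in the abstract concrete, the following is a minimal NumPy sketch: features gathered from several source views are pooled, passed through a toy self-attention step standing in for the ray transformer, mapped to density and color, and alpha-composited with classic volume rendering. The shapes, the plain mean pooling, and the random projection "heads" are illustrative assumptions, not the paper's learned components.

```python
# Minimal NumPy sketch of an IBRNet-style rendering loop (illustrative only).
import numpy as np

rng = np.random.default_rng(0)

N_SAMPLES = 64   # points sampled along one camera ray
N_VIEWS = 8      # nearby source views supplying appearance features
FEAT_DIM = 16    # per-view feature dimension (hypothetical)

# 1) Per-sample features gathered by projecting each ray sample into the
#    source views (stand-in random values here).
src_feats = rng.normal(size=(N_SAMPLES, N_VIEWS, FEAT_DIM))

# 2) Pool across source views (the paper uses learned, visibility-aware
#    weighting; a plain mean is used here as a placeholder).
pooled = src_feats.mean(axis=1)                        # (N_SAMPLES, FEAT_DIM)

# 3) "Ray transformer" stand-in: self-attention over the samples along the
#    ray, so each sample's density can depend on the whole ray.
def self_attention(x):
    scores = x @ x.T / np.sqrt(x.shape[-1])            # (N_SAMPLES, N_SAMPLES)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

attended = self_attention(pooled)

# 4) Toy heads mapping features to density and color (random projections
#    standing in for the learned MLP).
W_sigma = rng.normal(size=(FEAT_DIM, 1))
W_rgb = rng.normal(size=(FEAT_DIM, 3))
sigma = np.log1p(np.exp(attended @ W_sigma)).squeeze(-1)   # softplus density
rgb = 1.0 / (1.0 + np.exp(-(attended @ W_rgb)))            # sigmoid colors

# 5) Classic volume rendering: alpha-composite colors front to back.
deltas = np.full(N_SAMPLES, 1.0 / N_SAMPLES)               # sample spacing
alpha = 1.0 - np.exp(-sigma * deltas)
transmittance = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
pixel_color = (transmittance * alpha)[:, None] * rgb
print("rendered pixel:", pixel_color.sum(axis=0))
```

The compositing in the last step is the standard discretization C = sum_i T_i * (1 - exp(-sigma_i * delta_i)) * c_i with T_i = prod_{j<i} exp(-sigma_j * delta_j); because every step is differentiable, the whole pipeline can be trained from posed multi-view images alone, as the abstract states.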
Related papers
- Learning to Render Novel Views from Wide-Baseline Stereo Pairs [26.528667940013598]
We introduce a method for novel view synthesis given only a single wide-baseline stereo image pair.
Existing approaches to novel view synthesis from sparse observations fail because they recover incorrect 3D geometry.
We propose an efficient, image-space epipolar line sampling scheme to assemble image features for a target ray (the projection step behind such epipolar sampling is sketched after this list).
arXiv Detail & Related papers (2023-04-17T17:40:52Z)
- MAIR: Multi-view Attention Inverse Rendering with 3D Spatially-Varying Lighting Estimation [13.325800282424598]
We propose a scene-level inverse rendering framework that uses multi-view images to decompose the scene into geometry, an SVBRDF, and 3D spatially-varying lighting.
Our experiments show that the proposed method not only achieves better performance than single-view-based methods, but also achieves robust performance on unseen real-world scenes.
arXiv Detail & Related papers (2023-03-22T08:07:28Z)
- Multi-Plane Neural Radiance Fields for Novel View Synthesis [5.478764356647437]
Novel view synthesis is a long-standing problem that revolves around rendering frames of scenes from novel camera viewpoints.
In this work, we examine the performance, generalization, and efficiency of single-view multi-plane neural radiance fields.
We propose a new multiplane NeRF architecture that accepts multiple views to improve the synthesis results and expand the viewing range.
arXiv Detail & Related papers (2023-03-03T06:32:55Z)
- Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering.
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z)
- Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations [48.05445941939446]
A classical problem in computer vision is to infer a 3D scene representation from few images that can be used to render novel views at interactive rates.
We propose the Scene Representation Transformer (SRT), a method which processes posed or unposed RGB images of a new area.
We show that this method outperforms recent baselines in terms of PSNR and speed on synthetic datasets.
arXiv Detail & Related papers (2021-11-25T16:18:56Z)
- Extracting Triangular 3D Models, Materials, and Lighting From Images [59.33666140713829]
We present an efficient method for joint optimization of materials and lighting from multi-view image observations.
We leverage meshes with spatially-varying materials and environment lighting that can be deployed in any traditional graphics engine.
arXiv Detail & Related papers (2021-11-24T13:58:20Z)
- Neural Radiance Flow for 4D View Synthesis and Video Processing [59.9116932930108]
We present a method to learn a 4D spatial-temporal representation of a dynamic scene from a set of RGB images.
Key to our approach is the use of a neural implicit representation that learns to capture the 3D occupancy, radiance, and dynamics of the scene.
arXiv Detail & Related papers (2020-12-17T17:54:32Z)
- NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis [78.5281048849446]
We present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes.
Our algorithm represents a scene using a fully-connected (non-convolutional) deep network.
Because volume rendering is naturally differentiable, the only input required to optimize our representation is a set of images with known camera poses.
arXiv Detail & Related papers (2020-03-19T17:57:23Z)
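Both IBRNet's on-the-fly gathering of appearance from source views and the epipolar line sampling mentioned in the wide-baseline stereo entry above rest on the same geometric step: points sampled along a target ray are projected into a source camera, tracing out that ray's epipolar line in the source image. The sketch below illustrates this with a pinhole source camera whose intrinsics, pose, and ray are hypothetical values, not taken from any of the papers listed here.

```python
# Minimal sketch: project samples along a target ray into a source view.
# The camera intrinsics/pose and the ray itself are hypothetical values.
import numpy as np

# Hypothetical pinhole intrinsics of the source camera (fx, fy, cx, cy).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

# Hypothetical world-to-source-camera pose: rotation R and translation t.
R = np.eye(3)
t = np.array([0.2, 0.0, 0.0])      # source camera shifted 0.2 units along x

# Target ray in world coordinates: origin o, unit direction d.
o = np.array([0.0, 0.0, 0.0])
d = np.array([0.0, 0.0, 1.0])

# Sample depths along the target ray and project each 3D point.
depths = np.linspace(0.5, 4.0, 16)
points = o + depths[:, None] * d                 # (16, 3) world-space samples
cam_pts = (R @ points.T).T + t                   # into the source camera frame
uv_hom = (K @ cam_pts.T).T                       # homogeneous pixel coordinates
uv = uv_hom[:, :2] / uv_hom[:, 2:3]              # perspective divide

# uv traces the target ray's epipolar line in the source image; per-sample
# features would be gathered at these pixel locations, e.g. by bilinear
# interpolation of a source-view feature map.
print(uv[:3])
```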
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.