Learning to Render Novel Views from Wide-Baseline Stereo Pairs
- URL: http://arxiv.org/abs/2304.08463v1
- Date: Mon, 17 Apr 2023 17:40:52 GMT
- Title: Learning to Render Novel Views from Wide-Baseline Stereo Pairs
- Authors: Yilun Du, Cameron Smith, Ayush Tewari, Vincent Sitzmann
- Abstract summary: We introduce a method for novel view synthesis given only a single wide-baseline stereo image pair.
Existing approaches to novel view synthesis from sparse observations fail due to recovering incorrect 3D geometry.
We propose an efficient, image-space epipolar line sampling scheme to assemble image features for a target ray.
- Score: 26.528667940013598
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: We introduce a method for novel view synthesis given only a single
wide-baseline stereo image pair. In this challenging regime, 3D scene points
are regularly observed only once, requiring prior-based reconstruction of scene
geometry and appearance. We find that existing approaches to novel view
synthesis from sparse observations fail due to recovering incorrect 3D geometry
and due to the high cost of differentiable rendering that precludes their
scaling to large-scale training. We take a step towards resolving these
shortcomings by formulating a multi-view transformer encoder, proposing an
efficient, image-space epipolar line sampling scheme to assemble image features
for a target ray, and a lightweight cross-attention-based renderer. Our
contributions enable training of our method on a large-scale real-world dataset
of indoor and outdoor scenes. We demonstrate that our method learns powerful
multi-view geometry priors while reducing the rendering time. We conduct
extensive comparisons on held-out test scenes across two real-world datasets,
significantly outperforming prior work on novel view synthesis from sparse
image observations and achieving multi-view-consistent novel view synthesis.
Related papers
- FreeSplat: Generalizable 3D Gaussian Splatting Towards Free-View Synthesis of Indoor Scenes [50.534213038479926]
FreeSplat is capable of reconstructing geometrically consistent 3D scenes from long sequence input towards free-view synthesis.
We propose a simple but effective free-view training strategy that ensures robust view synthesis across broader view range regardless of the number of views.
arXiv Detail & Related papers (2024-05-28T08:40:14Z) - G-NeRF: Geometry-enhanced Novel View Synthesis from Single-View Images [45.66479596827045]
We propose a Geometry-enhanced NeRF (G-NeRF), which seeks to enhance the geometry priors by a geometry-guided multi-view synthesis approach.
To tackle the absence of multi-view supervision for single-view images, we design the depth-aware training approach.
arXiv Detail & Related papers (2024-04-11T04:58:18Z) - Multi-Plane Neural Radiance Fields for Novel View Synthesis [5.478764356647437]
Novel view synthesis is a long-standing problem that revolves around rendering frames of scenes from novel camera viewpoints.
In this work, we examine the performance, generalization, and efficiency of single-view multi-plane neural radiance fields.
We propose a new multiplane NeRF architecture that accepts multiple views to improve the synthesis results and expand the viewing range.
arXiv Detail & Related papers (2023-03-03T06:32:55Z) - High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z) - Vision Transformer for NeRF-Based View Synthesis from a Single Input
Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering.
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z) - Extracting Triangular 3D Models, Materials, and Lighting From Images [59.33666140713829]
We present an efficient method for joint optimization of materials and lighting from multi-view image observations.
We leverage meshes with spatially-varying materials and environment that can be deployed in any traditional graphics engine.
arXiv Detail & Related papers (2021-11-24T13:58:20Z) - IBRNet: Learning Multi-View Image-Based Rendering [67.15887251196894]
We present a method that synthesizes novel views of complex scenes by interpolating a sparse set of nearby views.
By drawing on source views at render time, our method hearkens back to classic work on image-based rendering.
arXiv Detail & Related papers (2021-02-25T18:56:21Z) - Deep 3D Capture: Geometry and Reflectance from Sparse Multi-View Images [59.906948203578544]
We introduce a novel learning-based method to reconstruct the high-quality geometry and complex, spatially-varying BRDF of an arbitrary object.
We first estimate per-view depth maps using a deep multi-view stereo network.
These depth maps are used to coarsely align the different views.
We propose a novel multi-view reflectance estimation network architecture.
arXiv Detail & Related papers (2020-03-27T21:28:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.