Scene Representation Transformer: Geometry-Free Novel View Synthesis
Through Set-Latent Scene Representations
- URL: http://arxiv.org/abs/2111.13152v2
- Date: Mon, 29 Nov 2021 09:54:01 GMT
- Title: Scene Representation Transformer: Geometry-Free Novel View Synthesis
Through Set-Latent Scene Representations
- Authors: Mehdi S. M. Sajjadi and Henning Meyer and Etienne Pot and Urs Bergmann
and Klaus Greff and Noha Radwan and Suhani Vora and Mario Lucic and Daniel
Duckworth and Alexey Dosovitskiy and Jakob Uszkoreit and Thomas Funkhouser
and Andrea Tagliasacchi
- Abstract summary: A classical problem in computer vision is to infer a 3D scene representation from few images that can be used to render novel views at interactive rates.
We propose the Scene Representation Transformer (SRT), a method which processes posed or unposed RGB images of a new area.
We show that this method outperforms recent baselines in terms of PSNR and speed on synthetic datasets.
- Score: 48.05445941939446
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A classical problem in computer vision is to infer a 3D scene representation
from few images that can be used to render novel views at interactive rates.
Previous work focuses on reconstructing pre-defined 3D representations, e.g.
textured meshes, or implicit representations, e.g. radiance fields, and often
requires input images with precise camera poses and long processing times for
each novel scene.
In this work, we propose the Scene Representation Transformer (SRT), a method
which processes posed or unposed RGB images of a new area, infers a "set-latent
scene representation", and synthesises novel views, all in a single
feed-forward pass. To calculate the scene representation, we propose a
generalization of the Vision Transformer to sets of images, enabling global
information integration, and hence 3D reasoning. An efficient decoder
transformer parameterizes the light field by attending into the scene
representation to render novel views. Learning is supervised end-to-end by
minimizing a novel-view reconstruction error.
We show that this method outperforms recent baselines in terms of PSNR and
speed on synthetic datasets, including a new dataset created for the paper.
Further, we demonstrate that SRT scales to support interactive visualization
and semantic segmentation of real-world outdoor environments using Street View
imagery.
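
Read literally, the abstract specifies a two-stage transformer: an encoder that pools patch tokens from all input views into a "set-latent scene representation", and a light-field decoder that cross-attends from a query-ray embedding into that set to emit a colour, trained end-to-end with a novel-view reconstruction loss. The PyTorch sketch below mirrors only that structure; every module size, the patch embedding, and the (origin, direction) ray encoding are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the encoder/decoder structure described in the SRT
# abstract. Sizes, patch embedding, and ray encoding are assumptions.
import torch
import torch.nn as nn

class SetLatentSceneEncoder(nn.Module):
    """Maps patch tokens from all input views to a set-latent scene representation."""
    def __init__(self, dim=256, depth=4, heads=8, patch=16):
        super().__init__()
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, images):                            # images: (B, V, 3, H, W)
        B, V, C, H, W = images.shape
        tokens = self.patch_embed(images.flatten(0, 1))   # (B*V, dim, h, w)
        tokens = tokens.flatten(2).transpose(1, 2)        # (B*V, h*w, dim)
        tokens = tokens.reshape(B, -1, tokens.shape[-1])  # concatenate all views into one set
        return self.encoder(tokens)                       # (B, V*h*w, dim)

class LightFieldDecoder(nn.Module):
    """Cross-attends from a query-ray embedding into the scene representation."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.ray_embed = nn.Linear(6, dim)                # ray = (origin, direction)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.to_rgb = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 3))

    def forward(self, rays, scene):                       # rays: (B, R, 6), scene: (B, N, dim)
        q = self.ray_embed(rays)
        out, _ = self.attn(q, scene, scene)
        return self.to_rgb(out)                           # (B, R, 3) predicted colours

# End-to-end supervision: a novel-view reconstruction loss, as in the abstract.
encoder, decoder = SetLatentSceneEncoder(), LightFieldDecoder()
images = torch.rand(1, 5, 3, 64, 64)                      # five input views of a scene
rays = torch.rand(1, 128, 6)                              # query rays of a held-out view
target_rgb = torch.rand(1, 128, 3)
loss = ((decoder(rays, encoder(images)) - target_rgb) ** 2).mean()
```

Because the decoder queries the representation per ray rather than marching through an explicit 3D volume, rendering amounts to a single feed-forward pass, which is consistent with the interactive rates claimed in the abstract.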
Related papers
- ReShader: View-Dependent Highlights for Single Image View-Synthesis [5.736642774848791]
We propose to split the view synthesis process into two independent tasks of pixel reshading and relocation.
During the reshading process, we take the single image as the input and adjust its shading based on the novel camera.
This reshaded image is then used as the input to an existing view synthesis method to relocate the pixels and produce the final novel view image.
arXiv Detail & Related papers (2023-09-19T15:23:52Z)
- SAMPLING: Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image [60.52991173059486]
We introduce SAMPLING, a Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image.
Our method demonstrates considerable performance gains on large-scale unbounded outdoor scenes from the KITTI dataset, using only a single input image.
arXiv Detail & Related papers (2023-09-12T15:33:09Z)
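
The SAMPLING entry above (and the Remote Sensing entry further down) builds on multiplane images: a stack of fronto-parallel RGBA planes composited with the "over" operator to form a view. The NumPy sketch below shows only that generic compositing step; SAMPLING's scene-adaptive, hierarchical plane placement is not reproduced here.

```python
# Standard back-to-front "over" compositing of a multiplane image (MPI).
# This is the generic MPI rendering step, not SAMPLING's adaptive hierarchy.
import numpy as np

def composite_mpi(rgba_planes):
    """rgba_planes: (D, H, W, 4) ordered from nearest to farthest plane."""
    out = np.zeros(rgba_planes.shape[1:3] + (3,))
    for plane in rgba_planes[::-1]:                    # iterate far -> near
        rgb, alpha = plane[..., :3], plane[..., 3:4]
        out = rgb * alpha + out * (1.0 - alpha)        # "over" operator
    return out

planes = np.random.rand(32, 64, 64, 4)                 # 32 RGBA planes
image = composite_mpi(planes)                          # (64, 64, 3) rendered view
```

Rendering an actual novel view additionally warps each plane into the target camera with a per-plane homography before compositing.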
- Learning to Render Novel Views from Wide-Baseline Stereo Pairs [26.528667940013598]
We introduce a method for novel view synthesis given only a single wide-baseline stereo image pair.
Existing approaches to novel view synthesis from sparse observations fail because they recover incorrect 3D geometry.
We propose an efficient, image-space epipolar line sampling scheme to assemble image features for a target ray.
arXiv Detail & Related papers (2023-04-17T17:40:52Z)
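
The key mechanism named in the wide-baseline stereo entry above is image-space epipolar line sampling: points along the target ray project onto a line in each source image, and image features are gathered at those pixels. The sketch below shows that projection under assumed pinhole conventions (world-to-camera R, t and intrinsics K); the feature aggregation network itself is not reproduced.

```python
# Sampling a source image along the epipolar line of a target ray: points on
# the ray project onto the epipolar line, giving pixel locations whose
# features can be gathered for that ray. Camera conventions here are assumed.
import numpy as np

def epipolar_samples(ray_o, ray_d, K_src, R_src, t_src, depths):
    """Project points ray_o + d * ray_d (world frame) into a source camera.

    K_src: (3,3) intrinsics; R_src, t_src: world-to-camera rotation/translation.
    Returns (len(depths), 2) pixel coordinates lying on the epipolar line.
    """
    pts_world = ray_o[None, :] + depths[:, None] * ray_d[None, :]
    pts_cam = pts_world @ R_src.T + t_src[None, :]
    pix = pts_cam @ K_src.T
    return pix[:, :2] / pix[:, 2:3]

ray_o = np.zeros(3)
ray_d = np.array([0.0, 0.0, 1.0])
K = np.array([[100.0, 0, 32], [0, 100.0, 32], [0, 0, 1]])
R = np.eye(3)
t = np.array([0.2, 0.0, 0.0])                          # source camera offset by a baseline
uv = epipolar_samples(ray_o, ray_d, K, R, t, np.linspace(0.5, 5.0, 16))
```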
- Neural Radiance Transfer Fields for Relightable Novel-view Synthesis with Global Illumination [63.992213016011235]
We propose a method for scene relighting under novel views by learning a neural precomputed radiance transfer function.
Our method can be supervised solely on a set of real images of the scene under a single unknown lighting condition.
Results show that the recovered disentanglement of scene parameters improves significantly over the current state of the art.
arXiv Detail & Related papers (2022-07-27T16:07:48Z)
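
The relighting entry above learns a neural precomputed radiance transfer (PRT) function. In classical diffuse PRT, sketched below for reference, each surface point stores transfer coefficients (visibility- and cosine-weighted) in the same basis as the environment lighting, so relighting reduces to a matrix-vector product; the paper's contribution of learning a neural counterpart from images under a single unknown illumination is not shown here.

```python
# Classical diffuse precomputed radiance transfer (PRT): per-point outgoing
# radiance is a dot product between stored transfer coefficients and the
# environment lighting expressed in the same (e.g. spherical-harmonic) basis.
# The paper replaces the stored coefficients with a learned neural function.
import numpy as np

def prt_shade(transfer, lighting, albedo):
    """transfer: (P, K) visibility/cosine-weighted transfer coefficients per point.
    lighting: (K, 3) RGB environment-light coefficients in the same basis.
    albedo: (P, 3) diffuse albedo. Returns (P, 3) outgoing radiance."""
    return albedo * (transfer @ lighting)

transfer = np.random.rand(1000, 9)        # 9 coefficients (SH order 2) per point
lighting = np.random.rand(9, 3)
albedo = np.random.rand(1000, 3)
radiance = prt_shade(transfer, lighting, albedo)   # relight instantly under new lighting
```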
- Neural Groundplans: Persistent Neural Scene Representations from a Single Image [90.04272671464238]
We present a method to map 2D image observations of a scene to a persistent 3D scene representation.
We propose conditional neural groundplans as persistent and memory-efficient scene representations.
arXiv Detail & Related papers (2022-07-22T17:41:24Z)
- Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering.
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z)
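
The single-image ViT entry above decodes its 3D representation with an MLP and volume rendering. The "volume rendering" in question is the usual NeRF-style quadrature: per-sample density and colour along a ray are alpha-composited, as sketched below with placeholder values standing in for the MLP outputs.

```python
# Standard volume-rendering quadrature used by NeRF-style methods: an MLP
# predicts density sigma and colour c at samples along a ray, and the samples
# are alpha-composited. The arrays below are placeholders for MLP outputs.
import numpy as np

def volume_render(sigmas, colors, deltas):
    """sigmas: (S,) densities, colors: (S, 3), deltas: (S,) sample spacings."""
    alphas = 1.0 - np.exp(-sigmas * deltas)
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))  # transmittance T_i
    weights = trans * alphas
    return (weights[:, None] * colors).sum(axis=0)      # expected colour along the ray

sigmas = np.random.rand(64) * 5.0
colors = np.random.rand(64, 3)
deltas = np.full(64, 4.0 / 64)                          # uniform samples over a 4-unit ray
rgb = volume_render(sigmas, colors, deltas)
```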
- Remote Sensing Novel View Synthesis with Implicit Multiplane Representations [26.33490094119609]
We propose a novel remote sensing view synthesis method by leveraging the recent advances in implicit neural representations.
Because remote sensing images are captured from overhead viewpoints at large distances, we represent the 3D space by combining an implicit multiplane image (MPI) representation with deep neural networks.
Images from any novel views can be freely rendered on the basis of the reconstructed model.
arXiv Detail & Related papers (2022-05-18T13:03:55Z)
- IBRNet: Learning Multi-View Image-Based Rendering [67.15887251196894]
We present a method that synthesizes novel views of complex scenes by interpolating a sparse set of nearby views.
By drawing on source views at render time, our method hearkens back to classic work on image-based rendering.
arXiv Detail & Related papers (2021-02-25T18:56:21Z)
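
IBRNet's summary points back to classic image-based rendering: colour for a target pixel is assembled at render time from nearby source views. The sketch below shows the classical heuristic version, blending reprojected source colours with weights that favour cameras whose viewing direction agrees with the target ray; IBRNet instead predicts these blend weights (and per-sample densities) with a network.

```python
# Classic image-based rendering blend: a surface point is projected into each
# nearby source view, the sampled colours are combined with weights that
# favour source cameras whose viewing direction agrees with the target ray.
# This is the heuristic baseline idea, not IBRNet's learned aggregation.
import numpy as np

def ibr_blend(point, target_origin, src_origins, src_colors):
    """point: (3,) surface point; src_colors: (V, 3) colours sampled at its
    projection in each of V source views; origins: camera centres."""
    d_target = point - target_origin
    d_target = d_target / np.linalg.norm(d_target)
    d_src = point - src_origins                                  # (V, 3)
    d_src = d_src / np.linalg.norm(d_src, axis=1, keepdims=True)
    weights = np.maximum(d_src @ d_target, 1e-3)                 # angular agreement
    weights = weights / weights.sum()
    return weights @ src_colors                                  # blended colour

point = np.array([0.0, 0.0, 2.0])
target_origin = np.zeros(3)
src_origins = np.array([[0.3, 0.0, 0.0], [-0.3, 0.0, 0.0], [0.0, 0.5, 0.0]])
src_colors = np.random.rand(3, 3)
rgb = ibr_blend(point, target_origin, src_origins, src_colors)
```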