Stable View Synthesis
- URL: http://arxiv.org/abs/2011.07233v2
- Date: Sun, 2 May 2021 12:20:20 GMT
- Title: Stable View Synthesis
- Authors: Gernot Riegler, Vladlen Koltun
- Abstract summary: We present Stable View Synthesis (SVS).
Given a set of source images depicting a scene from freely distributed viewpoints, SVS synthesizes new views of the scene.
SVS outperforms state-of-the-art view synthesis methods both quantitatively and qualitatively on three diverse real-world datasets.
- Score: 100.86844680362196
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Stable View Synthesis (SVS). Given a set of source images
depicting a scene from freely distributed viewpoints, SVS synthesizes new views
of the scene. The method operates on a geometric scaffold computed via
structure-from-motion and multi-view stereo. Each point on this 3D scaffold is
associated with view rays and corresponding feature vectors that encode the
appearance of this point in the input images. The core of SVS is view-dependent
on-surface feature aggregation, in which directional feature vectors at each 3D
point are processed to produce a new feature vector for a ray that maps this
point into the new target view. The target view is then rendered by a
convolutional network from a tensor of features synthesized in this way for all
pixels. The method is composed of differentiable modules and is trained
end-to-end. It supports spatially-varying view-dependent importance weighting
and feature transformation of source images at each point; spatial and temporal
stability due to the smooth dependence of on-surface feature aggregation on the
target view; and synthesis of view-dependent effects such as specular
reflection. Experimental results demonstrate that SVS outperforms
state-of-the-art view synthesis methods both quantitatively and qualitatively
on three diverse real-world datasets, achieving unprecedented levels of realism
in free-viewpoint video of challenging large-scale scenes. Code is available at
https://github.com/intel-isl/StableViewSynthesis
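The view-dependent on-surface feature aggregation described in the abstract can be made concrete with a short sketch. The PyTorch fragment below is a minimal, hypothetical illustration: it assumes that per-point feature vectors sampled from the encoded source images, the source view rays, and the target view ray have already been computed from the SfM/MVS scaffold, and the module and variable names (e.g. SourceFeatureAggregator, src_feats) are illustrative rather than the identifiers used in the official repository.

```python
# Minimal sketch of view-dependent on-surface feature aggregation in PyTorch.
# Assumes per-point source-view features and ray directions are precomputed
# from the SfM/MVS scaffold; names and architecture are illustrative only and
# do not reproduce the official implementation.
import torch
import torch.nn as nn


class SourceFeatureAggregator(nn.Module):
    """Fuses the features a 3D surface point has in K source views into one
    feature for the ray that maps the point into the target view."""

    def __init__(self, feat_dim: int = 32, hidden_dim: int = 64):
        super().__init__()
        # Scores each source view from (feature, source ray, target ray).
        self.weight_mlp = nn.Sequential(
            nn.Linear(feat_dim + 6, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )
        # Transforms each source feature before the weighted sum.
        self.value_mlp = nn.Sequential(
            nn.Linear(feat_dim + 6, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, feat_dim),
        )

    def forward(self, src_feats, src_dirs, tgt_dir):
        # src_feats: (P, K, F) features of P surface points in K source views
        # src_dirs:  (P, K, 3) unit rays from each point to the source cameras
        # tgt_dir:   (P, 3)    unit ray from each point to the target camera
        tgt = tgt_dir.unsqueeze(1).expand_as(src_dirs)      # (P, K, 3)
        x = torch.cat([src_feats, src_dirs, tgt], dim=-1)   # (P, K, F + 6)
        weights = torch.softmax(self.weight_mlp(x), dim=1)  # (P, K, 1)
        values = self.value_mlp(x)                          # (P, K, F)
        return (weights * values).sum(dim=1)                # (P, F)


if __name__ == "__main__":
    agg = SourceFeatureAggregator()
    feats = agg(torch.randn(1024, 8, 32),  # 1024 points, 8 source views
                nn.functional.normalize(torch.randn(1024, 8, 3), dim=-1),
                nn.functional.normalize(torch.randn(1024, 3), dim=-1))
    print(feats.shape)  # torch.Size([1024, 32])
```

Because the softmax weights in this sketch depend smoothly on the target ray direction, the aggregated per-point features, and hence the rendered image, vary smoothly as the camera moves, which mirrors the spatial and temporal stability property highlighted in the abstract; the aggregated (P, F) features are then scattered into an H x W x F tensor for the target view and decoded by a convolutional rendering network.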
Related papers
- FreeSplat: Generalizable 3D Gaussian Splatting Towards Free-View Synthesis of Indoor Scenes [50.534213038479926]
FreeSplat is capable of reconstructing geometrically consistent 3D scenes from long input sequences for free-view synthesis.
We propose a simple but effective free-view training strategy that ensures robust view synthesis across a broad view range regardless of the number of input views.
arXiv Detail & Related papers (2024-05-28T08:40:14Z) - CVSformer: Cross-View Synthesis Transformer for Semantic Scene Completion [0.0]
We propose Cross-View Synthesis Transformer (CVSformer), which consists of Multi-View Feature Synthesis and Cross-View Transformer for learning cross-view object relationships.
We use the enhanced features to predict the geometric occupancies and semantic labels of all voxels.
We evaluate CVSformer on public datasets, where it yields state-of-the-art results.
arXiv Detail & Related papers (2023-07-16T04:08:03Z) - Enhanced Stable View Synthesis [86.69338893753886]
We introduce an approach that enhances novel view synthesis from images taken by a freely moving camera.
The approach focuses on outdoor scenes, where recovering an accurate geometric scaffold and camera poses is challenging.
arXiv Detail & Related papers (2023-03-30T01:53:14Z) - Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering.
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z) - Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose [10.028521796737314]
We study the problem of learning to estimate the 3D object pose from a few labelled examples and a collection of unlabelled data.
Our main contribution is a learning framework, neural view synthesis and matching, that reliably transfers 3D pose annotations from labelled to unlabelled images.
arXiv Detail & Related papers (2021-10-27T06:53:53Z) - Self-Supervised Visibility Learning for Novel View Synthesis [79.53158728483375]
Conventional rendering methods estimate scene geometry and synthesize novel views in two separate steps.
We propose an end-to-end NVS framework to eliminate the error propagation issue.
Our network is trained in an end-to-end self-supervised fashion, thus significantly alleviating error accumulation in view synthesis.
arXiv Detail & Related papers (2021-03-29T08:11:25Z) - Street-view Panoramic Video Synthesis from a Single Satellite Image [92.26826861266784]
We present a novel method for synthesizing both temporally and geometrically consistent street-view panoramic video.
Existing cross-view synthesis approaches focus mainly on images, while video synthesis in this setting has received little attention.
arXiv Detail & Related papers (2020-12-11T20:22:38Z) - AUTO3D: Novel view synthesis through unsupervisely learned variational viewpoint and global 3D representation [27.163052958878776]
This paper targets learning-based novel view synthesis from a single or a limited number of 2D images without pose supervision.
We construct an end-to-end trainable conditional variational framework to disentangle the unsupervisedly learned relative pose/rotation from the implicit global 3D representation.
Our system achieves implicit 3D understanding without explicit 3D reconstruction.
arXiv Detail & Related papers (2020-07-13T18:51:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.