Generative View Synthesis: From Single-view Semantics to Novel-view Images
- URL: http://arxiv.org/abs/2008.09106v2
- Date: Fri, 2 Oct 2020 12:09:09 GMT
- Title: Generative View Synthesis: From Single-view Semantics to Novel-view Images
- Authors: Tewodros Habtegebrial, Varun Jampani, Orazio Gallo, Didier Stricker
- Abstract summary: Generative View Synthesis (GVS) can synthesize multiple photorealistic views of a scene given a single semantic map.
We first lift the input 2D semantic map onto a 3D layered representation of the scene in feature space.
We then project the layered features onto the target views to generate the final novel-view images.
- Score: 38.7873192939574
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Content creation, central to applications such as virtual reality, can be
tedious and time-consuming. Recent image synthesis methods simplify this task
by offering tools to generate new views from as little as a single input image,
or by converting a semantic map into a photorealistic image. We propose to push
the envelope further, and introduce Generative View Synthesis (GVS), which can
synthesize multiple photorealistic views of a scene given a single semantic
map. We show that the sequential application of existing techniques, e.g.,
semantics-to-image translation followed by monocular view synthesis, fails to
capture the scene's structure. In contrast, we solve the semantics-to-image
translation in concert with the estimation of the 3D layout of the scene, thus
producing geometrically consistent novel views that preserve semantic
structures. We first lift the input 2D semantic map onto a 3D layered
representation of the scene in feature space, thereby preserving the semantic
labels of 3D geometric structures. We then project the layered features onto
the target views to generate the final novel-view images. We verify the
strengths of our method and compare it with several advanced baselines on three
different datasets. Our approach also allows for style manipulation and image
editing operations, such as the addition or removal of objects, with simple
manipulations of the input style images and semantic maps, respectively. Visit
the project page at https://gvsnet.github.io.
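To make the two-step pipeline concrete, here is a minimal numpy sketch of its geometric half: a semantic map is lifted onto a small stack of fronto-parallel feature layers, and each layer is warped into a target view with its plane-induced homography. The one-hot "lifting", the layer depths, and all names are illustrative assumptions; the paper learns the lifting and the per-layer features end-to-end.

```python
import numpy as np

def lift_semantics(sem_map, num_classes, layer_depths):
    """Toy lifting of a (H, W) semantic map to layered features.

    Stand-in for the learned 2D-to-3D lifting: one-hot class features
    copied onto each depth layer, giving shape (L, H, W, C).
    """
    onehot = np.eye(num_classes, dtype=np.float32)[sem_map]
    return np.repeat(onehot[None], len(layer_depths), axis=0)

def plane_homography(K, R, t, depth):
    """Pixel homography induced by the plane z = depth (source frame),
    for a target camera with pose X_tgt = R @ X_src + t."""
    n = np.array([0.0, 0.0, 1.0])
    return K @ (R + np.outer(t, n) / depth) @ np.linalg.inv(K)

def warp_to_target(layer, H_mat):
    """Backward-warp a (H, W, C) layer into the target view
    (nearest-neighbor sampling for brevity)."""
    H, W, C = layer.shape
    ys, xs = np.mgrid[0:H, 0:W]
    tgt = np.stack([xs, ys, np.ones_like(xs)]).reshape(3, -1).astype(float)
    src = np.linalg.inv(H_mat) @ tgt                 # target -> source pixels
    src = np.rint(src[:2] / src[2]).astype(int)
    ok = (src[0] >= 0) & (src[0] < W) & (src[1] >= 0) & (src[1] < H)
    out = np.zeros_like(layer).reshape(-1, C)
    out[ok] = layer[src[1, ok], src[0, ok]]
    return out.reshape(H, W, C)

# Hypothetical usage: 4 layers, target camera shifted 0.2 units right.
K = np.array([[64.0, 0, 32], [0, 64.0, 32], [0, 0, 1]])
sem = np.random.randint(0, 5, (64, 64))
depths = [1.0, 2.0, 4.0, 8.0]
layers = lift_semantics(sem, num_classes=5, layer_depths=depths)
warped = [warp_to_target(l, plane_homography(K, np.eye(3), np.array([0.2, 0, 0]), d))
          for l, d in zip(layers, depths)]
```

The actual method would then composite the warped feature layers and decode them into an image with a learned renderer, rather than using the warped features directly.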
Related papers
- Neural Groundplans: Persistent Neural Scene Representations from a Single Image [90.04272671464238]
We present a method to map 2D image observations of a scene to a persistent 3D scene representation.
We propose conditional neural groundplans as persistent and memory-efficient scene representations.
arXiv Detail & Related papers (2022-07-22T17:41:24Z)
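The "ground plan" in the entry above can be pictured as pooling image features onto a top-down grid. Below is a minimal sketch under strong assumptions (known depth and intrinsics, simple cell averaging); the paper instead learns this 2D-to-3D mapping end-to-end and keeps the grid as a persistent scene state, so every name and constant here is hypothetical.

```python
import numpy as np

def pixels_to_groundplan(feat, depth, K, grid_size=32, extent=8.0):
    """Scatter per-pixel features onto a top-down (x, z) grid.

    feat: (H, W, C) image features; depth: (H, W) metric depth;
    K: 3x3 intrinsics. Each cell averages the features landing in it.
    Schematic only: the paper learns this mapping without given depth.
    """
    H, W, C = feat.shape
    ys, xs = np.mgrid[0:H, 0:W]
    # Unproject every pixel to a 3D point in the camera frame.
    pts = np.linalg.inv(K) @ np.stack([xs, ys, np.ones_like(xs)]).reshape(3, -1)
    pts = pts * depth.reshape(1, -1)                       # (3, H*W)
    # Map x (right) and z (forward) to grid cells; height y is pooled away.
    ix = ((pts[0] / extent + 0.5) * grid_size).astype(int)
    iz = ((pts[2] / extent) * grid_size).astype(int)
    ok = (ix >= 0) & (ix < grid_size) & (iz >= 0) & (iz < grid_size)
    grid = np.zeros((grid_size, grid_size, C), np.float32)
    cnt = np.zeros((grid_size, grid_size, 1), np.float32)
    np.add.at(grid, (iz[ok], ix[ok]), feat.reshape(-1, C)[ok])
    np.add.at(cnt, (iz[ok], ix[ok]), 1.0)
    return grid / np.maximum(cnt, 1.0)
```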
- DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization [66.25948693095604]
We propose a novel method for panoramic 3D scene understanding which recovers the 3D room layout and the shape, pose, position, and semantic category for each object from a single full-view panorama image.
Experiments demonstrate that our method outperforms existing methods on panoramic scene understanding in terms of both geometry accuracy and object arrangement.
arXiv Detail & Related papers (2021-08-24T13:55:29Z)
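Methods that reason about geometry from a full-view panorama, like the one above, typically start from the standard equirectangular mapping between pixels and viewing rays. A minimal version of that mapping is sketched below; it is background geometry, not code from the paper.

```python
import numpy as np

def equirect_to_rays(width, height):
    """Unit viewing ray for every pixel of an equirectangular panorama.

    u in [0, W) spans longitude [-pi, pi); v in [0, H) spans
    latitude [pi/2, -pi/2] (the top of the image looks up).
    """
    u = (np.arange(width) + 0.5) / width
    v = (np.arange(height) + 0.5) / height
    lon = (u - 0.5) * 2.0 * np.pi                 # longitude per column
    lat = (0.5 - v) * np.pi                       # latitude per row
    lon, lat = np.meshgrid(lon, lat)
    return np.stack([np.cos(lat) * np.sin(lon),   # x: right
                     np.sin(lat),                 # y: up
                     np.cos(lat) * np.cos(lon)],  # z: forward
                    axis=-1)                      # (H, W, 3)
```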
- Realistic Image Synthesis with Configurable 3D Scene Layouts [59.872657806747576]
We propose a novel approach to realistic-looking image synthesis based on a 3D scene layout.
Our approach takes a 3D scene with semantic class labels as input and trains a 3D scene painting network.
With the trained painting network, realistic-looking images for the input 3D scene can be rendered and manipulated.
arXiv Detail & Related papers (2021-08-23T09:44:56Z)
- Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image [26.770326254205223]
We present Worldsheet, a method for novel view synthesis using just a single RGB image as input.
Worldsheet consistently outperforms prior state-of-the-art methods on single-image view synthesis across several datasets.
arXiv Detail & Related papers (2020-12-17T18:59:52Z)
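The "3D sheet" in Worldsheet is a regular grid mesh warped onto the scene; in the paper the vertex positions come from a learned network trained without 3D supervision. The sketch below fakes that step with a given depth map just to show the geometry; the function names, grid size, and direct use of depth are assumptions.

```python
import numpy as np

def depth_to_sheet(depth, K, grid=33):
    """Sample a (grid x grid) vertex sheet from a (H, W) depth map and
    unproject it to 3D, approximating Worldsheet's wrapped mesh."""
    H, W = depth.shape
    vs = np.linspace(0, H - 1, grid).astype(int)
    us = np.linspace(0, W - 1, grid).astype(int)
    uu, vv = np.meshgrid(us, vs)
    pix = np.stack([uu, vv, np.ones_like(uu)]).reshape(3, -1).astype(float)
    rays = np.linalg.inv(K) @ pix                      # rays with z = 1
    return (rays * depth[vv, uu].reshape(1, -1)).T.reshape(grid, grid, 3)

def project_sheet(verts, K, R, t):
    """Project sheet vertices into a target view with pose (R, t)."""
    p = R @ verts.reshape(-1, 3).T + t[:, None]
    p = K @ p
    return (p[:2] / p[2]).T.reshape(*verts.shape[:2], 2)  # (g, g, 2) pixels
```

Rendering the novel view then amounts to rasterizing this mesh with texture sampled from the input image, which the paper does differentiably.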
- Semantic View Synthesis [56.47999473206778]
We tackle a new problem of semantic view synthesis -- generating free-viewpoint rendering of a synthesized scene using a semantic label map as input.
First, we focus on synthesizing the color and depth of the visible surface of the 3D scene.
We then use the synthesized color and depth to impose explicit constraints on the multiple-plane image (MPI) representation prediction process.
arXiv Detail & Related papers (2020-08-24T17:59:46Z)
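One plausible reading of "explicit constraints on the MPI prediction" above is converting the synthesized depth into soft per-plane opacities, so that planes near the predicted surface become opaque. The helper below is a toy version of that idea, not the authors' exact formulation; the log-space Gaussian and its width are assumptions.

```python
import numpy as np

def depth_to_mpi_alpha(depth, plane_depths, sigma=0.05):
    """Soft per-plane alphas from a (H, W) synthesized depth map.

    Each of the D fronto-parallel planes gets high opacity where its
    depth matches the depth map (compared in log space). The result,
    shape (D, H, W), could serve as a target or initialization for the
    predicted MPI alphas.
    """
    d = np.log(depth)[None]                                    # (1, H, W)
    planes = np.log(np.asarray(plane_depths))[:, None, None]   # (D, 1, 1)
    return np.exp(-0.5 * ((d - planes) / sigma) ** 2)
```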
- Continuous Object Representation Networks: Novel View Synthesis without Target View Supervision [26.885846254261626]
Continuous Object Representation Networks (CORN) is a conditional architecture that encodes an input image's geometry and appearance and maps them to a 3D-consistent scene representation.
CORN performs well on challenging tasks such as novel view synthesis and single-view 3D reconstruction, achieving performance comparable to state-of-the-art approaches that use direct supervision.
arXiv Detail & Related papers (2020-07-30T17:49:44Z)
- Single-View View Synthesis with Multiplane Images [64.46556656209769]
Recent work applies deep learning to generate multiplane images given two or more input images at known viewpoints.
Our method learns to predict a multiplane image directly from a single image input.
It additionally generates reasonable depth maps and fills in content behind the edges of foreground objects in background layers.
arXiv Detail & Related papers (2020-04-23T17:59:19Z)
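For context on the multiplane images used by several entries above: however the MPI is predicted, novel views are rendered by warping each RGBA plane into the target camera and compositing back to front with the standard "over" operator. The compositing step alone, with warping omitted and array shapes assumed, looks like this:

```python
import numpy as np

def over_composite(rgbs, alphas):
    """Back-to-front 'over' compositing of MPI planes.

    rgbs:   (D, H, W, 3) plane colors, nearest plane first.
    alphas: (D, H, W, 1) plane opacities in [0, 1].
    Returns the rendered (H, W, 3) image.
    """
    out = np.zeros_like(rgbs[0])
    for rgb, a in zip(rgbs[::-1], alphas[::-1]):  # farthest plane first
        out = a * rgb + (1.0 - a) * out
    return out
```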
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.