Street-view Panoramic Video Synthesis from a Single Satellite Image
- URL: http://arxiv.org/abs/2012.06628v2
- Date: Thu, 17 Dec 2020 08:43:00 GMT
- Title: Street-view Panoramic Video Synthesis from a Single Satellite Image
- Authors: Zuoyue Li, Zhaopeng Cui, Martin R. Oswald, Marc Pollefeys
- Abstract summary: We present a novel method for synthesizing both temporally and geometrically consistent street-view panoramic video.
Existing cross-view synthesis approaches focus mainly on images, and video synthesis in this setting has received little attention.
- Score: 92.26826861266784
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a novel method for synthesizing both temporally and geometrically
consistent street-view panoramic video from a given single satellite image and
camera trajectory. Existing cross-view synthesis approaches focus mainly on
images, and video synthesis in this setting has received little attention.
Single-image synthesis approaches are not well suited for video synthesis
since they lack temporal consistency, which is a crucial property of videos.
To this end, our approach explicitly creates a 3D point cloud
representation of the scene and maintains dense 3D-2D correspondences across
frames that reflect the geometric scene configuration inferred from the
satellite view. We implement a cascaded network architecture with two hourglass
modules for successive coarse and fine generation, colorizing the point cloud
from the semantics and per-class latent vectors. By leveraging the computed
correspondences, the produced street-view video frames adhere to the 3D
geometric scene structure and maintain temporal consistency. Qualitative and
quantitative experiments demonstrate superior results compared to other
state-of-the-art cross-view synthesis approaches that either lack temporal or
geometric consistency. To the best of our knowledge, ours is the first work
to perform cross-view image-to-video synthesis.
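The geometric mechanism in the abstract can be pictured with a short sketch: a point cloud inferred from the satellite view is colorized once, and each panorama frame is obtained by projecting the same points into an equirectangular image at successive camera positions, so pixels in different frames that correspond to one 3D point stay consistent. The sketch below is a minimal illustration under simplifying assumptions (a toy point cloud, fixed per-point colors instead of a learned colorization network, a nearest-point z-buffer); it is not the authors' implementation.

```python
# Minimal sketch: render equirectangular panorama frames by reprojecting one
# shared, colored 3D point cloud at each camera position on a trajectory.
import numpy as np

def project_to_panorama(points, colors, cam_pos, height=256, width=512):
    """Project colored 3D points into one equirectangular panorama frame."""
    rel = points - cam_pos                        # points relative to the camera
    dist = np.linalg.norm(rel, axis=1) + 1e-8
    lon = np.arctan2(rel[:, 1], rel[:, 0])        # azimuth in [-pi, pi]
    lat = np.arcsin(np.clip(rel[:, 2] / dist, -1.0, 1.0))  # elevation
    u = ((lon + np.pi) / (2.0 * np.pi) * (width - 1)).astype(int)
    v = ((np.pi / 2.0 - lat) / np.pi * (height - 1)).astype(int)

    frame = np.zeros((height, width, 3), dtype=np.float32)
    zbuf = np.full((height, width), np.inf)
    for i in range(points.shape[0]):              # simple z-buffer: nearest point wins
        if dist[i] < zbuf[v[i], u[i]]:
            zbuf[v[i], u[i]] = dist[i]
            frame[v[i], u[i]] = colors[i]
    return frame

# Toy usage: the SAME colored points rendered at successive camera positions,
# so corresponding pixels across frames come from one shared 3D point.
pts = np.random.rand(10000, 3) * 50.0             # placeholder scene geometry
cols = np.random.rand(10000, 3)                   # placeholder per-point colors
trajectory = [np.array([x, 25.0, 1.6]) for x in np.linspace(5.0, 45.0, 8)]
frames = [project_to_panorama(pts, cols, cam) for cam in trajectory]
```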
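The colorization itself is described as a cascaded network with two hourglass modules driven by semantics and per-class latent vectors. The PyTorch sketch below is a rough, hypothetical rendering of that coarse-to-fine idea; the channel sizes, module layout, and the way latents are broadcast onto the semantic map are assumptions, not the paper's exact architecture.

```python
# Rough sketch of a cascaded coarse-to-fine colorizer with two hourglass stages.
import torch
import torch.nn as nn

class Hourglass(nn.Module):
    """One encoder-decoder ('hourglass') stage."""
    def __init__(self, in_ch, out_ch, mid_ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch * 2, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(mid_ch * 2, mid_ch, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(mid_ch, out_ch, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.net(x)

class CascadedColorizer(nn.Module):
    """Coarse stage predicts an initial RGB image; fine stage refines it."""
    def __init__(self, num_classes, latent_dim=8):
        super().__init__()
        in_ch = num_classes + latent_dim          # one-hot semantics + broadcast latent
        self.coarse = Hourglass(in_ch, 3)
        self.fine = Hourglass(in_ch + 3, 3)       # fine stage also sees the coarse output

    def forward(self, semantics, class_latents):
        # semantics: (B, K, H, W) one-hot labels; class_latents: (B, K, latent_dim)
        # Spread each class's latent vector over the pixels of that class.
        per_pixel_latent = torch.einsum('bkhw,bkd->bdhw', semantics, class_latents)
        x = torch.cat([semantics, per_pixel_latent], dim=1)
        coarse_rgb = self.coarse(x)
        fine_rgb = self.fine(torch.cat([x, coarse_rgb], dim=1))
        return coarse_rgb, fine_rgb

# Toy usage with placeholder shapes.
model = CascadedColorizer(num_classes=5)
sem = torch.zeros(1, 5, 64, 128); sem[:, 0] = 1.0  # every pixel labeled class 0
lat = torch.randn(1, 5, 8)
coarse, fine = model(sem, lat)
```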
Related papers
- MultiDiff: Consistent Novel View Synthesis from a Single Image [60.04215655745264]
MultiDiff is a novel approach for consistent novel view synthesis of scenes from a single RGB image.
Our results demonstrate that MultiDiff outperforms state-of-the-art methods on the challenging, real-world datasets RealEstate10K and ScanNet.
arXiv Detail & Related papers (2024-06-26T17:53:51Z)
- Learning to Render Novel Views from Wide-Baseline Stereo Pairs [26.528667940013598]
We introduce a method for novel view synthesis given only a single wide-baseline stereo image pair.
Existing approaches to novel view synthesis from sparse observations fail because they recover incorrect 3D geometry.
We propose an efficient, image-space epipolar line sampling scheme to assemble image features for a target ray.
arXiv Detail & Related papers (2023-04-17T17:40:52Z)
- WALDO: Future Video Synthesis using Object Layer Decomposition and Parametric Flow Prediction [82.79642869586587]
WALDO is a novel approach to the prediction of future video frames from past ones.
Individual images are decomposed into multiple layers combining object masks and a small set of control points.
The layer structure is shared across all frames in each video to build dense inter-frame connections.
arXiv Detail & Related papers (2022-11-25T18:59:46Z)
- Condensing a Sequence to One Informative Frame for Video Recognition [113.3056598548736]
This paper studies a two-step alternative that first condenses the video sequence into an informative "frame" and then performs recognition on that synthetic frame.
A key question is how to define "useful information" and how to distill it from a sequence down to one synthetic frame.
IFS consistently demonstrates clear improvements on both image-based 2D networks and clip-based 3D networks.
arXiv Detail & Related papers (2022-01-11T16:13:43Z)
- Geometry-Guided Street-View Panorama Synthesis from Satellite Imagery [80.6282101835164]
We present a new approach for synthesizing a novel street-view panorama given an overhead satellite image.
Our method generates an omnidirectional, Google Street View-style panorama, as if it were captured from the same geographical location as the center of the satellite patch.
arXiv Detail & Related papers (2021-03-02T10:27:05Z)
- Deep View Synthesis via Self-Consistent Generative Network [41.34461086700849]
View synthesis aims to produce unseen views from a set of views captured by two or more cameras at different positions.
Most existing methods seek to exploit geometric information to match pixels across views.
We propose a novel deep generative model, called Self-Consistent Generative Network (SCGN), which synthesizes novel views without explicitly exploiting the geometric information.
arXiv Detail & Related papers (2021-01-19T10:56:00Z)
- Stable View Synthesis [100.86844680362196]
We present Stable View Synthesis (SVS).
Given a set of source images depicting a scene from freely distributed viewpoints, SVS synthesizes new views of the scene.
SVS outperforms state-of-the-art view synthesis methods both quantitatively and qualitatively on three diverse real-world datasets.
arXiv Detail & Related papers (2020-11-14T07:24:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.