VidPanos: Generative Panoramic Videos from Casual Panning Videos
- URL: http://arxiv.org/abs/2410.13832v2
- Date: Sun, 27 Oct 2024 23:13:38 GMT
- Title: VidPanos: Generative Panoramic Videos from Casual Panning Videos
- Authors: Jingwei Ma, Erika Lu, Roni Paiss, Shiran Zada, Aleksander Holynski, Tali Dekel, Brian Curless, Michael Rubinstein, Forrester Cole,
- Abstract summary: Panoramic image stitching provides a unified, wide-angle view of a scene that extends beyond the camera's field of view.
We present a method for synthesizing a panoramic video from a casually-captured panning video.
Our system can create video panoramas for a range of in-the-wild scenes including people, vehicles, and flowing water.
- Score: 73.77443496436749
- License:
- Abstract: Panoramic image stitching provides a unified, wide-angle view of a scene that extends beyond the camera's field of view. Stitching frames of a panning video into a panoramic photograph is a well-understood problem for stationary scenes, but when objects are moving, a still panorama cannot capture the scene. We present a method for synthesizing a panoramic video from a casually-captured panning video, as if the original video were captured with a wide-angle camera. We pose panorama synthesis as a space-time outpainting problem, where we aim to create a full panoramic video of the same length as the input video. Consistent completion of the space-time volume requires a powerful, realistic prior over video content and motion, for which we adapt generative video models. Existing generative models do not, however, immediately extend to panorama completion, as we show. We instead apply video generation as a component of our panorama synthesis system, and demonstrate how to exploit the strengths of the models while minimizing their limitations. Our system can create video panoramas for a range of in-the-wild scenes including people, vehicles, and flowing water, as well as stationary background features.
Related papers
- DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion [60.45000652592418]
We propose a novel text-driven panoramic generation framework, DiffPano, to achieve scalable, consistent, and diverse panoramic scene generation.
We show that DiffPano can generate consistent, diverse panoramic images with given unseen text descriptions and camera poses.
arXiv Detail & Related papers (2024-10-31T17:57:02Z) - Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention [62.2447324481159]
Cavia is a novel framework for camera-controllable, multi-view video generation.
Our framework extends the spatial and temporal attention modules, improving both viewpoint and temporal consistency.
Cavia is the first of its kind that allows the user to specify distinct camera motion while obtaining object motion.
arXiv Detail & Related papers (2024-10-14T17:46:32Z) - Neural Light Spheres for Implicit Image Stitching and View Synthesis [32.396278546192995]
Spherical neural light field model for implicit panoramic image stitching and re-rendering.
We show improved reconstruction quality over traditional image stitching and radiance field methods.
arXiv Detail & Related papers (2024-09-26T15:05:29Z) - LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation [105.52153675890408]
3D immersive scene generation is a challenging yet critical task in computer vision and graphics.
LayerPano3D is a novel framework for full-view, explorable panoramic 3D scene generation from a single text prompt.
arXiv Detail & Related papers (2024-08-23T17:50:23Z) - PanoVOS: Bridging Non-panoramic and Panoramic Views with Transformer for Video Segmentation [39.269864548255576]
We present a panoramic video dataset, PanoVOS.
The dataset provides 150 videos with high video resolutions and diverse motions.
We present a Panoramic Space Consistency Transformer (PSCFormer) which can effectively utilize the semantic boundary information of the previous frame for pixel-level matching with the current frame.
arXiv Detail & Related papers (2023-09-21T17:59:02Z) - HumanNeRF: Free-viewpoint Rendering of Moving People from Monocular
Video [44.58519508310171]
We introduce a free-viewpoint rendering method -- HumanNeRF -- that works on a given monocular video of a human performing complex body motions.
Our method enables pausing the video at any frame and rendering the subject from arbitrary new camera viewpoints.
arXiv Detail & Related papers (2022-01-11T18:51:21Z) - Sampling Based Scene-Space Video Processing [89.49726406622842]
We present a novel, sampling-based framework for processing video.
It enables high-quality scene-space video effects in the presence of inevitable errors in depth and camera pose estimation.
We present results for various casually captured, hand-held, moving, compressed, monocular videos.
arXiv Detail & Related papers (2021-02-05T05:55:04Z) - Infinite Nature: Perpetual View Generation of Natural Scenes from a
Single Image [73.56631858393148]
We introduce the problem of perpetual view generation -- long-range generation of novel views corresponding to an arbitrarily long camera trajectory given a single image.
We take a hybrid approach that integrates both geometry and image synthesis in an iterative render, refine, and repeat framework.
Our approach can be trained from a set of monocular video sequences without any manual annotation.
arXiv Detail & Related papers (2020-12-17T18:59:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.