Beyond the Frame: Generating 360° Panoramic Videos from Perspective Videos
- URL: http://arxiv.org/abs/2504.07940v2
- Date: Thu, 17 Apr 2025 14:35:15 GMT
- Title: Beyond the Frame: Generating 360° Panoramic Videos from Perspective Videos
- Authors: Rundong Luo, Matthew Wallingford, Ali Farhadi, Noah Snavely, Wei-Chiu Ma
- Abstract summary: 360° videos offer a more complete perspective of our surroundings. Existing video models excel at producing standard videos, but their ability to generate full panoramic videos remains elusive. We develop a high-quality data filtering pipeline to curate pairwise training data and improve the quality of 360° video generation. Experimental results demonstrate that our model can generate realistic and coherent 360° videos from in-the-wild perspective video.
- Score: 64.10180665546237
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 360° videos have emerged as a promising medium to represent our dynamic visual world. Compared to the "tunnel vision" of standard cameras, their borderless field of view offers a more complete perspective of our surroundings. While existing video models excel at producing standard videos, their ability to generate full panoramic videos remains elusive. In this paper, we investigate the task of video-to-360° generation: given a perspective video as input, our goal is to generate a full panoramic video that is consistent with the original video. Unlike conventional video generation tasks, the output's field of view is significantly larger, and the model is required to have a deep understanding of both the spatial layout of the scene and the dynamics of objects to maintain spatio-temporal consistency. To address these challenges, we first leverage the abundant 360° videos available online and develop a high-quality data filtering pipeline to curate pairwise training data. We then carefully design a series of geometry- and motion-aware operations to facilitate the learning process and improve the quality of 360° video generation. Experimental results demonstrate that our model can generate realistic and coherent 360° videos from in-the-wild perspective video. In addition, we showcase its potential applications, including video stabilization, camera viewpoint control, and interactive visual question answering.
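The geometric core of the task is relating each perspective frame to the equirectangular panorama it must remain consistent with. The paper's implementation is not included here, so the following is only a minimal sketch of the standard perspective-to-equirectangular warp, assuming a pinhole camera with a known horizontal field of view looking along the panorama's forward axis; the function name `perspective_to_equirect` and its defaults are illustrative, not the authors' code.

```python
import numpy as np
import cv2


def perspective_to_equirect(frame, h_fov_deg=90.0, out_h=512, out_w=1024):
    """Paste a pinhole-camera frame onto an equirectangular canvas.

    Assumes the camera looks along +z (the panorama's forward direction);
    panorama pixels outside the camera frustum are left black.
    """
    fh, fw = frame.shape[:2]
    fx = (fw / 2.0) / np.tan(np.radians(h_fov_deg) / 2.0)
    fy = fx  # square pixels assumed

    # Longitude/latitude of every equirectangular pixel.
    lon = (np.arange(out_w) + 0.5) / out_w * 2.0 * np.pi - np.pi   # [-pi, pi)
    lat = np.pi / 2.0 - (np.arange(out_h) + 0.5) / out_h * np.pi   # [pi/2, -pi/2)
    lon, lat = np.meshgrid(lon, lat)

    # Unit ray directions (x right, y up, z forward).
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)

    # Project the rays in front of the camera into the perspective image.
    valid = z > 1e-6
    u = np.where(valid, fx * x / np.maximum(z, 1e-6) + fw / 2.0, -1.0)
    v = np.where(valid, -fy * y / np.maximum(z, 1e-6) + fh / 2.0, -1.0)

    canvas = cv2.remap(
        frame, u.astype(np.float32), v.astype(np.float32),
        interpolation=cv2.INTER_LINEAR,
        borderMode=cv2.BORDER_CONSTANT, borderValue=0,
    )
    inside = valid & (u >= 0) & (u < fw) & (v >= 0) & (v < fh)
    canvas[~inside] = 0
    return canvas, inside  # panorama with the input view pasted in, plus its mask
```

The returned mask marks which panorama pixels the input view covers; a generator conditioned on such a canvas only needs to synthesize the unobserved regions.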
Related papers
- VideoPanda: Video Panoramic Diffusion with Multi-view Attention [57.87428280844657]
High resolution panoramic video content is paramount for immersive experiences in Virtual Reality, but is non-trivial to collect as it requires specialized equipment and intricate camera setups.
VideoPanda generates more realistic and coherent 360° panoramas across all input conditions compared to existing methods.
arXiv Detail & Related papers (2025-04-15T16:58:15Z)
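VideoPanda's multi-view attention is not detailed in this summary; the snippet below is only a generic, hypothetical sketch of the underlying idea: flattening tokens from all views into one sequence so self-attention enforces cross-view consistency. The `MultiViewAttention` module, its shapes, and its defaults are assumptions, not the actual VideoPanda architecture.

```python
import torch
import torch.nn as nn


class MultiViewAttention(nn.Module):
    """Self-attention over tokens gathered from all views of the same frame.

    Input: (batch, views, tokens, channels). Merging the view and token axes
    lets every view attend to every other view, the basic mechanism behind
    cross-view consistency in multi-view diffusion models.
    """

    def __init__(self, channels: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, v, n, c = x.shape
        tokens = x.reshape(b, v * n, c)              # merge view and token axes
        h = self.norm(tokens)
        out, _ = self.attn(h, h, h)                  # joint attention across views
        return (tokens + out).reshape(b, v, n, c)    # residual, restore shape


# Toy usage: 4 views, 64 tokens each, 128 channels.
x = torch.randn(2, 4, 64, 128)
y = MultiViewAttention(128)(x)
assert y.shape == x.shape
```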
- WorldPrompter: Traversable Text-to-Scene Generation [18.405299478122693]
We introduce WorldPrompter, a novel generative pipeline for synthesizing traversable 3D scenes from text prompts. WorldPrompter incorporates a conditional 360° panoramic video generator, capable of producing a 128-frame video that simulates a person walking through and capturing a virtual environment. The resulting video is then reconstructed as Gaussian splats by a fast feedforward 3D reconstructor, enabling a true walkable experience within the 3D scene.
arXiv Detail & Related papers (2025-04-02T18:04:32Z)
- T-SVG: Text-Driven Stereoscopic Video Generation [87.62286959918566]
This paper introduces the Text-driven Stereoscopic Video Generation (T-SVG) system. It streamlines video generation by using text prompts to create reference videos. These videos are transformed into 3D point cloud sequences, which are rendered from two perspectives with subtle parallax differences.
arXiv Detail & Related papers (2024-12-12T14:48:46Z)
- From an Image to a Scene: Learning to Imagine the World from a Million 360 Videos [71.22810401256234]
Three-dimensional (3D) understanding of objects and scenes plays a key role in humans' ability to interact with the world. Large-scale synthetic and object-centric 3D datasets have been shown to be effective in training models that have 3D understanding of objects. We introduce 360-1M, a 360° video dataset, and a process for efficiently finding corresponding frames from diverse viewpoints at scale.
arXiv Detail & Related papers (2024-12-10T18:59:44Z)
- Imagine360: Immersive 360 Video Generation from Perspective Anchor [79.97844408255897]
Imagine360 is a perspective-to-360° video generation framework. It learns fine-grained spherical visual and motion patterns from limited 360° video data. It achieves superior graphics quality and motion coherence among state-of-the-art 360° video generation methods.
arXiv Detail & Related papers (2024-12-04T18:50:08Z)
- Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention [62.2447324481159]
Cavia is a novel framework for camera-controllable, multi-view video generation.
Our framework extends the spatial and temporal attention modules, improving both viewpoint and temporal consistency.
Cavia is the first of its kind that allows the user to specify distinct camera motion while obtaining object motion.
arXiv Detail & Related papers (2024-10-14T17:46:32Z)
- SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix [60.48666051245761]
We propose a pose-free and training-free approach for generating 3D stereoscopic videos.
Our method warps a generated monocular video into camera views on a stereoscopic baseline using estimated video depth.
We develop a disocclusion boundary re-injection scheme that further improves the quality of video inpainting.
arXiv Detail & Related papers (2024-06-29T08:33:55Z)
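The warping step described for SVG, shifting pixels of a monocular frame by a depth-dependent disparity to synthesize the second eye, can be sketched as a naive forward warp. The `warp_to_right_eye` function, its baseline and focal defaults, and the nearest-surface occlusion handling are illustrative assumptions; the paper's disocclusion boundary re-injection and inpainting are omitted.

```python
import numpy as np


def warp_to_right_eye(frame, depth, baseline=0.06, focal_px=500.0):
    """Forward-warp a left-eye frame to a right-eye view using per-pixel depth.

    disparity (pixels) = focal_px * baseline / depth, so nearer pixels shift
    further. Disoccluded pixels are left black; a full pipeline would fill
    them with video inpainting.
    """
    h, w = depth.shape
    disparity = focal_px * baseline / np.maximum(depth, 1e-6)

    right = np.zeros_like(frame)
    filled_depth = np.full((h, w), np.inf)

    ys, xs = np.mgrid[0:h, 0:w]
    new_xs = np.round(xs - disparity).astype(int)   # right eye: content shifts left
    valid = (new_xs >= 0) & (new_xs < w)

    # Keep the nearest surface when several source pixels land on one target.
    # A plain Python loop is slow but keeps the sketch readable.
    for y, x_src, x_dst in zip(ys[valid], xs[valid], new_xs[valid]):
        if depth[y, x_src] < filled_depth[y, x_dst]:
            filled_depth[y, x_dst] = depth[y, x_src]
            right[y, x_dst] = frame[y, x_src]
    return right
```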
- 360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model [23.708946172342067]
We propose a pipeline named 360-Degree Video Diffusion model (360DVD) for generating 360-degree panoramic videos.
We introduce a lightweight 360-Adapter accompanied by 360 Enhancement Techniques to transform pre-trained T2V models for panorama video generation.
We also propose a new panorama dataset named WEB360 consisting of panoramic video-text pairs for training 360DVD.
arXiv Detail & Related papers (2024-01-12T13:52:29Z)
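The 360-Adapter's internals are not given in this summary; the sketch below only shows the general adapter recipe such methods build on: a small trainable residual branch added to a frozen pre-trained block, zero-initialized so fine-tuning starts from the unchanged backbone. The `PanoramaAdapter` module and its shapes are assumptions, not 360DVD's actual design.

```python
import torch
import torch.nn as nn


class PanoramaAdapter(nn.Module):
    """Lightweight residual adapter wrapped around a frozen backbone block.

    Only the adapter parameters are trained; the zero-initialized output
    convolution makes the wrapped block behave exactly like the original
    pre-trained block at the start of fine-tuning.
    """

    def __init__(self, block: nn.Module, channels: int, hidden: int = 64):
        super().__init__()
        self.block = block
        for p in self.block.parameters():      # keep the pre-trained T2V block frozen
            p.requires_grad = False
        self.down = nn.Conv2d(channels, hidden, kernel_size=1)
        self.act = nn.SiLU()
        self.up = nn.Conv2d(hidden, channels, kernel_size=1)
        nn.init.zeros_(self.up.weight)         # start as an identity residual
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.block(x) + self.up(self.act(self.down(x)))


# Toy usage: wrap a frozen conv block and train only the adapter parameters.
backbone_block = nn.Conv2d(320, 320, kernel_size=3, padding=1)
layer = PanoramaAdapter(backbone_block, channels=320)
out = layer(torch.randn(1, 320, 32, 64))       # 2:1 feature map, panorama-like
```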