Imagine360: Immersive 360 Video Generation from Perspective Anchor
- URL: http://arxiv.org/abs/2412.03552v1
- Date: Wed, 04 Dec 2024 18:50:08 GMT
- Title: Imagine360: Immersive 360 Video Generation from Perspective Anchor
- Authors: Jing Tan, Shuai Yang, Tong Wu, Jingwen He, Yuwei Guo, Ziwei Liu, Dahua Lin
- Abstract summary: Imagine360 is a perspective-to-$360^\circ$ video generation framework.
It learns fine-grained spherical visual and motion patterns from limited $360^\circ$ video data.
It achieves superior graphics quality and motion coherence among state-of-the-art $360^\circ$ video generation methods.
- Score: 79.97844408255897
- Abstract: $360^\circ$ videos offer a hyper-immersive experience that allows viewers to explore a dynamic scene from all 360 degrees. To achieve more user-friendly and personalized content creation in the $360^\circ$ video format, we seek to lift standard perspective videos into $360^\circ$ equirectangular videos. To this end, we introduce Imagine360, the first perspective-to-$360^\circ$ video generation framework that creates high-quality $360^\circ$ videos with rich and diverse motion patterns from video anchors. Imagine360 learns fine-grained spherical visual and motion patterns from limited $360^\circ$ video data through several key designs. 1) First, we adopt a dual-branch design with a perspective and a panorama video denoising branch to provide local and global constraints for $360^\circ$ video generation, with the motion module and spatial LoRA layers fine-tuned on extended web $360^\circ$ videos. 2) Additionally, an antipodal mask is devised to capture long-range motion dependencies, enhancing the reversed camera motion between antipodal pixels across hemispheres. 3) To handle diverse perspective video inputs, we propose elevation-aware designs that adapt to the varying video masking caused by changing elevations across frames. Extensive experiments show that Imagine360 achieves superior graphics quality and motion coherence among state-of-the-art $360^\circ$ video generation methods. We believe Imagine360 holds promise for advancing personalized, immersive $360^\circ$ video creation.
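The antipodal mask in design 2) rests on a simple spherical fact: the antipode of a point at latitude $\phi$ and longitude $\lambda$ sits at $(-\phi, \lambda + 180^\circ)$, so in an equirectangular frame the corresponding pixel is found by shifting half the image width horizontally and mirroring vertically. A minimal Python sketch of this correspondence (an illustrative helper, not the authors' implementation):

```python
import numpy as np

def antipodal_indices(h: int, w: int):
    """For an h x w equirectangular grid, return the (row, col) indices
    of each pixel's antipode: longitude + 180 deg, latitude negated."""
    v, u = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    u_ant = (u + w // 2) % w   # shift longitude by half the width (180 deg)
    v_ant = h - 1 - v          # mirror latitude about the equator
    return v_ant, u_ant

# Example: gather each pixel's antipodal counterpart -- the pixel pairing
# that a long-range attention mask between hemispheres could exploit.
frame = np.random.rand(256, 512, 3)        # h x w x RGB, w = 2h
v_ant, u_ant = antipodal_indices(*frame.shape[:2])
antipodal_view = frame[v_ant, u_ant]
```

Under forward camera motion, content at a pixel and at its antipode moves in opposite image directions, which is the reversed-motion dependency the mask is meant to capture.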
Related papers
- From an Image to a Scene: Learning to Imagine the World from a Million 360 Videos [71.22810401256234]
Three-dimensional (3D) understanding of objects and scenes plays a key role in humans' ability to interact with the world.
Large-scale synthetic and object-centric 3D datasets have been shown to be effective in training models with 3D understanding of objects.
We introduce 360-1M, a 360 video dataset, and a process for efficiently finding corresponding frames from diverse viewpoints at scale.
arXiv Detail & Related papers (2024-12-10T18:59:44Z)
- MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views [90.26609689682876]
We introduce MVSplat360, a feed-forward approach for 360° novel view synthesis (NVS) of diverse real-world scenes using only sparse observations.
This setting is inherently ill-posed due to the minimal overlap among input views and the limited visual information they provide.
Our model is end-to-end trainable and supports rendering arbitrary views with as few as 5 sparse input views.
arXiv Detail & Related papers (2024-11-07T17:59:31Z)
- DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting [56.101576795566324]
We present a text-to-3D $360^\circ$ scene generation pipeline.
Our approach utilizes the generative power of a 2D diffusion model and prompt self-refinement.
Our method offers a globally consistent 3D scene within a $360^\circ$ perspective.
arXiv Detail & Related papers (2024-04-10T10:46:59Z)
- Dream360: Diverse and Immersive Outdoor Virtual Scene Creation via Transformer-Based 360 Image Outpainting [33.95741744421632]
We propose a transformer-based 360 image outpainting framework called Dream360.
It can generate diverse, high-fidelity, and high-resolution panoramas from user-selected viewports.
Our Dream360 achieves significantly lower Fréchet Inception Distance (FID) scores and better visual fidelity than existing methods.
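For context, the FID cited above measures the distance between Gaussian fits to Inception features of real ($r$) and generated ($g$) images, with lower values indicating better fidelity:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
             + \operatorname{Tr}\!\bigl( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \bigr)
```

where $\mu$ and $\Sigma$ are the feature mean and covariance of each distribution.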
arXiv Detail & Related papers (2024-01-19T09:01:20Z)
- 360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model [23.708946172342067]
We propose a pipeline named 360-Degree Video Diffusion model (360DVD) for generating 360-degree panoramic videos.
We introduce a lightweight 360-Adapter accompanied by 360 Enhancement Techniques to adapt pre-trained T2V models for panorama video generation (a generic sketch of the adapter pattern follows below).
We also propose a new panorama dataset named WEB360 consisting of panoramic video-text pairs for training 360DVD.
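The 360-Adapter itself is not detailed in this summary; as a rough sketch of the generic adapter pattern such work builds on (a small trainable residual module attached to frozen pre-trained blocks; the class name and shapes here are illustrative assumptions):

```python
import torch
import torch.nn as nn

class PanoAdapter(nn.Module):
    """Bottleneck adapter appended to a frozen T2V block: a small
    residual MLP that learns panorama-specific corrections."""
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        nn.init.zeros_(self.up.weight)   # zero init: adapter starts as identity
        nn.init.zeros_(self.up.bias)

    def forward(self, x):                # x: (..., dim) block activations
        return x + self.up(torch.relu(self.down(x)))
```

Only the adapter parameters are trained, which keeps the pre-trained T2V weights intact.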
arXiv Detail & Related papers (2024-01-12T13:52:29Z)
- See360: Novel Panoramic View Interpolation [24.965259708297932]
See360 is a versatile and efficient framework for 360° panoramic view interpolation using latent-space viewpoint estimation.
We show that the proposed method is generic enough to achieve real-time rendering of arbitrary views on four datasets.
arXiv Detail & Related papers (2024-01-07T09:17:32Z)
- NeO 360: Neural Fields for Sparse View Synthesis of Outdoor Scenes [59.15910989235392]
We introduce NeO 360, Neural fields for sparse view synthesis of outdoor scenes.
NeO 360 is a generalizable method that reconstructs 360° scenes from a single or a few posed RGB images.
Our representation combines the best of both voxel-based and bird's-eye-view (BEV) representations.
arXiv Detail & Related papers (2023-08-24T17:59:50Z)
- RenderMe-360: A Large Digital Asset Library and Benchmarks Towards High-fidelity Head Avatars [157.82758221794452]
We present RenderMe-360, a comprehensive 4D human head dataset to drive advances in head avatar research.
It contains massive data assets, with over 243 million complete head frames and over 800k video sequences from 500 different identities.
Based on the dataset, we build a comprehensive benchmark for head avatar research, with 16 state-of-the-art methods evaluated on five main tasks.
arXiv Detail & Related papers (2023-05-22T17:54:01Z)
- SHD360: A Benchmark Dataset for Salient Human Detection in 360° Videos [26.263614207849276]
We propose SHD360, the first 360° video SHD dataset, covering various real-life daily scenes.
SHD360 contains 16,238 salient human instances with manually annotated pixel-wise ground truth.
Our proposed dataset and benchmark could serve as a good starting point for advancing human-centric research on 360° panoramic data.
arXiv Detail & Related papers (2021-05-24T23:51:29Z)
- Weakly-Supervised Multi-Person Action Recognition in 360$^{\circ}$ Videos [24.4517195084202]
We address the problem of action recognition in top-view $360^\circ$ videos.
The proposed framework first transforms omnidirectional videos into panoramic videos, then it extracts spatial-temporal features using region-based 3D CNNs for action recognition.
We propose a weakly-supervised method based on multi-instance multi-label learning, which trains the model to recognize and localize multiple actions in a video using only video-level action labels as supervision (a minimal sketch of this idea follows below).
arXiv Detail & Related papers (2020-02-09T02:17:46Z)
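As a rough illustration of the multi-instance multi-label idea in the last entry (an assumed formulation, not the paper's exact model): per-region instance scores are pooled into video-level predictions, so only video-level labels are needed for supervision:

```python
import torch
import torch.nn as nn

class MILVideoHead(nn.Module):
    """Pools per-instance action logits (e.g. from region-based 3D CNN
    features) into video-level multi-label predictions."""
    def __init__(self, feat_dim: int, num_actions: int):
        super().__init__()
        self.classifier = nn.Linear(feat_dim, num_actions)

    def forward(self, inst_feats):                # (num_instances, feat_dim)
        inst_logits = self.classifier(inst_feats)
        video_logits, _ = inst_logits.max(dim=0)  # max-pool over instances
        return video_logits                       # (num_actions,)

# Training uses only video-level labels: multi-label BCE on pooled logits.
head = MILVideoHead(feat_dim=512, num_actions=20)
feats = torch.randn(8, 512)                       # 8 candidate person regions
labels = torch.zeros(20); labels[[3, 7]] = 1.0    # actions present in the video
loss = nn.BCEWithLogitsLoss()(head(feats), labels)
loss.backward()
```

Max pooling is one common multi-instance pooling choice; the instance whose score drives the pooled prediction also provides a coarse localization.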