LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation
- URL: http://arxiv.org/abs/2408.13252v1
- Date: Fri, 23 Aug 2024 17:50:23 GMT
- Title: LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation
- Authors: Shuai Yang, Jing Tan, Mengchen Zhang, Tong Wu, Yixuan Li, Gordon Wetzstein, Ziwei Liu, Dahua Lin
- Abstract summary: 3D immersive scene generation is a challenging yet critical task in computer vision and graphics.
LayerPano3D is a novel framework for full-view, explorable panoramic 3D scene generation from a single text prompt.
- Score: 105.52153675890408
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D immersive scene generation is a challenging yet critical task in computer vision and graphics. A desired virtual 3D scene should 1) exhibit omnidirectional view consistency, and 2) allow for free exploration in complex scene hierarchies. Existing methods either rely on successive scene expansion via inpainting or employ panorama representations to cover large-FOV scene environments. However, the generated scenes suffer from semantic drift during expansion and cannot handle occlusion among scene hierarchies. To tackle these challenges, we introduce LayerPano3D, a novel framework for full-view, explorable panoramic 3D scene generation from a single text prompt. Our key insight is to decompose a reference 2D panorama into multiple layers at different depth levels, where each layer reveals the space unseen from the reference views via a diffusion prior. LayerPano3D comprises multiple dedicated designs: 1) we introduce a novel text-guided anchor-view synthesis pipeline for high-quality, consistent panorama generation; 2) we pioneer the Layered 3D Panorama as the underlying representation to manage complex scene hierarchies, and lift it into 3D Gaussians to splat detailed, 360-degree omnidirectional scenes with unconstrained viewing paths. Extensive experiments demonstrate that our framework generates state-of-the-art 3D panoramic scenes in terms of both full-view consistency and immersive exploratory experience. We believe LayerPano3D holds promise for advancing 3D panoramic scene creation across numerous applications.
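The core geometric step described in the abstract (split a panorama into depth layers, then lift each pixel into 3D for Gaussian initialization) can be illustrated with a minimal sketch. This is a simplification for intuition only, not the authors' implementation: it uses quantile-based layer boundaries instead of content-derived ones, and omits the diffusion inpainting and Gaussian optimization stages; the function name and signature are hypothetical.

```python
import numpy as np

def unproject_equirect(depth, num_layers=3):
    """Split an equirectangular depth map into depth layers and lift each
    pixel to a 3D point on the viewing sphere, scaled by its depth.

    `depth`: (H, W) array of per-pixel metric depths.
    Returns (points, layer_id): points has shape (H, W, 3); layer_id is an
    (H, W) integer map assigning each pixel to one of `num_layers` layers.
    """
    H, W = depth.shape
    # Pixel grid -> spherical angles: longitude in [-pi, pi),
    # latitude in [-pi/2, pi/2] (standard equirectangular mapping).
    x = (np.arange(W) + 0.5) / W
    y = (np.arange(H) + 0.5) / H
    lon = x * 2.0 * np.pi - np.pi          # (W,)
    lat = np.pi / 2.0 - y * np.pi          # (H,)
    lon, lat = np.meshgrid(lon, lat)       # each (H, W)
    # Unit view directions on the sphere, scaled by per-pixel depth.
    dirs = np.stack([np.cos(lat) * np.cos(lon),
                     np.sin(lat),
                     np.cos(lat) * np.sin(lon)], axis=-1)  # (H, W, 3)
    points = depth[..., None] * dirs
    # Assign pixels to depth layers by evenly spaced depth quantiles
    # (the paper derives layers from scene content; this is a stand-in).
    edges = np.quantile(depth, np.linspace(0.0, 1.0, num_layers + 1))
    edges[-1] += 1e-6  # keep the maximum depth inside the last layer
    layer_id = np.clip(np.searchsorted(edges, depth, side="right") - 1,
                       0, num_layers - 1)
    return points, layer_id
```

In the full pipeline, each layer's points would seed 3D Gaussian centers, with the occluded back layers completed by a diffusion prior as the abstract describes.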
Related papers
- WorldPrompter: Traversable Text-to-Scene Generation [18.405299478122693]
We introduce WorldPrompter, a novel generative pipeline for synthesizing traversable 3D scenes from text prompts.
WorldPrompter incorporates a conditional 360° panoramic video generator, capable of producing a 128-frame video that simulates a person walking through and capturing a virtual environment.
The resulting video is then reconstructed as Gaussian splats by a fast feedforward 3D reconstructor, enabling a true walkable experience within the 3D scene.
arXiv Detail & Related papers (2025-04-02T18:04:32Z)
- Scene4U: Hierarchical Layered 3D Scene Reconstruction from Single Panoramic Image for Your Immerse Exploration [18.23983135970619]
We propose a novel layered 3D scene reconstruction framework from a single panoramic image, named Scene4U.
Specifically, Scene4U integrates an open-vocabulary segmentation model with a large language model to decompose a real panorama into multiple layers.
We then employ a layered repair module based on a diffusion model to restore occluded regions using visual cues and depth information, generating a hierarchical representation of the scene.
Scene4U outperforms state-of-the-art methods, improving LPIPS by 24.24% and BRISQUE by 24.40%, while also achieving the fastest training speed.
arXiv Detail & Related papers (2025-04-01T03:17:24Z)
- Splatter-360: Generalizable 360° Gaussian Splatting for Wide-baseline Panoramic Images [52.48351378615057]
Splatter-360 is a novel end-to-end generalizable 3DGS framework for handling wide-baseline panoramic images.
We introduce a 3D-aware bi-projection encoder to mitigate the distortions inherent in panoramic images.
This enables robust 3D-aware feature representations and real-time rendering capabilities.
arXiv Detail & Related papers (2024-12-09T06:58:31Z)
- DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion [60.45000652592418]
We propose a novel text-driven panoramic generation framework, DiffPano, to achieve scalable, consistent, and diverse panoramic scene generation.
We show that DiffPano can generate consistent, diverse panoramic images with given unseen text descriptions and camera poses.
arXiv Detail & Related papers (2024-10-31T17:57:02Z)
- SceneDreamer360: Text-Driven 3D-Consistent Scene Generation with Panoramic Gaussian Splatting [53.32467009064287]
We propose a text-driven 3D-consistent scene generation model: SceneDreamer360.
Our proposed method leverages a text-driven panoramic image generation model as a prior for 3D scene generation.
Our experiments demonstrate that SceneDreamer360, with its panoramic image generation and 3DGS, can produce higher-quality, spatially consistent, and visually appealing 3D scenes from any text prompt.
arXiv Detail & Related papers (2024-08-25T02:56:26Z)
- HoloDreamer: Holistic 3D Panoramic World Generation from Text Descriptions [31.342899807980654]
3D scene generation is in high demand across various domains, including virtual reality, gaming, and the film industry.
We introduce HoloDreamer, a framework that first generates a high-definition panorama as a holistic initialization of the full 3D scene.
We then leverage 3D Gaussian Splatting (3D-GS) to quickly reconstruct the 3D scene, thereby facilitating the creation of view-consistent and fully enclosed 3D scenes.
arXiv Detail & Related papers (2024-07-21T14:52:51Z)
- FastScene: Text-Driven Fast 3D Indoor Scene Generation via Panoramic Gaussian Splatting [15.648080938815879]
We propose FastScene, a framework for fast, high-quality 3D scene generation.
FastScene can generate a 3D scene within a mere 15 minutes, which is at least one hour faster than state-of-the-art methods.
arXiv Detail & Related papers (2024-05-09T13:44:16Z)
- DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting [56.101576795566324]
We present a text-to-3D 360° scene generation pipeline.
Our approach utilizes the generative power of a 2D diffusion model and prompt self-refinement.
Our method offers a globally consistent 3D scene within a 360° perspective.
arXiv Detail & Related papers (2024-04-10T10:46:59Z)
- PERF: Panoramic Neural Radiance Field from a Single Panorama [109.31072618058043]
PERF is a novel view synthesis framework that trains a panoramic neural radiance field from a single panorama.
We propose a novel collaborative RGBD inpainting method and a progressive inpainting-and-erasing method to lift a 360-degree 2D scene to a 3D scene.
Our PERF can be widely used in real-world applications, such as panorama-to-3D, text-to-3D, and 3D scene stylization.
arXiv Detail & Related papers (2023-10-25T17:59:01Z)
- SceneDreamer: Unbounded 3D Scene Generation from 2D Image Collections [49.802462165826554]
We present SceneDreamer, an unconditional generative model for unbounded 3D scenes.
Our framework is learned from in-the-wild 2D image collections only, without any 3D annotations.
arXiv Detail & Related papers (2023-02-02T18:59:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.