WonderWorld: Interactive 3D Scene Generation from a Single Image
- URL: http://arxiv.org/abs/2406.09394v3
- Date: Tue, 10 Sep 2024 17:54:34 GMT
- Title: WonderWorld: Interactive 3D Scene Generation from a Single Image
- Authors: Hong-Xing Yu, Haoyi Duan, Charles Herrmann, William T. Freeman, Jiajun Wu
- Abstract summary: We present WonderWorld, a novel framework for interactive 3D scene generation.
WonderWorld generates connected and diverse 3D scenes in less than 10 seconds on a single A6000 GPU.
- Score: 38.83667648993784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present WonderWorld, a novel framework for interactive 3D scene generation that enables users to interactively specify scene contents and layout and see the created scenes with low latency. The major challenge lies in achieving fast generation of 3D scenes. Existing scene generation approaches fall short in speed, as they often require (1) progressively generating many views and depth maps, and (2) time-consuming optimization of the scene geometry representations. We introduce Fast Layered Gaussian Surfels (FLAGS) as our scene representation and an algorithm to generate it from a single view. Our approach does not need multiple views, and it leverages a geometry-based initialization that significantly reduces optimization time. Another challenge is generating coherent geometry that allows all scenes to be connected. We introduce guided depth diffusion, which allows partial conditioning of depth estimation. WonderWorld generates connected and diverse 3D scenes in less than 10 seconds on a single A6000 GPU, enabling real-time user interaction and exploration. We demonstrate the potential of WonderWorld for user-driven content creation and exploration in virtual environments. We will release full code and software for reproducibility. Project website: https://kovenyu.com/WonderWorld/.
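As a concrete illustration of the partial-conditioning idea in the abstract, below is a minimal sketch of guided depth diffusion in the spirit of RePaint-style inpainting: at each denoising step, the region whose depth is already fixed (e.g., by previously generated scene content visible in the new view) is re-injected at the matching noise level, so the freely estimated region is denoised into agreement with it. The `denoiser` and `scheduler` objects are hypothetical stand-ins (the scheduler loosely mirrors the Hugging Face diffusers API); the paper's actual model, conditioning scheme, and hyperparameters may differ.

```python
import torch

def guided_depth_diffusion(denoiser, scheduler, image_cond,
                           known_depth, known_mask, steps=50):
    """Hypothetical sketch: diffusion-based depth estimation partially
    conditioned on depth that is already known for part of the image.

    known_depth: depth fixed by previously generated scene geometry
    known_mask:  1 where depth is known, 0 where it must be estimated
    """
    x = torch.randn_like(known_depth)  # start the depth map from pure noise
    scheduler.set_timesteps(steps)
    for t in scheduler.timesteps:
        # Partial conditioning: overwrite the known region with the
        # ground-truth depth noised to this step's noise level.
        x_known = scheduler.add_noise(known_depth, torch.randn_like(x), t)
        x = known_mask * x_known + (1.0 - known_mask) * x
        # Predict and remove noise, conditioned on the input image.
        eps = denoiser(x, t, image_cond)
        x = scheduler.step(eps, t, x).prev_sample
    # Pin the known region exactly; return a full, coherent depth map.
    return known_mask * known_depth + (1.0 - known_mask) * x
```

Conditioning at every step, rather than once at the end, is what lets the estimated depth blend smoothly into the known depth, which is the coherence property the abstract attributes to guided depth diffusion.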
Related papers
- Toward Scene Graph and Layout Guided Complex 3D Scene Generation [31.396230860775415]
We present a novel framework, Scene Graph and Layout Guided 3D Scene Generation (GraLa3D).
Given a text prompt describing a complex 3D scene, GraLa3D utilizes an LLM to model the scene as a scene graph representation with layout bounding box information.
GraLa3D uniquely constructs the scene graph with single-object nodes and composite super-nodes.
arXiv Detail & Related papers (2024-12-29T14:21:03Z)
- SceneCraft: Layout-Guided 3D Scene Generation [29.713491313796084]
SceneCraft is a novel method for generating detailed indoor scenes that adhere to textual descriptions and spatial layout preferences.
Our method significantly outperforms existing approaches in complex indoor scene generation with diverse textures, consistent geometry, and realistic visual quality.
arXiv Detail & Related papers (2024-10-11T17:59:58Z)
- SceneDreamer360: Text-Driven 3D-Consistent Scene Generation with Panoramic Gaussian Splatting [53.32467009064287]
We propose a text-driven 3D-consistent scene generation model: SceneDreamer360.
Our proposed method leverages a text-driven panoramic image generation model as a prior for 3D scene generation.
Our experiments demonstrate that SceneDreamer360, with its panoramic image generation and 3DGS, can produce higher-quality, spatially consistent, and visually appealing 3D scenes from any text prompt.
arXiv Detail & Related papers (2024-08-25T02:56:26Z)
- LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation [105.52153675890408]
3D immersive scene generation is a challenging yet critical task in computer vision and graphics.
LayerPano3D is a novel framework for full-view, explorable panoramic 3D scene generation from a single text prompt.
arXiv Detail & Related papers (2024-08-23T17:50:23Z)
- OSN: Infinite Representations of Dynamic 3D Scenes from Monocular Videos [7.616167860385134]
It has long been challenging to recover the underlying dynamic 3D scene representations from a monocular RGB video.
We introduce a new framework, called OSN, to learn all plausible 3D scene configurations that match the input video.
Our method demonstrates a clear advantage in learning fine-grained 3D scene geometry.
arXiv Detail & Related papers (2024-07-08T05:03:46Z)
- FastScene: Text-Driven Fast 3D Indoor Scene Generation via Panoramic Gaussian Splatting [15.648080938815879]
We propose FastScene, a framework for fast and higher-quality 3D scene generation.
FastScene can generate a 3D scene within a mere 15 minutes, which is at least one hour faster than state-of-the-art methods.
arXiv Detail & Related papers (2024-05-09T13:44:16Z)
- Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting [75.7154104065613]
We introduce a novel depth completion model, trained via teacher distillation and self-training to learn the 3D fusion process.
We also introduce a new benchmarking scheme for scene generation methods that is based on ground truth geometry.
arXiv Detail & Related papers (2024-04-30T17:59:40Z)
- DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting [56.101576795566324]
We present a text-to-3D 360° scene generation pipeline.
Our approach utilizes the generative power of a 2D diffusion model and prompt self-refinement.
Our method offers a globally consistent 3D scene within a 360° perspective.
arXiv Detail & Related papers (2024-04-10T10:46:59Z)
- WonderJourney: Going from Anywhere to Everywhere [75.1284367548585]
WonderJourney is a modularized framework for perpetual 3D scene generation.
We generate a journey through a long sequence of diverse yet coherently connected 3D scenes.
We show compelling, diverse visual results across various scene types and styles, forming imaginary "wonderjourneys".
arXiv Detail & Related papers (2023-12-06T20:22:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated summaries (including all information) and is not responsible for any consequences.