Related papers: Disentangled 3D Scene Generation with Layout Learning

Disentangled 3D Scene Generation with Layout Learning

URL: http://arxiv.org/abs/2402.16936v1
Date: Mon, 26 Feb 2024 18:54:15 GMT
Title: Disentangled 3D Scene Generation with Layout Learning
Authors: Dave Epstein, Ben Poole, Ben Mildenhall, Alexei A. Efros, Aleksander Holynski
Abstract summary: We introduce a method to generate 3D scenes that are disentangled into their component objects. Our key insight is that objects can be discovered by finding parts of a 3D scene that, when rearranged spatially, still produce valid configurations of the same scene. We show that despite its simplicity, our approach successfully generates 3D scenes into individual objects.
Score: 109.03233745767062
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We introduce a method to generate 3D scenes that are disentangled into their component objects. This disentanglement is unsupervised, relying only on the knowledge of a large pretrained text-to-image model. Our key insight is that objects can be discovered by finding parts of a 3D scene that, when rearranged spatially, still produce valid configurations of the same scene. Concretely, our method jointly optimizes multiple NeRFs from scratch - each representing its own object - along with a set of layouts that composite these objects into scenes. We then encourage these composited scenes to be in-distribution according to the image generator. We show that despite its simplicity, our approach successfully generates 3D scenes decomposed into individual objects, enabling new capabilities in text-to-3D content creation. For results and an interactive demo, see our project page at https://dave.ml/layoutlearning/

Related papers

Toward Scene Graph and Layout Guided Complex 3D Scene Generation [31.396230860775415]
We present a novel framework of Scene Graph and Layout Guided 3D Scene Generation (GraLa3D) Given a text prompt describing a complex 3D scene, GraLa3D utilizes LLM to model the scene using a scene graph representation with layout bounding box information. GraLa3D uniquely constructs the scene graph with single-object nodes and composite super-nodes.
arXiv Detail & Related papers (2024-12-29T14:21:03Z)
PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models [63.1432721793683]
We introduce PartGen, a novel approach that generates 3D objects composed of meaningful parts starting from text, an image, or an unstructured 3D object. We evaluate our method on generated and real 3D assets and show that it outperforms segmentation and part-extraction baselines by a large margin.
arXiv Detail & Related papers (2024-12-24T18:59:43Z)
SceneCraft: Layout-Guided 3D Scene Generation [29.713491313796084]
SceneCraft is a novel method for generating detailed indoor scenes that adhere to textual descriptions and spatial layout preferences. Our method significantly outperforms existing approaches in complex indoor scene generation with diverse textures, consistent geometry, and realistic visual quality.
arXiv Detail & Related papers (2024-10-11T17:59:58Z)
Sketch2Scene: Automatic Generation of Interactive 3D Game Scenes from User's Casual Sketches [50.51643519253066]
3D Content Generation is at the heart of many computer graphics applications, including video gaming, film-making, virtual and augmented reality, etc. This paper proposes a novel deep-learning based approach for automatically generating interactive and playable 3D game scenes.
arXiv Detail & Related papers (2024-08-08T16:27:37Z)
GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting [52.150502668874495]
We present GALA3D, generative 3D GAussians with LAyout-guided control, for effective compositional text-to-3D generation. GALA3D is a user-friendly, end-to-end framework for state-of-the-art scene-level 3D content generation and controllable editing.
arXiv Detail & Related papers (2024-02-11T13:40:08Z)
InseRF: Text-Driven Generative Object Insertion in Neural 3D Scenes [86.26588382747184]
We introduce InseRF, a novel method for generative object insertion in the NeRF reconstructions of 3D scenes. Based on a user-provided textual description and a 2D bounding box in a reference viewpoint, InseRF generates new objects in 3D scenes.
arXiv Detail & Related papers (2024-01-10T18:59:53Z)
SceneWiz3D: Towards Text-guided 3D Scene Composition [134.71933134180782]
Existing approaches either leverage large text-to-image models to optimize a 3D representation or train 3D generators on object-centric datasets. We introduce SceneWiz3D, a novel approach to synthesize high-fidelity 3D scenes from text.
arXiv Detail & Related papers (2023-12-13T18:59:30Z)
DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-aware Scene Synthesis [90.32352050266104]
DisCoScene is a 3Daware generative model for high-quality and controllable scene synthesis. It disentangles the whole scene into object-centric generative fields by learning on only 2D images with the global-local discrimination. We demonstrate state-of-the-art performance on many scene datasets, including the challenging outdoor dataset.
arXiv Detail & Related papers (2022-12-22T18:59:59Z)
Indoor Scene Generation from a Collection of Semantic-Segmented Depth Images [18.24156991697044]
We present a method for creating 3D indoor scenes with a generative model learned from semantic-segmented depth images. Given a room with a specified size, our method automatically generates 3D objects in a room from a randomly sampled latent code. Compared to existing methods, our method not only efficiently reduces the workload of modeling and acquiring 3D scenes for training, but also produces better object shapes.
arXiv Detail & Related papers (2021-08-20T06:22:49Z)
Disentangling 3D Prototypical Networks For Few-Shot Concept Learning [29.02523358573336]
We present neural architectures that disentangle RGB-D images into objects' shapes and styles and a map of the background scene. Our networks incorporate architectural biases that reflect the image formation process, 3D geometry of the world scene, and shape-style interplay.
arXiv Detail & Related papers (2020-11-06T14:08:27Z)

This list is automatically generated from the titles and abstracts of the papers in this site.