DiffuScene: Denoising Diffusion Models for Generative Indoor Scene
Synthesis
- URL: http://arxiv.org/abs/2303.14207v2
- Date: Tue, 12 Mar 2024 15:40:08 GMT
- Title: DiffuScene: Denoising Diffusion Models for Generative Indoor Scene
Synthesis
- Authors: Jiapeng Tang, Yinyu Nie, Lev Markhasin, Angela Dai, Justus Thies,
Matthias Nießner
- Abstract summary: We present DiffuScene for indoor 3D scene synthesis based on a novel scene configuration denoising diffusion model.
It generates 3D instance properties stored in an unordered object set and retrieves the most similar geometry for each object configuration.
- Score: 44.521452102413534
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We present DiffuScene for indoor 3D scene synthesis based on a novel scene
configuration denoising diffusion model. It generates 3D instance properties
stored in an unordered object set and retrieves the most similar geometry for
each object configuration, which is characterized as a concatenation of
different attributes, including location, size, orientation, semantics, and
geometry features. We introduce a diffusion network to synthesize a collection
of 3D indoor objects by denoising a set of unordered object attributes.
The unordered parametrization simplifies the approximation of the joint
distribution. The shape feature diffusion facilitates natural object
placements, including symmetries. Our method enables many downstream
applications, including scene completion, scene arrangement, and
text-conditioned scene synthesis. Experiments on the 3D-FRONT dataset show that
our method can synthesize more physically plausible and diverse indoor scenes
than state-of-the-art methods. Extensive ablation studies verify the
effectiveness of our design choices in scene diffusion models.
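
A minimal sketch of the scene parametrization and sampling procedure described in the abstract, not the authors' implementation: each object is a concatenation of location, size, orientation, semantics, and a shape feature; a set denoiser predicts noise over the unordered object set; and meshes are retrieved by nearest shape feature. The attribute widths, the transformer denoiser, the DDPM sampler, and the retrieval database below are illustrative assumptions.

```python
import torch
import torch.nn as nn

N_OBJ, N_CLASS, D_SHAPE = 12, 20, 32          # assumed set size / class vocab / shape-feature width
D_ATTR = 3 + 3 + 1 + N_CLASS + D_SHAPE        # location + size + orientation (yaw, assumed) + semantics + shape feature

class SetDenoiser(nn.Module):
    """Permutation-equivariant denoiser: a transformer encoder without
    positional encodings, so the object set stays unordered (an assumption
    about the network, not the paper's exact architecture)."""
    def __init__(self, d_attr: int, d_model: int = 128):
        super().__init__()
        self.inp = nn.Linear(d_attr + 1, d_model)   # +1 channel for the timestep
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=4)
        self.out = nn.Linear(d_model, d_attr)

    def forward(self, x_t: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # x_t: (B, N_OBJ, D_ATTR) noisy object set, t: (B,) diffusion step
        t_feat = t.float().view(-1, 1, 1).expand(-1, x_t.shape[1], 1) / 1000.0
        h = self.inp(torch.cat([x_t, t_feat], dim=-1))
        return self.out(self.encoder(h))             # predicted noise, same shape as x_t

@torch.no_grad()
def ddpm_step(model, x_t, t, betas):
    """One reverse (denoising) step of a standard DDPM sampler."""
    beta_t = betas[t]
    alpha_t = 1.0 - beta_t
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)[t]
    eps = model(x_t, torch.full((x_t.shape[0],), t, dtype=torch.long))
    mean = (x_t - beta_t / torch.sqrt(1.0 - alpha_bar) * eps) / torch.sqrt(alpha_t)
    noise = torch.randn_like(x_t) if t > 0 else torch.zeros_like(x_t)
    return mean + torch.sqrt(beta_t) * noise

def retrieve_meshes(scene, shape_db):
    """For each generated object, pick the database entry whose shape feature
    is closest (cosine similarity) to the denoised shape feature."""
    feats = scene[:, -D_SHAPE:]                              # (N_OBJ, D_SHAPE)
    sims = (nn.functional.normalize(feats, dim=-1)
            @ nn.functional.normalize(shape_db, dim=-1).T)
    return sims.argmax(dim=-1)                               # retrieved mesh index per object

# Usage: sample a scene from pure noise, then retrieve geometry.
model = SetDenoiser(D_ATTR)
betas = torch.linspace(1e-4, 0.02, 1000)
x = torch.randn(1, N_OBJ, D_ATTR)
for t in reversed(range(1000)):
    x = ddpm_step(model, x, t, betas)
mesh_ids = retrieve_meshes(x[0], torch.randn(500, D_SHAPE))  # 500 = assumed database size
```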
Related papers
- Open-Vocabulary 3D Semantic Segmentation with Text-to-Image Diffusion Models [57.37244894146089]
We propose Diff2Scene, which leverages frozen representations from text-image generative models, along with salient-aware and geometric-aware masks, for open-vocabulary 3D semantic segmentation and visual grounding tasks.
We show that it outperforms competitive baselines and achieves significant improvements over state-of-the-art methods.
arXiv Detail & Related papers (2024-07-18T16:20:56Z)
- Mixed Diffusion for 3D Indoor Scene Synthesis [55.94569112629208]
We present MiDiffusion, a novel mixed discrete-continuous diffusion model architecture.
We represent a scene layout by a 2D floor plan and a set of objects, each defined by its category, location, size, and orientation.
Our experimental results demonstrate that MiDiffusion substantially outperforms state-of-the-art autoregressive and diffusion models in floor-conditioned 3D scene synthesis.
arXiv Detail & Related papers (2024-05-31T17:54:52Z)
- DreamScape: 3D Scene Creation via Gaussian Splatting joint Correlation Modeling [23.06464506261766]
We present DreamScape, a method for creating highly consistent 3D scenes solely from textual descriptions.
Our approach involves a 3D Gaussian Guide for scene representation, consisting of semantic primitives (objects) and their spatial transformations.
A progressive scale control is tailored during local object generation, ensuring that objects of different sizes and densities adapt to the scene.
arXiv Detail & Related papers (2024-04-14T12:13:07Z)
- HOIDiffusion: Generating Realistic 3D Hand-Object Interaction Data [42.49031063635004]
We propose HOIDiffusion for generating realistic and diverse 3D hand-object interaction data.
Our model is a conditional diffusion model that takes both the 3D hand-object geometric structure and text description as inputs for image synthesis.
We adopt the generated 3D data for learning 6D object pose estimation and show its effectiveness in improving perception systems.
arXiv Detail & Related papers (2024-03-18T17:48:31Z)
- InstructScene: Instruction-Driven 3D Indoor Scene Synthesis with Semantic Graph Prior [27.773451301040424]
InstructScene is a novel generative framework that integrates a semantic graph prior and a layout decoder.
We show that the proposed method surpasses existing state-of-the-art approaches by a large margin.
arXiv Detail & Related papers (2024-02-07T10:09:00Z)
- CommonScenes: Generating Commonsense 3D Indoor Scenes with Scene Graph Diffusion [83.30168660888913]
We present CommonScenes, a fully generative model that converts scene graphs into corresponding controllable 3D scenes.
Our pipeline consists of two branches, one predicting the overall scene layout via a variational auto-encoder and the other generating compatible shapes.
The generated scenes can be manipulated by editing the input scene graph and sampling the noise in the diffusion model.
arXiv Detail & Related papers (2023-05-25T17:39:13Z)
- Generative Novel View Synthesis with 3D-Aware Diffusion Models [96.78397108732233]
We present a diffusion-based model for 3D-aware generative novel view synthesis from as few as a single input image.
Our method makes use of existing 2D diffusion backbones but, crucially, incorporates geometry priors in the form of a 3D feature volume.
In addition to generating novel views, our method has the ability to autoregressively synthesize 3D-consistent sequences.
arXiv Detail & Related papers (2023-04-05T17:15:47Z)
- Scene Synthesis via Uncertainty-Driven Attribute Synchronization [52.31834816911887]
This paper introduces a novel neural scene synthesis approach that can capture diverse feature patterns of 3D scenes.
Our method combines the strengths of both neural network-based and conventional scene synthesis approaches.
arXiv Detail & Related papers (2021-08-30T19:45:07Z)