RoomCraft: Controllable and Complete 3D Indoor Scene Generation
- URL: http://arxiv.org/abs/2506.22291v1
- Date: Fri, 27 Jun 2025 15:03:17 GMT
- Title: RoomCraft: Controllable and Complete 3D Indoor Scene Generation
- Authors: Mengqi Zhou, Xipeng Wang, Yuxi Wang, Zhaoxiang Zhang,
- Abstract summary: RoomCraft is a multi-stage pipeline that converts real images, sketches, or text descriptions into coherent 3D indoor scenes. Our approach combines a scene generation pipeline with a constraint-driven optimization framework. RoomCraft significantly outperforms existing methods in generating realistic, semantically coherent, and visually appealing room layouts.
- Score: 51.19602078504066
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generating realistic 3D indoor scenes from user inputs remains a challenging problem in computer vision and graphics, requiring careful balance of geometric consistency, spatial relationships, and visual realism. While neural generation methods often produce repetitive elements due to limited global spatial reasoning, procedural approaches can leverage constraints for controllable generation but struggle with multi-constraint scenarios. When constraints become numerous, object collisions frequently occur, forcing the removal of furniture items and compromising layout completeness. To address these limitations, we propose RoomCraft, a multi-stage pipeline that converts real images, sketches, or text descriptions into coherent 3D indoor scenes. Our approach combines a scene generation pipeline with a constraint-driven optimization framework. The pipeline first extracts high-level scene information from user inputs and organizes it into a structured format containing room type, furniture items, and spatial relations. It then constructs a spatial relationship network to represent furniture arrangements and generates an optimized placement sequence using a heuristic-based depth-first search (HDFS) algorithm to ensure layout coherence. To handle complex multi-constraint scenarios, we introduce a unified constraint representation that processes both formal specifications and natural language inputs, enabling flexible constraint-oriented adjustments through a comprehensive action space design. Additionally, we propose a Conflict-Aware Positioning Strategy (CAPS) that dynamically adjusts placement weights to minimize furniture collisions and ensure layout completeness. Extensive experiments demonstrate that RoomCraft significantly outperforms existing methods in generating realistic, semantically coherent, and visually appealing room layouts across diverse input modalities.
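The abstract describes the placement step only at a high level: furniture items form the nodes of a spatial relationship network, and a heuristic-based depth-first search (HDFS) produces a placement sequence. The following minimal Python sketch illustrates one plausible reading of that idea; the class and function names (SpatialGraph, hdfs_placement_order), the anchor item, and the degree-based heuristic are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of an HDFS-style placement ordering over a spatial
# relationship graph. The heuristic and API are assumptions for illustration;
# they do not reproduce RoomCraft's actual algorithm.
from collections import defaultdict


class SpatialGraph:
    """Furniture items as nodes; spatial relations (e.g. 'next_to') as edges."""

    def __init__(self):
        self.edges = defaultdict(list)  # item -> [(neighbor, relation), ...]

    def add_relation(self, a, b, relation):
        self.edges[a].append((b, relation))
        self.edges[b].append((a, relation))


def hdfs_placement_order(graph, anchor):
    """Depth-first traversal from an anchor item (e.g. the bed), expanding the
    most constrained unplaced neighbor first -- a simple stand-in heuristic."""
    order, visited = [], set()

    def visit(item):
        visited.add(item)
        order.append(item)
        # Heuristic: place highly constrained neighbors earlier.
        neighbors = sorted(
            (n for n, _ in graph.edges[item] if n not in visited),
            key=lambda n: -len(graph.edges[n]),
        )
        for n in neighbors:
            if n not in visited:
                visit(n)

    visit(anchor)
    return order


# Example: a small bedroom layout.
g = SpatialGraph()
g.add_relation("bed", "nightstand", "next_to")
g.add_relation("bed", "wardrobe", "facing")
g.add_relation("nightstand", "lamp", "on_top_of")
print(hdfs_placement_order(g, "bed"))  # ['bed', 'nightstand', 'lamp', 'wardrobe']
```

In this reading, traversal order determines placement order, so items with many spatial constraints are positioned early while the layout is still uncluttered; the Conflict-Aware Positioning Strategy (CAPS) would then adjust placement weights when later items would otherwise collide.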
Related papers
- Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning [27.872834485482276]
3D indoor scene synthesis is vital for embodied AI and digital content creation. Existing methods fail to generate scenes that are both open-vocabulary and aligned with fine-grained user instructions. We introduce Direct, a framework that directly generates numerical 3D layouts from text descriptions.
arXiv Detail & Related papers (2025-06-05T17:59:42Z) - HOSIG: Full-Body Human-Object-Scene Interaction Generation with Hierarchical Scene Perception [57.37135310143126]
HOSIG is a novel framework for synthesizing full-body interactions through hierarchical scene perception. Our framework supports unlimited motion length through autoregressive generation and requires minimal manual intervention. This work bridges the critical gap between scene-aware navigation and dexterous object manipulation.
arXiv Detail & Related papers (2025-06-02T12:08:08Z) - HiScene: Creating Hierarchical 3D Scenes with Isometric View Generation [50.206100327643284]
HiScene is a novel hierarchical framework that bridges the gap between 2D image generation and 3D object generation. We generate 3D content that aligns with 2D representations while maintaining compositional structure.
arXiv Detail & Related papers (2025-04-17T16:33:39Z) - DeBaRA: Denoising-Based 3D Room Arrangement Generation [22.96293773013579]
We introduce DeBaRA, a score-based model specifically tailored for precise, controllable and flexible arrangement generation in a bounded environment.
We demonstrate that by focusing on spatial attributes of objects, a single trained DeBaRA model can be leveraged at test time to perform several downstream applications such as scene synthesis, completion and re-arrangement.
arXiv Detail & Related papers (2024-09-26T23:18:25Z) - Mixed Diffusion for 3D Indoor Scene Synthesis [55.94569112629208]
We present MiDiffusion, a novel mixed discrete-continuous diffusion model designed to synthesize plausible 3D indoor scenes. We show it outperforms state-of-the-art autoregressive and diffusion models in floor-conditioned 3D scene synthesis.
arXiv Detail & Related papers (2024-05-31T17:54:52Z) - Style-Consistent 3D Indoor Scene Synthesis with Decoupled Objects [84.45345829270626]
Controllable 3D indoor scene synthesis stands at the forefront of technological progress.
Current methods for scene stylization are limited to applying styles to the entire scene.
We introduce a unique pipeline designed for synthesizing 3D indoor scenes.
arXiv Detail & Related papers (2024-01-24T03:10:36Z) - CommonScenes: Generating Commonsense 3D Indoor Scenes with Scene Graph Diffusion [83.30168660888913]
We present CommonScenes, a fully generative model that converts scene graphs into corresponding controllable 3D scenes.
Our pipeline consists of two branches, one predicting the overall scene layout via a variational auto-encoder and the other generating compatible shapes.
The generated scenes can be manipulated by editing the input scene graph and sampling the noise in the diffusion model.
arXiv Detail & Related papers (2023-05-25T17:39:13Z)