DisCo-Layout: Disentangling and Coordinating Semantic and Physical Refinement in a Multi-Agent Framework for 3D Indoor Layout Synthesis
- URL: http://arxiv.org/abs/2510.02178v1
- Date: Thu, 02 Oct 2025 16:30:37 GMT
- Title: DisCo-Layout: Disentangling and Coordinating Semantic and Physical Refinement in a Multi-Agent Framework for 3D Indoor Layout Synthesis
- Authors: Jialin Gao, Donghao Zhou, Mingjian Liang, Lihao Liu, Chi-Wing Fu, Xiaowei Hu, Pheng-Ann Heng,
- Abstract summary: 3D indoor layout synthesis is crucial for creating virtual environments. DisCo-Layout is a novel framework that disentangles and coordinates physical and semantic refinement.
- Score: 76.7196710324494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D indoor layout synthesis is crucial for creating virtual environments. Traditional methods struggle with generalization due to fixed datasets. While recent LLM and VLM-based approaches offer improved semantic richness, they often lack robust and flexible refinement, resulting in suboptimal layouts. We develop DisCo-Layout, a novel framework that disentangles and coordinates physical and semantic refinement. For independent refinement, our Semantic Refinement Tool (SRT) corrects abstract object relationships, while the Physical Refinement Tool (PRT) resolves concrete spatial issues via a grid-matching algorithm. For collaborative refinement, a multi-agent framework intelligently orchestrates these tools, featuring a planner for placement rules, a designer for initial layouts, and an evaluator for assessment. Experiments demonstrate DisCo-Layout's state-of-the-art performance, generating realistic, coherent, and generalizable 3D indoor layouts. Our code will be publicly available.
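The abstract describes a loop in which a designer proposes a layout, a Semantic Refinement Tool (SRT) fixes abstract object relationships, a Physical Refinement Tool (PRT) resolves spatial collisions via grid matching, and an evaluator decides when to stop. The paper's actual tools and agent prompts are not public; the sketch below is a hypothetical, heavily simplified 1D illustration of that coordination pattern, with all function names and rules invented for demonstration.

```python
# Hypothetical sketch of a DisCo-Layout-style refine loop (NOT the authors' code).
# Objects live on a 1D wall segment; the real system works in full 3D rooms.
from dataclasses import dataclass

@dataclass
class Obj:
    name: str
    x: float      # position along the wall
    width: float

def physical_refine(objs, room_width, grid=0.5):
    """Toy 'PRT': snap objects to a grid and push overlapping objects apart."""
    placed, cursor = [], 0.0
    for o in sorted(objs, key=lambda o: o.x):
        x = max(round(o.x / grid) * grid, cursor)   # grid snap, no overlap
        x = min(x, room_width - o.width)            # stay inside the room
        placed.append(Obj(o.name, x, o.width))
        cursor = x + o.width
    return placed

def semantic_refine(objs, rules):
    """Toy 'SRT': enforce abstract relations like ('a', 'left_of', 'b')."""
    for a, rel, b in rules:
        oa = next(o for o in objs if o.name == a)
        ob = next(o for o in objs if o.name == b)
        if rel == "left_of" and oa.x > ob.x:
            oa.x, ob.x = ob.x, oa.x                 # swap to satisfy the relation
    return objs

def evaluate(objs, room_width):
    """Toy evaluator: pass if nothing overlaps or leaves the room."""
    s = sorted(objs, key=lambda o: o.x)
    in_room = all(o.x >= 0 and o.x + o.width <= room_width for o in s)
    no_overlap = all(s[i].x + s[i].width <= s[i + 1].x for i in range(len(s) - 1))
    return in_room and no_overlap

def synthesize(initial, rules, room_width, max_rounds=3):
    """Coordinate semantic and physical refinement until the evaluator accepts."""
    layout = initial
    for _ in range(max_rounds):
        layout = semantic_refine(layout, rules)
        layout = physical_refine(layout, room_width)
        if evaluate(layout, room_width):
            break
    return layout
```

The key design point mirrored here is the disentanglement: semantic fixes (orderings, relations) and physical fixes (collisions, bounds) are separate passes, and an outer loop coordinates them rather than entangling both concerns in one solver.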
Related papers
- RoomEditor++: A Parameter-Sharing Diffusion Architecture for High-Fidelity Furniture Synthesis [89.26382925677301]
Virtual furniture synthesis holds substantial promise for home design and e-commerce applications. RoomEditor++ is a versatile diffusion-based architecture featuring a parameter-sharing dual diffusion backbone. RoomEditor++ is superior to state-of-the-art approaches in terms of quantitative metrics, qualitative assessments, and human preference studies.
arXiv Detail & Related papers (2025-12-19T13:39:43Z) - Light-SQ: Structure-aware Shape Abstraction with Superquadrics for Generated Meshes [60.92139345612904]
We present Light-SQ, a novel superquadric-based optimization framework. We propose a block-regrow-fill strategy guided by structure-aware volumetric decomposition. Experiments demonstrate that Light-SQ enables efficient, high-fidelity, and editable shape abstraction with superquadrics.
arXiv Detail & Related papers (2025-09-29T16:18:32Z) - RoomCraft: Controllable and Complete 3D Indoor Scene Generation [51.19602078504066]
RoomCraft is a multi-stage pipeline that converts real images, sketches, or text descriptions into coherent 3D indoor scenes. Our approach combines a scene generation pipeline with a constraint-driven optimization framework. RoomCraft significantly outperforms existing methods in generating realistic, semantically coherent, and visually appealing room layouts.
arXiv Detail & Related papers (2025-06-27T15:03:17Z) - OptiScene: LLM-driven Indoor Scene Layout Generation via Scaled Human-aligned Data Synthesis and Multi-Stage Preference Optimization [54.60030826635478]
Existing indoor layout generation methods fall into two categories: prompt-driven and learning-based. We present 3D-SynthPlace, a large-scale dataset that combines synthetic layouts generated via a 'GPT synthesize, Human inspect' pipeline. We introduce OptiScene, a strong open-source LLM optimized for indoor layout generation.
arXiv Detail & Related papers (2025-06-09T09:13:06Z) - SceneLCM: End-to-End Layout-Guided Interactive Indoor Scene Generation with Latent Consistency Model [45.648346391757336]
SceneLCM is an end-to-end framework that synergizes a Large Language Model (LLM) for layout design with a Latent Consistency Model (LCM) for scene optimization. SceneLCM supports physical editing by integrating physical simulation, achieving persistent physical realism.
arXiv Detail & Related papers (2025-06-08T11:30:31Z) - Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning [27.872834485482276]
3D indoor scene synthesis is vital for embodied AI and digital content creation. Existing methods fail to generate scenes that are both open-vocabulary and aligned with fine-grained user instructions. We introduce Direct, a framework that directly generates numerical 3D layouts from text descriptions.
arXiv Detail & Related papers (2025-06-05T17:59:42Z) - MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse [5.745502268935752]
We present MetaSpatial, the first reinforcement learning-based framework designed to enhance 3D spatial reasoning in vision-language models (VLMs). Our key innovation is a multi-turn RL-based optimization mechanism that integrates physics-aware constraints and rendered image evaluations, ensuring generated 3D layouts are coherent, physically plausible, and aesthetically consistent.
arXiv Detail & Related papers (2025-03-24T09:18:01Z) - LayoutVLM: Differentiable Optimization of 3D Layout via Vision-Language Models [57.92316645992816]
Spatial reasoning is a fundamental aspect of human cognition, enabling intuitive understanding and manipulation of objects in three-dimensional space. We introduce LayoutVLM, a framework and scene layout representation that exploits the semantic knowledge of Vision-Language Models (VLMs). We demonstrate that fine-tuning VLMs with the proposed scene layout representation extracted from existing scene datasets can improve their reasoning performance.
arXiv Detail & Related papers (2024-12-03T06:15:04Z) - Mixed Diffusion for 3D Indoor Scene Synthesis [55.94569112629208]
We present MiDiffusion, a novel mixed discrete-continuous diffusion model designed to synthesize plausible 3D indoor scenes. We show it outperforms state-of-the-art autoregressive and diffusion models in floor-conditioned 3D scene synthesis.
arXiv Detail & Related papers (2024-05-31T17:54:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.