FloorPlan-DeepSeek (FPDS): A multimodal approach to floorplan generation using vector-based next room prediction
- URL: http://arxiv.org/abs/2506.21562v2
- Date: Sat, 02 Aug 2025 18:27:22 GMT
- Title: FloorPlan-DeepSeek (FPDS): A multimodal approach to floorplan generation using vector-based next room prediction
- Authors: Jun Yin, Pengyu Zeng, Jing Zhong, Peilin Li, Miao Zhang, Ran Luo, Shuai Lu,
- Abstract summary: Existing generative models for floor plans are predominantly end-to-end approaches that produce an entire pixel-based layout in a single pass. We propose a novel 'next room prediction' paradigm tailored to architectural floor plan modeling. FPDS demonstrates competitive performance in comparison to diffusion models and Tell2Design in the text-to-floorplan task.
- Score: 25.35768485637194
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the architectural design process, floor plan generation is inherently progressive and iterative. However, existing generative models for floor plans are predominantly end-to-end approaches that produce an entire pixel-based layout in a single pass. This paradigm is often incompatible with the incremental workflows observed in real-world architectural practice. To address this issue, we draw inspiration from the autoregressive 'next token prediction' mechanism commonly used in large language models and propose a novel 'next room prediction' paradigm tailored to architectural floor plan modeling. Experimental evaluation indicates that FPDS achieves competitive performance in comparison to diffusion models and Tell2Design on the text-to-floorplan task, suggesting its potential applicability in supporting future intelligent architectural design.
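To make the paradigm concrete, the minimal sketch below treats each room as a fixed-length vector (type, position, size) and rolls a small transformer decoder forward one room at a time, conditioned on a text embedding. All module names, dimensions, and the stop criterion are illustrative assumptions, not the FPDS implementation.

```python
# Illustrative sketch of autoregressive "next room prediction" (not the
# authors' implementation): each room is a fixed-length vector, e.g.
# [type_id, x, y, w, h], and a transformer decoder predicts the next room
# conditioned on a text embedding plus the rooms generated so far.
import torch
import torch.nn as nn

ROOM_DIM, HIDDEN, MAX_ROOMS = 5, 128, 12

class NextRoomPredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.room_in = nn.Linear(ROOM_DIM, HIDDEN)           # embed room vectors
        layer = nn.TransformerDecoderLayer(d_model=HIDDEN, nhead=4, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=2)
        self.room_out = nn.Linear(HIDDEN, ROOM_DIM)          # predict next room vector
        self.stop_head = nn.Linear(HIDDEN, 1)                # probability the plan is complete

    def forward(self, text_emb, rooms):
        # text_emb: (B, T, HIDDEN) encoded design brief; rooms: (B, N, ROOM_DIM)
        h = self.decoder(self.room_in(rooms), memory=text_emb)
        last = h[:, -1]                                      # hidden state of the latest room
        return self.room_out(last), torch.sigmoid(self.stop_head(last))

@torch.no_grad()
def generate(model, text_emb, first_room):
    rooms = first_room.unsqueeze(1)                          # (B, 1, ROOM_DIM) seed room
    for _ in range(MAX_ROOMS - 1):
        next_room, p_stop = model(text_emb, rooms)
        if p_stop.item() > 0.5:                              # model decides the layout is finished
            break
        rooms = torch.cat([rooms, next_room.unsqueeze(1)], dim=1)
    return rooms                                             # sequence of room vectors = floor plan

model = NextRoomPredictor()
text_emb = torch.randn(1, 16, HIDDEN)                        # stand-in for an encoded text prompt
plan = generate(model, text_emb, first_room=torch.randn(1, ROOM_DIM))
print(plan.shape)
```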
Related papers
- FloorplanMAE:A self-supervised framework for complete floorplan generation from partial inputs [25.35768485637194]
We propose FloorplanMAE, a self-supervised learning framework for restoring incomplete floor plans into complete ones. First, we developed a floor plan reconstruction dataset, FloorplanNet, curated specifically for architectural floor plans. Second, we propose a floor plan reconstruction method based on Masked Autoencoders (MAE), which reconstructs missing parts by masking sections of the floor plan and training a lightweight Vision Transformer (ViT).
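As a rough illustration of the masking idea in this summary (not the FloorplanMAE code), the sketch below hides a random subset of floor-plan patches and trains a tiny ViT-style encoder/decoder to reconstruct only the hidden patches; patch size, depth, and mask ratio are arbitrary placeholders.

```python
# Sketch of the masked-autoencoder idea (illustrative only): split a
# rasterized floor plan into patches, hide a random subset, and train a
# small ViT-style encoder/decoder to reconstruct the hidden patches.
import torch
import torch.nn as nn

PATCH, GRID, DIM = 16, 8, 128            # 8x8 grid of 16x16 patches = 128x128 plan image
N = GRID * GRID

class TinyFloorplanMAE(nn.Module):
    def __init__(self, mask_ratio=0.75):
        super().__init__()
        self.mask_ratio = mask_ratio
        self.embed = nn.Linear(PATCH * PATCH, DIM)
        layer = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.decoder = nn.TransformerEncoder(layer, num_layers=1)     # lightweight decoder
        self.mask_token = nn.Parameter(torch.zeros(1, 1, DIM))
        self.head = nn.Linear(DIM, PATCH * PATCH)

    def forward(self, patches):
        # patches: (B, N, PATCH*PATCH) flattened floor-plan patches
        B = patches.size(0)
        keep = int(N * (1 - self.mask_ratio))
        idx = torch.rand(B, N).argsort(dim=1)                         # random permutation per sample
        visible_idx, masked_idx = idx[:, :keep], idx[:, keep:]
        visible = torch.gather(patches, 1, visible_idx.unsqueeze(-1).expand(-1, -1, PATCH * PATCH))
        latent = self.encoder(self.embed(visible))
        # re-assemble the full sequence with mask tokens at hidden positions
        full = self.mask_token.expand(B, N, DIM).clone()
        full.scatter_(1, visible_idx.unsqueeze(-1).expand(-1, -1, DIM), latent)
        recon = self.head(self.decoder(full))
        # loss is computed only on the masked patches
        target = torch.gather(patches, 1, masked_idx.unsqueeze(-1).expand(-1, -1, PATCH * PATCH))
        pred = torch.gather(recon, 1, masked_idx.unsqueeze(-1).expand(-1, -1, PATCH * PATCH))
        return nn.functional.mse_loss(pred, target)

plans = torch.rand(2, N, PATCH * PATCH)   # stand-in for binary wall/room masks
print(TinyFloorplanMAE()(plans).item())
```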
arXiv Detail & Related papers (2025-06-10T02:22:05Z)
- HyperTree Planning: Enhancing LLM Reasoning via Hierarchical Thinking [109.09735490692202]
We propose HyperTree Planning (HTP), a novel reasoning paradigm that constructs hypertree-structured planning outlines for effective planning. Experiments demonstrate the effectiveness of HTP, achieving state-of-the-art accuracy on the TravelPlanner benchmark with Gemini-1.5-Pro, resulting in a 3.6 times performance improvement over o1-preview.
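For intuition only, the toy structure below shows one way a hypertree-style planning outline could be represented, where each task node carries several alternative decompositions; this is a simplified reading of the summary, not the paper's formalism.

```python
# Toy illustration (not the paper's formalism) of a hypertree-style planning
# outline: each task node can expand into several alternative groups of
# subtasks, and a plan is obtained by choosing one group per expanded node.
from dataclasses import dataclass, field

@dataclass
class TaskNode:
    goal: str
    # each element is one alternative decomposition: a list of child subtasks
    expansions: list = field(default_factory=list)

def linearize(node, choose=lambda options: options[0]):
    """Flatten one branch of the hypertree into an ordered task list."""
    if not node.expansions:
        return [node.goal]
    plan = []
    for child in choose(node.expansions):
        plan.extend(linearize(child, choose))
    return plan

trip = TaskNode("plan a 3-day trip", [
    [TaskNode("book transport"), TaskNode("book hotel"), TaskNode("draft itinerary")],
    [TaskNode("join a package tour")],          # an alternative decomposition
])
print(linearize(trip))
```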
arXiv Detail & Related papers (2025-05-05T02:38:58Z)
- Latent Diffusion Planning for Imitation Learning [78.56207566743154]
Latent Diffusion Planning (LDP) is a modular approach consisting of a planner and an inverse dynamics model. By separating planning from action prediction, LDP can benefit from the denser supervision signals of suboptimal and action-free data. On simulated visual robotic manipulation tasks, LDP outperforms state-of-the-art imitation learning approaches.
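The sketch below mirrors the modular split described here, with a simple recurrent network standing in for the latent planner (the paper uses diffusion) and a small MLP as the inverse dynamics model; shapes and modules are assumptions for illustration.

```python
# Illustrative split (not the authors' code): a planner proposes a short
# horizon of future latent states, and a separate inverse-dynamics model
# turns consecutive states into actions, so action-free data can still
# supervise the planner.
import torch
import torch.nn as nn

LATENT, ACTION, HORIZON = 32, 4, 8

planner = nn.GRU(LATENT, LATENT, batch_first=True)          # stand-in for the latent planner
inverse_dynamics = nn.Sequential(                           # maps (s_t, s_{t+1}) -> a_t
    nn.Linear(2 * LATENT, 64), nn.ReLU(), nn.Linear(64, ACTION))

def plan_and_act(current_latent):
    # roll the planner forward to produce a latent state trajectory
    seed = current_latent.unsqueeze(1).repeat(1, HORIZON, 1)
    states, _ = planner(seed)                                # (B, HORIZON, LATENT)
    states = torch.cat([current_latent.unsqueeze(1), states], dim=1)
    # decode actions from adjacent latent state pairs
    pairs = torch.cat([states[:, :-1], states[:, 1:]], dim=-1)
    return inverse_dynamics(pairs)                           # (B, HORIZON, ACTION)

print(plan_and_act(torch.randn(2, LATENT)).shape)            # torch.Size([2, 8, 4])
```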
arXiv Detail & Related papers (2025-04-23T17:53:34Z)
- HouseTune: Two-Stage Floorplan Generation with LLM Assistance [10.558141230827847]
This paper proposes a two-stage text-to-floorplan framework that combines the reasoning capability of Large Language Models with the generative power of diffusion models. Experimental results demonstrate that our approach achieves state-of-the-art performance across all metrics, which validates its effectiveness in practical home design applications.
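Schematically, the two stages could be wired as below, with both stages stubbed out; the function names and the structured room-list format are hypothetical, not HouseTune's interface.

```python
# Hypothetical two-stage pipeline in the spirit of the summary above (both
# stage functions are stubs, not the paper's models): stage 1 asks a language
# model to turn the brief into a structured room list, stage 2 hands that
# structure to a generative image model for the final layout.
def stage1_llm_reasoning(brief: str) -> list[dict]:
    # stub: a real system would prompt an LLM and parse its structured output
    return [{"type": "living_room", "area": 24},
            {"type": "bedroom", "area": 12},
            {"type": "bedroom", "area": 10},
            {"type": "kitchen", "area": 8}]

def stage2_diffusion_render(rooms: list[dict]) -> str:
    # stub: a real system would condition a diffusion model on the room list
    return f"rendered layout with {len(rooms)} rooms"

def text_to_floorplan(brief: str) -> str:
    return stage2_diffusion_render(stage1_llm_reasoning(brief))

print(text_to_floorplan("two-bedroom apartment with an open kitchen"))
```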
arXiv Detail & Related papers (2024-11-19T06:57:45Z)
- Planning as In-Painting: A Diffusion-Based Embodied Task Planning Framework for Environments under Uncertainty [56.30846158280031]
Task planning for embodied AI has been one of the most challenging problems.
We propose a task-agnostic method named 'planning as in-painting'.
The proposed framework achieves promising performances in various embodied AI tasks.
arXiv Detail & Related papers (2023-12-02T10:07:17Z)
- Compositional Foundation Models for Hierarchical Planning [52.18904315515153]
We propose a foundation model which leverages expert foundation models, trained individually on language, vision and action data, together to solve long-horizon tasks.
We use a large language model to construct symbolic plans that are grounded in the environment through a large video diffusion model.
Generated video plans are then grounded to visual-motor control through an inverse dynamics model that infers actions from generated videos.
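The three-level hierarchy reads roughly as the pipeline below, with every model replaced by a stub; the subgoals, frame shapes, and 7-DoF action format are placeholders rather than the authors' setup.

```python
# Schematic of the three-level hierarchy, with every model replaced by a stub
# (this is not the authors' code): a language model writes symbolic subgoals,
# a video model imagines each subgoal, and an inverse dynamics model reads
# actions off the imagined frames.
import torch

def llm_symbolic_plan(task: str) -> list[str]:
    return ["pick up the cup", "place it in the sink"]         # stub subgoals

def video_diffusion(subgoal: str) -> torch.Tensor:
    return torch.rand(16, 3, 64, 64)                           # stub: 16 imagined frames

def inverse_dynamics(frames: torch.Tensor) -> torch.Tensor:
    return torch.zeros(frames.shape[0] - 1, 7)                 # stub: one 7-DoF action per transition

def hierarchical_plan(task: str) -> list[torch.Tensor]:
    actions = []
    for subgoal in llm_symbolic_plan(task):
        frames = video_diffusion(subgoal)                      # ground the subgoal as a video plan
        actions.append(inverse_dynamics(frames))               # convert the video into motor commands
    return actions

print([a.shape for a in hierarchical_plan("put the cup in the sink")])
```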
arXiv Detail & Related papers (2023-09-15T17:44:05Z)
- End-to-end Graph-constrained Vectorized Floorplan Generation with Panoptic Refinement [16.103152098205566]
We aim to synthesize floorplans as sequences of 1-D vectors, which eases user interaction and design customization.
In the first stage, we encode the room connectivity graph input by users with a graph convolutional network (GCN), then apply an autoregressive transformer network to generate an initial floorplan sequence.
To polish the initial design and generate more visually appealing floorplans, we further propose a novel panoptic refinement network (PRN) composed of a GCN and a transformer network.
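A minimal sketch of the first stage is given below: a single mean-aggregation graph convolution encodes the connectivity graph and a transformer decoder emits room vectors autoregressively. The panoptic refinement stage is omitted, and all dimensions are illustrative rather than the paper's architecture.

```python
# Minimal sketch (illustrative only) of the two-stage idea: a simple
# graph-convolution step encodes the user's room connectivity graph, and a
# transformer decoder autoregressively emits the floorplan as a sequence of
# 1-D room vectors conditioned on that encoding.
import torch
import torch.nn as nn

NODE_FEAT, HIDDEN, ROOM_VEC = 8, 64, 6

class GraphEncoder(nn.Module):
    """One GCN-style layer: aggregate neighbour features via the adjacency matrix."""
    def __init__(self):
        super().__init__()
        self.lin = nn.Linear(NODE_FEAT, HIDDEN)

    def forward(self, x, adj):
        # x: (B, R, NODE_FEAT) room-type features; adj: (B, R, R) connectivity
        deg = adj.sum(-1, keepdim=True).clamp(min=1)
        return torch.relu(self.lin(adj @ x / deg))              # mean over connected rooms

class SequenceDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Linear(ROOM_VEC, HIDDEN)
        layer = nn.TransformerDecoderLayer(HIDDEN, nhead=4, batch_first=True)
        self.dec = nn.TransformerDecoder(layer, num_layers=2)
        self.out = nn.Linear(HIDDEN, ROOM_VEC)

    def forward(self, prev_rooms, graph_memory):
        return self.out(self.dec(self.embed(prev_rooms), memory=graph_memory))

encoder, decoder = GraphEncoder(), SequenceDecoder()
x, adj = torch.rand(1, 5, NODE_FEAT), torch.eye(5).unsqueeze(0)  # 5 rooms, toy graph
memory = encoder(x, adj)
seq = torch.zeros(1, 1, ROOM_VEC)                                # start token
for _ in range(4):                                               # autoregressive rollout
    nxt = decoder(seq, memory)[:, -1:]
    seq = torch.cat([seq, nxt], dim=1)
print(seq.shape)                                                 # torch.Size([1, 5, 6])
```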
arXiv Detail & Related papers (2022-07-27T03:19:20Z)
- FloorGenT: Generative Vector Graphic Model of Floor Plans for Robotics [5.71097144710995]
We show that by modelling floor plans as sequences of line segments seen from a particular point of view, recent advances in autoregressive sequence modelling can be leveraged to model and predict floor plans.
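Concretely, the representation described here can be sketched as below: wall segments are quantized to grid tokens and a tiny recurrent language model predicts the next token. The tokenization and model are illustrative stand-ins, not FloorGenT's architecture.

```python
# Sketch of the line-segment representation (illustrative only): each wall is
# a line segment, coordinates are quantized to a small grid, and the flat
# token stream is modelled autoregressively, here with a tiny LSTM language
# model predicting the next coordinate token.
import torch
import torch.nn as nn

GRID = 32                                                  # coordinates quantized to a 32x32 grid
segments = [(0.0, 0.0, 1.0, 0.0), (1.0, 0.0, 1.0, 0.6)]    # toy wall segments in [0, 1]^2

def tokenize(segs):
    # flatten (x1, y1, x2, y2) per segment into one stream of grid indices
    return torch.tensor([min(int(v * (GRID - 1)), GRID - 1) for s in segs for v in s])

class SegmentLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(GRID, 64)
        self.rnn = nn.LSTM(64, 64, batch_first=True)
        self.out = nn.Linear(64, GRID)

    def forward(self, tokens):                             # tokens: (B, T)
        h, _ = self.rnn(self.emb(tokens))
        return self.out(h)                                 # next-token logits, (B, T, GRID)

tokens = tokenize(segments).unsqueeze(0)                   # (1, 8)
logits = SegmentLM()(tokens[:, :-1])
loss = nn.functional.cross_entropy(logits.reshape(-1, GRID), tokens[:, 1:].reshape(-1))
print(loss.item())
```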
arXiv Detail & Related papers (2022-03-07T13:42:48Z)
- Forethought and Hindsight in Credit Assignment [62.05690959741223]
We work to understand the gains and peculiarities of planning employed as forethought via forward models or as hindsight operating with backward models.
We investigate the best use of models in planning, primarily focusing on the selection of states in which predictions should be (re)-evaluated.
arXiv Detail & Related papers (2020-10-26T16:00:47Z)
- Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors [124.30562402952319]
The ability to predict and plan into the future is fundamental for agents acting in the world.
Current learning approaches for visual prediction and planning fail on long-horizon tasks.
We propose a framework for visual prediction and planning that is able to overcome both of these limitations.
arXiv Detail & Related papers (2020-06-23T17:58:56Z)