DStruct2Design: Data and Benchmarks for Data Structure Driven Generative Floor Plan Design
- URL: http://arxiv.org/abs/2407.15723v1
- Date: Mon, 22 Jul 2024 15:27:55 GMT
- Title: DStruct2Design: Data and Benchmarks for Data Structure Driven Generative Floor Plan Design
- Authors: Zhi Hao Luo, Luis Lara, Ge Ya Luo, Florian Golemo, Christopher Beckham, Christopher Pal
- Abstract summary: We construct a new dataset for this data-structure to data-structure formulation of floorplan generation.
We explore the task of floorplan generation given a partial or complete set of constraints.
We demonstrate the feasibility of using floorplan data structure conditioned LLMs for the problem of floorplan generation respecting numerical constraints.
- Score: 5.567585193148804
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text conditioned generative models for images have yielded impressive results. Text conditioned floorplan generation, as a special type of raster image generation task, has also received particular attention. However, there are many use cases in floorplan generation where numerical properties of the generated result are more important than the aesthetics. For instance, one might want to specify sizes for certain rooms in a floorplan and compare the generated floorplan with the given specifications. Current approaches, datasets and commonly used evaluations do not support these kinds of constraints. As such, an attractive strategy is to generate an intermediate data structure that contains numerical properties of a floorplan, which can be used to generate the final floorplan image. To explore this setting we (1) construct a new dataset for this data-structure to data-structure formulation of floorplan generation using two popular image based floorplan datasets, RPLAN and ProcTHOR-10k, and provide the tools to convert further procedurally generated ProcTHOR floorplan data into our format. (2) We explore the task of floorplan generation given a partial or complete set of constraints, and we design a series of metrics and benchmarks to enable evaluating how well samples generated from models respect the constraints. (3) We create multiple baselines by finetuning a large language model (LLM), Llama3, and demonstrate the feasibility of using floorplan data structure conditioned LLMs for the problem of floorplan generation respecting numerical constraints. We hope that our new datasets and benchmarks will encourage further research on different ways to improve the performance of LLMs and other generative modelling techniques for generating designs where quantitative constraints are only partially specified, but must be respected.
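To make the idea of checking a generated floorplan against numerical constraints concrete, here is a minimal Python sketch. It is not the paper's format or metric: the dictionary schema (`rooms`, `id`, `type`, `polygon`), the `area_violations` helper, and the 5% tolerance are all assumptions chosen for illustration. It computes each room's area from its polygon and reports the rooms whose area deviates from a requested value.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def polygon_area(vertices: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def area_violations(
    floorplan: dict, requested: Dict[str, float], tol: float = 0.05
) -> Dict[str, float]:
    """Map room id -> relative area error, for rooms whose error exceeds tol.

    Hypothetical helper: the schema and the tolerance are assumptions,
    not the dataset format or metric defined in the paper.
    """
    rooms = {room["id"]: room for room in floorplan["rooms"]}
    violations: Dict[str, float] = {}
    for room_id, target in requested.items():
        if room_id not in rooms:
            # A requested room that was never generated counts as a violation.
            violations[room_id] = float("inf")
            continue
        actual = polygon_area(rooms[room_id]["polygon"])
        rel_err = abs(actual - target) / target
        if rel_err > tol:
            violations[room_id] = rel_err
    return violations


# Hypothetical generated floorplan: two rectangular rooms, coordinates in metres.
floorplan = {
    "rooms": [
        {"id": "bedroom_1", "type": "bedroom",
         "polygon": [(0, 0), (4, 0), (4, 3), (0, 3)]},
        {"id": "kitchen_1", "type": "kitchen",
         "polygon": [(4, 0), (7, 0), (7, 3), (4, 3)]},
    ]
}

# Requested areas in square metres (a partial or complete constraint set).
requested = {"bedroom_1": 12.0, "kitchen_1": 12.0}

violations = area_violations(floorplan, requested)
print(violations)  # {'kitchen_1': 0.25}
```

Working on a structured representation rather than a raster image is what makes this kind of check a few lines of arithmetic: the bedroom matches its 12 m² request exactly, while the 9 m² kitchen misses its request by 25%.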
Related papers
- PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM [58.67882997399021]
Our research introduces a unified framework for automated graphic layout generation.
Our data-driven method employs structured text (JSON format) and visual instruction tuning to generate layouts.
We conduct extensive experiments and achieve state-of-the-art (SOTA) performance on public multi-modal layout generation benchmarks.
arXiv Detail & Related papers (2024-06-05T03:05:52Z)
- Learning to Plan and Generate Text with Citations [69.56850173097116]
We explore the attribution capabilities of plan-based models which have been recently shown to improve the faithfulness, grounding, and controllability of generated text.
We propose two attribution models that utilize different variants of blueprints, an abstractive model where questions are generated from scratch, and an extractive model where questions are copied from the input.
arXiv Detail & Related papers (2024-04-04T11:27:54Z)
- Dynamic Retrieval-Augmented Generation [4.741884506444161]
We propose a novel approach for Dynamic Retrieval-Augmented Generation (DRAG).
DRAG injects compressed embeddings of the retrieved entities into the generative model.
Our approach achieves several targets: (1) lifting the length limitations of the context window, saving on the prompt size; (2) allowing huge expansion of the number of retrieval entities available for the context; (3) alleviating the problem of misspelling or failing to find relevant entity names.
arXiv Detail & Related papers (2023-12-14T14:26:57Z)
- Embodied Task Planning with Large Language Models [86.63533340293361]
We propose a TAsk Planing Agent (TaPA) in embodied tasks for grounded planning with physical scene constraint.
During inference, we discover the objects in the scene by extending open-vocabulary object detectors to multi-view RGB images collected in different achievable locations.
Experimental results show that the plans generated by our TaPA framework achieve a higher success rate than LLaVA and GPT-3.5 by a sizable margin.
arXiv Detail & Related papers (2023-07-04T17:58:25Z)
- T1: Scaling Diffusion Probabilistic Fields to High-Resolution on Unified Visual Modalities [69.16656086708291]
Diffusion Probabilistic Field (DPF) models the distribution of continuous functions defined over metric spaces.
We propose a new model comprising a view-wise sampling algorithm to focus on local structure learning.
The model can be scaled to generate high-resolution data while unifying multiple modalities.
arXiv Detail & Related papers (2023-05-24T03:32:03Z)
- End-to-end Graph-constrained Vectorized Floorplan Generation with Panoptic Refinement [16.103152098205566]
We aim to synthesize floorplans as sequences of 1-D vectors, which eases user interaction and design customization.
In the first stage, we encode the room connectivity graph input by users with a graph convolutional network (GCN), then apply an autoregressive transformer network to generate an initial floorplan sequence.
To polish the initial design and generate more visually appealing floorplans, we further propose a novel panoptic refinement network (PRN) composed of a GCN and a transformer network.
arXiv Detail & Related papers (2022-07-27T03:19:20Z)
- FloorGenT: Generative Vector Graphic Model of Floor Plans for Robotics [5.71097144710995]
We show that by modelling floor plans as sequences of line segments seen from a particular point of view, recent advances in autoregressive sequence modelling can be leveraged to model and predict floor plans.
arXiv Detail & Related papers (2022-03-07T13:42:48Z)
- Data-to-text Generation with Variational Sequential Planning [74.3955521225497]
We consider the task of data-to-text generation, which aims to create textual output from non-linguistic input.
We propose a neural model enhanced with a planning component responsible for organizing high-level information in a coherent and meaningful way.
We infer latent plans sequentially with a structured variational model, while interleaving the steps of planning and generation.
arXiv Detail & Related papers (2022-02-28T13:17:59Z)
- Few-Shot Table-to-Text Generation with Prototype Memory [14.69889589370148]
We propose a new framework: Prototype-to-Generate (P2G), for table-to-text generation under the few-shot scenario.
The proposed framework utilizes the retrieved prototypes, which are jointly selected by an IR system and a novel prototype selector.
Experimental results on three benchmark datasets with three state-of-the-art models demonstrate that the proposed framework significantly improves the model performance.
arXiv Detail & Related papers (2021-08-27T22:16:30Z)
- Data-to-text Generation with Macro Planning [61.265321323312286]
We propose a neural model with a macro planning stage followed by a generation stage reminiscent of traditional methods.
Our approach outperforms competitive baselines in terms of automatic and human evaluation.
arXiv Detail & Related papers (2021-02-04T16:32:57Z)
- Graph-Based Generative Representation Learning of Semantically and Behaviorally Augmented Floorplans [12.488287536032747]
We present a floorplan embedding technique that uses an attributed graph to represent the geometric information as well as design semantics and behavioral features of the inhabitants as node and edge attributes.
A Long Short-Term Memory (LSTM) Variational Autoencoder (VAE) architecture is proposed and trained to embed attributed graphs as vectors in a continuous space.
A user study is conducted to evaluate the coupling of similar floorplans retrieved from the embedding space with respect to a given input.
arXiv Detail & Related papers (2020-12-08T20:51:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.