Tell2Design: A Dataset for Language-Guided Floor Plan Generation
- URL: http://arxiv.org/abs/2311.15941v1
- Date: Mon, 27 Nov 2023 15:49:29 GMT
- Title: Tell2Design: A Dataset for Language-Guided Floor Plan Generation
- Authors: Sicong Leng, Yang Zhou, Mohammed Haroon Dupty, Wee Sun Lee, Sam Conrad
Joyce, Wei Lu
- Abstract summary: We consider the task of generating designs directly from natural language descriptions.
Designs must satisfy different constraints that are not present in generating artistic images.
- Score: 21.686370988228614
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the task of generating designs directly from natural language
descriptions, taking floor plan generation as the initial research area.
Language-conditional generative models have recently been very successful in
generating high-quality artistic images. However, designs must satisfy
different constraints that are not present in generating artistic images,
particularly spatial and relational constraints. We make multiple contributions
to initiate research on this task. First, we introduce a novel dataset,
\textit{Tell2Design} (T2D), which contains more than $80k$ floor plan designs
associated with natural language instructions. Second, we propose a
Sequence-to-Sequence model that can serve as a strong baseline for future
research. Third, we benchmark this task with several text-conditional image
generation models. We conclude by conducting human evaluations on the generated
samples and providing an analysis of human performance. We hope our
contributions will propel the research on language-guided design generation
forward.
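The Seq-to-Seq baseline is only named above; below is a minimal sketch of how such a baseline can be set up, assuming each room is serialized as a type plus bounding-box coordinates and a t5-small checkpoint is fine-tuned on instruction/plan pairs. The serialization format and model choice are illustrative assumptions, not the authors' exact setup.

```python
# Hypothetical Seq-to-Seq floor-plan baseline: instruction text in,
# serialized room boxes out. Format and checkpoint are assumptions.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

instruction = ("The master bedroom is in the north-east corner, "
               "next to a small bathroom.")
# One segment per room: "type x y w h", rooms separated by "|" (assumed format).
target = "master bedroom 120 30 80 60 | bathroom 200 30 40 40"

inputs = tokenizer(instruction, return_tensors="pt")
labels = tokenizer(target, return_tensors="pt").input_ids
loss = model(**inputs, labels=labels).loss  # fine-tuning objective

# At inference time, decode the generated tokens back into boxes to draw the plan.
pred = model.generate(**inputs, max_length=64)
print(tokenizer.decode(pred[0], skip_special_tokens=True))
```

Treating the boxes as plain text is what lets a language model enforce the spatial and relational constraints mentioned in the abstract through ordinary sequence supervision.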
Related papers
- Visual Storytelling with Question-Answer Plans [70.89011289754863]
We present a novel framework which integrates visual representations with pretrained language models and planning.
Our model translates the image sequence into a visual prefix, a sequence of continuous embeddings which language models can interpret.
It also leverages a sequence of question-answer pairs as a blueprint plan for selecting salient visual concepts and determining how they should be assembled into a narrative.
arXiv Detail & Related papers (2023-10-08T21:45:34Z)
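The "visual prefix" in the entry above follows a general pattern: project frozen image features into the language model's embedding space so a text-only LM can attend to them. A minimal sketch of that projection follows; the dimensions and prefix length are assumptions, not the paper's configuration.

```python
# Sketch of a visual-prefix module; dimensions are assumptions, not the paper's.
import torch
import torch.nn as nn

class VisualPrefix(nn.Module):
    """Map one image-feature vector to a short sequence of pseudo-token
    embeddings that a causal LM can consume as a prefix."""

    def __init__(self, image_dim: int = 512, lm_dim: int = 768, prefix_len: int = 10):
        super().__init__()
        self.prefix_len, self.lm_dim = prefix_len, lm_dim
        self.proj = nn.Linear(image_dim, lm_dim * prefix_len)

    def forward(self, image_feats: torch.Tensor) -> torch.Tensor:
        # (batch, image_dim) -> (batch, prefix_len, lm_dim)
        return self.proj(image_feats).view(-1, self.prefix_len, self.lm_dim)

prefix = VisualPrefix()(torch.randn(2, 512))  # e.g. frozen CLIP image features
# Concatenate `prefix` with the LM's token embeddings and feed the result to the
# model via `inputs_embeds`; the question-answer blueprint then guides decoding.
```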
- Visual Programming for Text-to-Image Generation and Evaluation [73.12069620086311]
We propose two novel interpretable/explainable visual programming frameworks for text-to-image (T2I) generation and evaluation.
First, we introduce VPGen, an interpretable step-by-step T2I generation framework that decomposes T2I generation into three steps: object/count generation, layout generation, and image generation.
Second, we introduce VPEval, an interpretable and explainable evaluation framework for T2I generation based on visual programming.
arXiv Detail & Related papers (2023-05-24T16:42:17Z)
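VPGen's three-step decomposition can be pictured as a simple pipeline: parse objects and counts, place bounding boxes, then render. The skeleton below is an illustrative sketch with stubbed steps; the function names, formats, and the naive tiling heuristic are hypothetical, not the authors' implementation.

```python
# Hypothetical skeleton of a step-wise T2I pipeline (objects -> layout -> image).
from dataclasses import dataclass

@dataclass
class Box:
    label: str
    x0: float
    y0: float
    x1: float
    y1: float

def generate_objects(prompt: str) -> list[tuple[str, int]]:
    # Step 1 (stub): an LM would infer object names and counts from the prompt.
    return [("dog", 2), ("frisbee", 1)]

def generate_layout(objects: list[tuple[str, int]]) -> list[Box]:
    # Step 2 (stub): an LM fine-tuned on layout data would place the boxes;
    # here we simply tile them left to right in normalized coordinates.
    boxes, x = [], 0.05
    for label, count in objects:
        for _ in range(count):
            boxes.append(Box(label, x, 0.4, x + 0.25, 0.8))
            x += 0.3
    return boxes

def render(boxes: list[Box]) -> None:
    # Step 3 (stub): a layout-conditioned image generator would render the scene.
    for b in boxes:
        print(f"draw {b.label} at ({b.x0:.2f},{b.y0:.2f})-({b.x1:.2f},{b.y1:.2f})")

render(generate_layout(generate_objects("two dogs playing with a frisbee")))
```

Keeping the intermediate layout explicit is what makes each step inspectable, which is the interpretability claim of the entry above.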
- MOCHA: A Multi-Task Training Approach for Coherent Text Generation from Cognitive Perspective [22.69509556890676]
We propose a novel multi-task training strategy for coherent text generation grounded on the cognitive theory of writing.
We extensively evaluate our model on three open-ended generation tasks including story generation, news article writing and argument generation.
arXiv Detail & Related papers (2022-10-26T11:55:41Z)
- On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization [89.94078728495423]
We show that recent advances in each modality, CLIP image representations and scaling of language models, do not consistently improve multimodal self-rationalization of tasks with multimodal inputs.
Our findings call for a backbone modelling approach that can be built on to advance text generation from images and text beyond image captioning.
arXiv Detail & Related papers (2022-05-24T00:52:40Z)
- A Generative Language Model for Few-shot Aspect-Based Sentiment Analysis [90.24921443175514]
We focus on aspect-based sentiment analysis, which involves extracting aspect terms and categories and predicting their corresponding polarities.
We propose to reformulate the extraction and prediction tasks into the sequence generation task, using a generative language model with unidirectional attention.
Our approach outperforms the previous state-of-the-art (based on BERT) on average performance by a large margin in both few-shot and full-shot settings.
arXiv Detail & Related papers (2022-04-11T18:31:53Z)
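Casting extraction as generation, as in the entry above, amounts to serializing (aspect term, category, polarity) triples into a target string that the language model learns to emit, then parsing the string back at prediction time. A minimal sketch follows; the delimiter format is an assumption, not the paper's exact template.

```python
# Hypothetical serialization for generative ABSA; delimiters are assumptions.
def serialize(triples: list[tuple[str, str, str]]) -> str:
    """Turn (aspect term, category, polarity) triples into one target string."""
    return " ; ".join(f"{term} | {cat} | {pol}" for term, cat, pol in triples)

def deserialize(text: str) -> list[tuple[str, str, str]]:
    """Parse the generated string back into triples, skipping malformed chunks."""
    triples = []
    for chunk in text.split(";"):
        parts = [p.strip() for p in chunk.split("|")]
        if len(parts) == 3:
            triples.append(tuple(parts))
    return triples

example = [("battery life", "laptop#battery", "positive")]
target = serialize(example)  # "battery life | laptop#battery | positive"
assert deserialize(target) == example
```

A unidirectional LM fine-tuned on review-text-to-target pairs then handles extraction and polarity prediction in a single decoding pass.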
- ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation [22.47279425592133]
We propose ERNIE-ViLG, a unified generative pre-training framework for bidirectional image-text generation.
For the text-to-image generation process, we propose an end-to-end training method to jointly learn the visual sequence generator and the image reconstructor.
We train a 10-billion parameter ERNIE-ViLG model on a large-scale dataset of 145 million (Chinese) image-text pairs.
arXiv Detail & Related papers (2021-12-31T03:53:33Z)
- DYPLOC: Dynamic Planning of Content Using Mixed Language Models for Text Generation [10.477090501569284]
We study the task of long-form opinion text generation, which faces at least two distinct challenges.
Existing neural generation models fall short on coherence, which makes efficient content planning necessary.
We propose DYPLOC, a generation framework that conducts dynamic planning of content while generating the output based on a novel design of mixed language models.
arXiv Detail & Related papers (2021-06-01T20:56:10Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
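Dynamic Blocking, named in the entry above, steers decoding away from copying: once the model has just emitted a token that also appears in the source sentence, the token that follows it in the source is forbidden at the next step, forcing a new surface form. The sketch below is a simplified version written as a Hugging Face LogitsProcessor; the published algorithm blocks sampled subsets of such bigrams rather than all of them, so treat this as an approximation.

```python
# Simplified Dynamic Blocking as a logits processor; blocks every source bigram,
# whereas the paper samples which ones to block.
import torch
from transformers import LogitsProcessor

class DynamicBlocking(LogitsProcessor):
    def __init__(self, source_ids: list[int]):
        self.source_ids = list(source_ids)  # token ids of the input sentence

    def __call__(self, input_ids: torch.LongTensor,
                 scores: torch.FloatTensor) -> torch.FloatTensor:
        for b in range(input_ids.shape[0]):
            last = input_ids[b, -1].item()  # token generated at the previous step
            for i in range(len(self.source_ids) - 1):
                if self.source_ids[i] == last:
                    # Forbid the source's next token, so the model must paraphrase.
                    scores[b, self.source_ids[i + 1]] = float("-inf")
        return scores

# Usage: model.generate(..., logits_processor=LogitsProcessorList([DynamicBlocking(src_ids)]))
```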
- Intelligent Home 3D: Automatic 3D-House Design from Linguistic Descriptions Only [55.3363844662966]
We formulate automatic 3D house design as a language-conditioned visual content generation problem, divided into a floor plan generation task and an interior texture synthesis task.
To train and evaluate our model, we build the first Text-to-3D House Model dataset.
arXiv Detail & Related papers (2020-03-01T04:28:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.