LayoutDM: Discrete Diffusion Model for Controllable Layout Generation
- URL: http://arxiv.org/abs/2303.08137v1
- Date: Tue, 14 Mar 2023 17:59:47 GMT
- Title: LayoutDM: Discrete Diffusion Model for Controllable Layout Generation
- Authors: Naoto Inoue, Kotaro Kikuchi, Edgar Simo-Serra, Mayu Otani, Kota
Yamaguchi
- Score: 27.955214767628107
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Controllable layout generation aims at synthesizing plausible arrangements of
element bounding boxes with optional constraints, such as type or position of a
specific element. In this work, we try to solve a broad range of layout
generation tasks in a single model that is based on discrete state-space
diffusion models. Our model, named LayoutDM, naturally handles the structured
layout data in the discrete representation and learns to progressively infer a
noiseless layout from the initial input, where we model the layout corruption
process by modality-wise discrete diffusion. For conditional generation, we
propose to inject layout constraints in the form of masking or logit adjustment
during inference. We show in the experiments that our LayoutDM successfully
generates high-quality layouts and outperforms both task-specific and
task-agnostic baselines on several layout tasks.
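The abstract describes two ways LayoutDM injects constraints at inference time: masking (clamping known tokens to their given values) and logit adjustment (biasing the predicted token distribution). The following is a minimal illustrative sketch of those two modes in a discrete diffusion sampling loop, not the authors' code: the toy denoiser, vocabulary, token layout, and all shapes are assumptions made for the example.

```python
import numpy as np

VOCAB = 8            # toy vocabulary of discrete layout tokens (illustrative)
SEQ_LEN = 5          # toy sequence, e.g. [type, x, y, w, h] for one element
MASK_ID = VOCAB - 1  # special [MASK] token used as the absorbing state

rng = np.random.default_rng(0)

def toy_denoiser(tokens):
    """Stand-in for the learned reverse model: returns per-position logits."""
    logits = rng.normal(size=(SEQ_LEN, VOCAB))
    logits[:, MASK_ID] = -1e9  # never predict the mask token itself
    return logits

def sample_step(tokens, known=None, logit_bias=None):
    """One reverse-diffusion step with optional constraints."""
    logits = toy_denoiser(tokens)
    if logit_bias is not None:          # logit adjustment: soft constraint
        logits = logits + logit_bias
    # softmax over the (possibly adjusted) logits
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    out = np.array([rng.choice(VOCAB, p=p) for p in probs])
    if known is not None:               # masking: hard constraint
        for pos, tok in known.items():
            out[pos] = tok
    return out

tokens = np.full(SEQ_LEN, MASK_ID)      # start from a fully masked layout
known = {0: 2}                          # e.g. element type fixed to token id 2
bias = np.zeros((SEQ_LEN, VOCAB))
bias[1, 3] += 5.0                       # softly prefer token 3 at position 1
for _ in range(4):                      # a few reverse steps
    tokens = sample_step(tokens, known=known, logit_bias=bias)
print(tokens)
```

The hard constraint (masking) is guaranteed to hold in the output, while the logit bias only shifts probability mass, so it steers rather than forces the result.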
Related papers
- CreatiLayout: Siamese Multimodal Diffusion Transformer for Creative Layout-to-Image Generation [75.01950130227996]
Diffusion models have been recognized for their ability to generate images that are not only visually appealing but also of high artistic quality.
Previous methods primarily focus on UNet-based models (e.g., SD1.5 and SDXL), and limited effort has been made to explore Multimodal Diffusion Transformers (MM-DiTs).
Inheriting the advantages of MM-DiT, we use a separate set of network weights to process the image and text modalities.
We contribute a large-scale layout dataset, named LayoutSAM, which includes 2.7 million image-text pairs and 10.7 million entities.
arXiv Detail & Related papers (2024-12-05T04:09:47Z) - Layout-Corrector: Alleviating Layout Sticking Phenomenon in Discrete Diffusion Model [3.8748565070264753]
We present a learning-based module capable of identifying inharmonious elements within layouts, considering overall layout harmony.
The module consistently boosts layout-generation performance when used in conjunction with various state-of-the-art DDMs.
arXiv Detail & Related papers (2024-09-25T07:24:43Z) - LayoutDiT: Exploring Content-Graphic Balance in Layout Generation with Diffusion Transformer [46.67415676699221]
We introduce a framework that balances content and graphic features to generate high-quality, visually appealing layouts.
Specifically, we design an adaptive factor that optimizes the model's awareness of the layout generation space.
We also introduce a graphic condition, the saliency bounding box, to bridge the modality difference between images in the visual domain and layouts in the geometric parameter domain.
arXiv Detail & Related papers (2024-07-21T17:58:21Z) - PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM [58.67882997399021]
Our research introduces a unified framework for automated graphic layout generation.
Our data-driven method employs structured text (JSON format) and visual instruction tuning to generate layouts.
We develop an automated text-to-poster system that generates editable posters based on users' design intentions.
arXiv Detail & Related papers (2024-06-05T03:05:52Z) - Towards Aligned Layout Generation via Diffusion Model with Aesthetic Constraints [53.66698106829144]
We propose LACE, a unified model based on continuous diffusion models, to handle a broad range of layout generation tasks.
Experimental results show that LACE produces high-quality layouts.
arXiv Detail & Related papers (2024-02-07T11:12:41Z) - LayoutDiffusion: Improving Graphic Layout Generation by Discrete Diffusion Probabilistic Models [50.73105631853759]
We present a novel generative model named LayoutDiffusion for automatic layout generation.
It learns to reverse a mild forward process, in which layouts become increasingly chaotic with the growth of forward steps.
It enables two conditional layout generation tasks in a plug-and-play manner without re-training and achieves better performance than existing methods.
arXiv Detail & Related papers (2023-03-21T04:41:02Z) - Unifying Layout Generation with a Decoupled Diffusion Model [26.659337441975143]
Layout generation is a crucial task for reducing the burden of heavy-duty graphic design work for formatted scenes, e.g., publications, documents, and user interfaces (UIs).
We propose a layout Diffusion Generative Model (LDGM) to achieve such unification with a single decoupled diffusion model.
Our proposed LDGM can generate layouts either from scratch or conditioned on arbitrary available attributes.
arXiv Detail & Related papers (2023-03-09T05:53:32Z) - DLT: Conditioned layout generation with Joint Discrete-Continuous Diffusion Layout Transformer [2.0483033421034142]
We introduce DLT, a joint discrete-continuous diffusion model.
DLT has a flexible conditioning mechanism that allows for conditioning on any given subset of all the layout component classes, locations, and sizes.
Our method outperforms state-of-the-art generative models on various layout generation datasets with respect to different metrics and conditioning settings.
arXiv Detail & Related papers (2023-03-07T09:30:43Z) - LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer [80.61492265221817]
Graphic layout designs play an essential role in visual communication.
Yet handcrafting layout designs is skill-demanding, time-consuming, and non-scalable to batch production.
Generative models have emerged to make design automation scalable, but it remains non-trivial to produce designs that comply with designers' desires.
arXiv Detail & Related papers (2022-12-19T21:57:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.