LayoutDM: Transformer-based Diffusion Model for Layout Generation
- URL: http://arxiv.org/abs/2305.02567v1
- Date: Thu, 4 May 2023 05:51:35 GMT
- Title: LayoutDM: Transformer-based Diffusion Model for Layout Generation
- Authors: Shang Chai and Liansheng Zhuang and Fengying Yan
- Abstract summary: A Transformer-based Layout Diffusion Model (LayoutDM) is proposed to generate high-quality layouts by instantiating a conditional denoising diffusion probabilistic model (DDPM).
A transformer-based conditional Layout Denoiser is proposed to generate samples from noised layout data.
Our method outperforms state-of-the-art generative models in terms of quality and diversity.
- Score: 0.6445605125467572
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic layout generation that can synthesize high-quality layouts is an
important tool for graphic design in many applications. Though existing methods
based on generative models such as Generative Adversarial Networks (GANs) and
Variational Auto-Encoders (VAEs) have progressed, they still leave much room
for improving the quality and diversity of the results. Inspired by the recent
success of diffusion models in generating high-quality images, this paper
explores their potential for conditional layout generation and proposes
Transformer-based Layout Diffusion Model (LayoutDM) by instantiating the
conditional denoising diffusion probabilistic model (DDPM) with a purely
transformer-based architecture. Instead of using convolutional neural networks,
a transformer-based conditional Layout Denoiser is proposed to learn the
reverse diffusion process to generate samples from noised layout data.
Benefiting from both the transformer and the DDPM, our LayoutDM exhibits desirable
properties such as high-quality generation, strong sample diversity, faithful
distribution coverage, and stable training, in comparison to GANs and VAEs.
Quantitative and qualitative experimental results show that our method
outperforms state-of-the-art generative models in terms of quality and
diversity.
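
To make the setup concrete, here is a minimal sketch, assuming a PyTorch-style implementation, of a transformer-based conditional layout denoiser trained under the standard DDPM objective: element categories act as the condition, Gaussian noise is added to the bounding boxes, and the network regresses that noise. Module names, dimensions, and the schedule are illustrative assumptions, not the authors' released code.

```python
# Hedged sketch: a transformer-based conditional layout denoiser in the
# DDPM framework. A layout is a set of N elements, each a bounding box
# (x, y, w, h) plus a category; categories condition the denoiser.
import torch
import torch.nn as nn

class LayoutDenoiser(nn.Module):
    def __init__(self, num_classes=25, d_model=256, nhead=8, num_layers=4, timesteps=1000):
        super().__init__()
        self.box_proj = nn.Linear(4, d_model)               # embed (x, y, w, h)
        self.cls_emb = nn.Embedding(num_classes, d_model)   # element category (condition)
        self.t_emb = nn.Embedding(timesteps, d_model)       # diffusion timestep
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, 4)                   # predicted noise per box

    def forward(self, noisy_boxes, classes, t):
        # noisy_boxes: (B, N, 4), classes: (B, N), t: (B,)
        h = self.box_proj(noisy_boxes) + self.cls_emb(classes)
        h = h + self.t_emb(t)[:, None, :]                   # broadcast timestep over elements
        return self.head(self.encoder(h))

def ddpm_loss(model, boxes, classes, alphas_bar):
    # One training step: sample t, corrupt the boxes, regress the noise.
    B = boxes.size(0)
    t = torch.randint(0, len(alphas_bar), (B,))
    a = alphas_bar[t].view(B, 1, 1)
    eps = torch.randn_like(boxes)
    noisy = a.sqrt() * boxes + (1 - a).sqrt() * eps         # q(x_t | x_0)
    return nn.functional.mse_loss(model(noisy, classes, t), eps)
```

A standard schedule would supply `alphas_bar`, e.g. `torch.cumprod(1 - torch.linspace(1e-4, 0.02, 1000), 0)`; at inference the learned denoiser drives the usual DDPM ancestral-sampling loop over the boxes while the categories stay fixed.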
Related papers
- Advancing Diffusion Models: Alias-Free Resampling and Enhanced Rotational Equivariance [0.0]
Diffusion models are still challenged by model-induced artifacts and limited stability in image fidelity.
We propose the integration of alias-free resampling layers into the UNet architecture of diffusion models.
Our experimental results on benchmark datasets, including CIFAR-10, MNIST, and MNIST-M, reveal consistent gains in image quality.
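
The mechanism named here is straightforward to illustrate: applying a fixed low-pass filter before striding suppresses the aliasing that plain strided resampling introduces. Below is a generic blur-pooled downsampling layer as a hedged sketch; the paper's exact filters and their placement inside the UNet may differ.

```python
# Hedged sketch: anti-aliased 2x downsampling (low-pass filter, then stride).
import torch
import torch.nn as nn
import torch.nn.functional as F

class BlurDown(nn.Module):
    def __init__(self, channels):
        super().__init__()
        k = torch.tensor([1., 2., 1.])
        k = torch.outer(k, k)                          # separable binomial filter
        k = (k / k.sum()).expand(channels, 1, 3, 3)    # one depthwise kernel per channel
        self.register_buffer("kernel", k.clone())

    def forward(self, x):
        x = F.conv2d(x, self.kernel, padding=1, groups=x.size(1))  # low-pass first
        return x[:, :, ::2, ::2]                                    # then subsample
```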
arXiv Detail & Related papers (2024-11-14T04:23:28Z)
- Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis [62.06970466554273]
We present Meissonic, which elevates non-autoregressive masked image modeling (MIM) text-to-image generation to a level comparable with state-of-the-art diffusion models like SDXL.
We leverage high-quality training data, integrate micro-conditions informed by human preference scores, and employ feature compression layers to further enhance image fidelity and resolution.
Our model not only matches but often exceeds the performance of existing models like SDXL in generating high-quality, high-resolution images.
arXiv Detail & Related papers (2024-10-10T17:59:17Z)
- Unified Generation, Reconstruction, and Representation: Generalized Diffusion with Adaptive Latent Encoding-Decoding [90.77521413857448]
Deep generative models are anchored in three core capabilities -- generating new instances, reconstructing inputs, and learning compact representations.
We introduce Generalized Encoding-Decoding Diffusion Probabilistic Models (EDDPMs).
EDDPMs generalize the Gaussian noising-denoising in standard diffusion by introducing parameterized encoding-decoding.
Experiments on text, proteins, and images demonstrate the flexibility to handle diverse data and tasks.
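
Read literally, the generalization can be sketched by running the Gaussian noising-denoising in a learned space produced by a parameterized encoder, with a decoder tying that space back to the data. The structure, names, and loss weighting below are assumptions drawn from this summary, not the paper's formulation.

```python
# Hedged sketch: diffusion over a learned encoding, with a decoding loss.
import torch
import torch.nn as nn

class EncDecDiffusion(nn.Module):
    def __init__(self, enc, dec, denoiser, alphas_bar):
        super().__init__()
        self.enc, self.dec, self.denoiser = enc, dec, denoiser
        self.register_buffer("alphas_bar", alphas_bar)

    def loss(self, x):
        z0 = self.enc(x)                                  # parameterized encoding
        t = torch.randint(0, len(self.alphas_bar), (x.size(0),))
        a = self.alphas_bar[t].view(-1, *[1] * (z0.dim() - 1))
        eps = torch.randn_like(z0)
        zt = a.sqrt() * z0 + (1 - a).sqrt() * eps         # Gaussian corruption in z-space
        denoise = (self.denoiser(zt, t) - eps).pow(2).mean()
        recon = (self.dec(z0) - x).pow(2).mean()          # decoding ties z back to the data
        return denoise + recon
```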
arXiv Detail & Related papers (2024-02-29T10:08:57Z)
- Improving Out-of-Distribution Robustness of Classifiers via Generative Interpolation [56.620403243640396]
Deep neural networks achieve superior performance for learning from independent and identically distributed (i.i.d.) data.
However, their performance deteriorates significantly when handling out-of-distribution (OoD) data.
We develop a simple yet effective method called Generative Interpolation to fuse generative models trained from multiple domains for synthesizing diverse OoD samples.
arXiv Detail & Related papers (2023-07-23T03:53:53Z)
- Real-World Image Variation by Aligning Diffusion Inversion Chain [53.772004619296794]
A domain gap exists between generated images and real-world images, which poses a challenge in generating high-quality variations of real-world images.
We propose a novel inference pipeline called Real-world Image Variation by ALignment (RIVAL).
Our pipeline enhances the generation quality of image variations by aligning the image generation process to the source image's inversion chain.
arXiv Detail & Related papers (2023-05-30T04:09:47Z)
- Unifying Layout Generation with a Decoupled Diffusion Model [26.659337441975143]
Layout generation is a crucial task for reducing the burden of heavy-duty graphic design work for formatted scenes, e.g., publications, documents, and user interfaces (UIs).
We propose a layout Diffusion Generative Model (LDGM) to achieve such unification with a single decoupled diffusion model.
Our proposed LDGM can generate layouts either from scratch or conditioned on arbitrary available attributes.
arXiv Detail & Related papers (2023-03-09T05:53:32Z)
- LayoutDiffuse: Adapting Foundational Diffusion Models for Layout-to-Image Generation [24.694298869398033]
Our method trains efficiently and generates images with both high perceptual quality and layout alignment.
It significantly outperforms 10 other generative models based on GANs, VQ-VAE, and diffusion models.
arXiv Detail & Related papers (2023-02-16T14:20:25Z)
- f-DM: A Multi-stage Diffusion Model via Progressive Signal Transformation [56.04628143914542]
Diffusion models (DMs) have recently emerged as SoTA tools for generative modeling in various domains.
We propose f-DM, a generalized family of DMs which allows progressive signal transformation.
We apply f-DM in image generation tasks with a range of functions, including down-sampling, blurring, and learned transformations.
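
The forward process implied here can be sketched directly: interleave the usual Gaussian corruption with a signal transformation f applied at stage boundaries. Below f is 2x average-pooling, one of the listed examples; the schedule, interfaces, and stage layout are illustrative assumptions.

```python
# Hedged sketch: a multi-stage forward process with progressive down-sampling.
import torch
import torch.nn.functional as F

def f_dm_forward(x0, stage_ends, alphas_bar):
    """Yield noised samples while progressively transforming the clean signal."""
    x = x0
    for t in range(len(alphas_bar)):
        if t in stage_ends:              # stage boundary: apply the transformation f
            x = F.avg_pool2d(x, 2)       # here f is 2x down-sampling
        a = alphas_bar[t]
        yield a.sqrt() * x + (1 - a).sqrt() * torch.randn_like(x)
```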
arXiv Detail & Related papers (2022-10-10T18:49:25Z)
- DiffuseVAE: Efficient, Controllable and High-Fidelity Generation from Low-Dimensional Latents [26.17940552906923]
We present DiffuseVAE, a novel generative framework that integrates VAE within a diffusion model framework.
We show that the proposed model can generate high-resolution samples and exhibits quality comparable to state-of-the-art models on standard benchmarks.
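
A minimal reading of this integration is a two-stage sampler: decode a low-dimensional VAE latent into a coarse sample, then refine it with a diffusion model conditioned on that reconstruction. The `diffusion.sample(cond=...)` interface below is a hypothetical placeholder, not DiffuseVAE's actual API.

```python
# Hedged sketch of a VAE-then-diffusion pipeline.
def diffusevae_style_sample(vae_decoder, diffusion, z):
    coarse = vae_decoder(z)                # stage 1: decode the low-dimensional latent
    return diffusion.sample(cond=coarse)   # stage 2: conditional refinement (hypothetical API)
```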
arXiv Detail & Related papers (2022-01-02T06:44:23Z)
- Generating Multivariate Load States Using a Conditional Variational Autoencoder [11.557259513691239]
A conditional variational autoencoder (CVAE) neural network is proposed in this paper.
The model captures the latent variation of output samples under given latent vectors and co-optimizes the parameters governing this output variability.
Experiments demonstrate that the proposed generator outperforms other data generating mechanisms.
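
The summarized model is close to a textbook conditional VAE, sketched below: encoder and decoder both see the condition, and the ELBO combines a reconstruction term with a KL term. The dimensions (e.g., 24 load values conditioned on a 4-dimensional context) are illustrative assumptions.

```python
# Hedged sketch: a conditional VAE for multivariate samples.
import torch
import torch.nn as nn

class CVAE(nn.Module):
    def __init__(self, x_dim=24, c_dim=4, z_dim=8, h=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim + c_dim, h), nn.ReLU(), nn.Linear(h, 2 * z_dim))
        self.dec = nn.Sequential(nn.Linear(z_dim + c_dim, h), nn.ReLU(), nn.Linear(h, x_dim))

    def loss(self, x, c):
        mu, logvar = self.enc(torch.cat([x, c], -1)).chunk(2, -1)
        z = mu + (0.5 * logvar).exp() * torch.randn_like(mu)       # reparameterization
        recon = (self.dec(torch.cat([z, c], -1)) - x).pow(2).mean()
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
        return recon + kl
```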
arXiv Detail & Related papers (2021-10-21T19:07:04Z)
- Deep Variational Models for Collaborative Filtering-based Recommender Systems [63.995130144110156]
Deep learning provides accurate collaborative filtering models to improve recommender system results.
Our proposed models apply the variational concept to inject stochasticity into the latent space of the deep architecture.
Results show the superiority of the proposed approach in scenarios where the variational enrichment exceeds the injected noise effect.
arXiv Detail & Related papers (2021-07-27T08:59:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.