LayoutDM: Transformer-based Diffusion Model for Layout Generation
- URL: http://arxiv.org/abs/2305.02567v1
- Date: Thu, 4 May 2023 05:51:35 GMT
- Title: LayoutDM: Transformer-based Diffusion Model for Layout Generation
- Authors: Shang Chai and Liansheng Zhuang and Fengying Yan
- Abstract summary: Transformer-based Layout Diffusion Model (LayoutDM) is proposed to generate high-quality layouts.
A transformer-based conditional Layout Denoiser learns the reverse diffusion process to generate samples from noised layout data.
Our method outperforms state-of-the-art generative models in terms of quality and diversity.
- Score: 0.6445605125467572
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic layout generation that can synthesize high-quality layouts is an
important tool for graphic design in many applications. Though existing methods
based on generative models such as Generative Adversarial Networks (GANs) and
Variational Auto-Encoders (VAEs) have progressed, they still leave much room
for improving the quality and diversity of the results. Inspired by the recent
success of diffusion models in generating high-quality images, this paper
explores their potential for conditional layout generation and proposes
Transformer-based Layout Diffusion Model (LayoutDM) by instantiating the
conditional denoising diffusion probabilistic model (DDPM) with a purely
transformer-based architecture. Instead of using convolutional neural networks,
a transformer-based conditional Layout Denoiser is proposed to learn the
reverse diffusion process to generate samples from noised layout data.
Benefiting from both the transformer and the DDPM, our LayoutDM offers desirable
properties such as high-quality generation, strong sample diversity, faithful
distribution coverage, and stable training in comparison to GANs and VAEs.
Quantitative and qualitative experimental results show that our method
outperforms state-of-the-art generative models in terms of quality and
diversity.
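The architectural idea in the abstract is concrete enough to sketch: each layout element carries a category and a bounding box, and the denoiser is a transformer encoder that takes noised box coordinates, element-category embeddings, and a timestep embedding, then predicts the injected noise. The PyTorch sketch below is illustrative only; module names, dimensions, and the epsilon-prediction objective follow standard DDPM conventions, not the authors' released code.

```python
# Minimal sketch of a transformer-based conditional layout denoiser in
# the spirit of LayoutDM. Names and hyperparameters are illustrative.
import math
import torch
import torch.nn as nn

class LayoutDenoiser(nn.Module):
    def __init__(self, num_classes=25, d_model=256, nhead=8, num_layers=8):
        super().__init__()
        self.geom_in = nn.Linear(4, d_model)                  # noised (x, y, w, h)
        self.class_emb = nn.Embedding(num_classes, d_model)   # element-type conditioning
        self.time_mlp = nn.Sequential(nn.Linear(d_model, d_model), nn.SiLU(),
                                      nn.Linear(d_model, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.eps_out = nn.Linear(d_model, 4)                  # predicted noise per element

    @staticmethod
    def timestep_embedding(t, dim):
        # standard sinusoidal embedding of the diffusion step
        half = dim // 2
        freqs = torch.exp(-math.log(10000.0) * torch.arange(half, device=t.device) / half)
        ang = t[:, None].float() * freqs[None]
        return torch.cat([ang.sin(), ang.cos()], dim=-1)

    def forward(self, noisy_geom, classes, t):
        # noisy_geom: (B, N, 4), classes: (B, N), t: (B,)
        temb = self.time_mlp(self.timestep_embedding(t, self.geom_in.out_features))
        h = self.geom_in(noisy_geom) + self.class_emb(classes) + temb[:, None, :]
        return self.eps_out(self.encoder(h))                  # epsilon prediction
```

Training would then follow the usual DDPM recipe: sample a timestep, noise the box coordinates with the closed-form forward process, and regress the denoiser's output against the true noise.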
Related papers
- Advancing Diffusion Models: Alias-Free Resampling and Enhanced Rotational Equivariance [0.0]
Diffusion models are still challenged by model-induced artifacts and limited stability in image fidelity.
We propose the integration of alias-free resampling layers into the UNet architecture of diffusion models.
Our experimental results on benchmark datasets, including CIFAR-10, MNIST, and MNIST-M, reveal consistent gains in image quality.
arXiv Detail & Related papers (2024-11-14T04:23:28Z)
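For intuition on alias-free resampling: a feature map is low-pass filtered before striding so that frequencies above the new Nyquist limit cannot fold back as artifacts. The sketch below uses a fixed 3x3 binomial blur as the low-pass filter; the paper's actual filter design inside the UNet may differ.

```python
# Illustrative alias-free downsampling: blur, then stride. The fixed
# binomial kernel is a common choice; the paper's filters may differ.
import torch
import torch.nn.functional as F

def blur_downsample(x, stride=2):
    # x: (B, C, H, W); depthwise 3x3 binomial low-pass kernel (sums to 1)
    k1d = torch.tensor([1., 2., 1.], device=x.device)
    k = (k1d[:, None] * k1d[None, :]) / 16.0
    k = k.view(1, 1, 3, 3).repeat(x.shape[1], 1, 1, 1)
    x = F.pad(x, (1, 1, 1, 1), mode='reflect')   # pad so borders are filtered too
    return F.conv2d(x, k, stride=stride, groups=x.shape[1])
```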
- Variational Search Distributions [16.609027794680213]
We develop variational search distributions (VSD) for finding and generating discrete designs of a rare desired class in a batch sequential manner.
We empirically demonstrate that VSD can outperform existing baseline methods on a set of real sequence-design problems in various biological systems.
arXiv Detail & Related papers (2024-09-10T01:33:31Z)
- TerDiT: Ternary Diffusion Models with Transformers [83.94829676057692]
TerDiT is a quantization-aware training scheme for ternary diffusion models with transformers.
We focus on the ternarization of DiT networks and scale model sizes from 600M to 4.2B.
arXiv Detail & Related papers (2024-05-23T17:57:24Z)
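TerDiT's exact recipe is not reproduced in the summary, but quantization-aware training with ternary weights commonly quantizes on the forward pass and routes gradients straight through. The sketch below follows standard ternary-network heuristics for the threshold and scale; it is an assumption, not the paper's scheme.

```python
# Generic ternary weight quantization with a straight-through estimator,
# illustrating the kind of quantization-aware training TerDiT applies to
# DiT weights. Threshold and scale follow common ternary-net heuristics.
import torch
import torch.nn as nn

class TernaryLinear(nn.Linear):
    def forward(self, x):
        w = self.weight
        delta = 0.75 * w.abs().mean()             # per-tensor threshold heuristic
        mask = (w.abs() > delta).float()
        alpha = (w.abs() * mask).sum() / mask.sum().clamp(min=1)  # scale factor
        w_q = alpha * torch.sign(w) * mask        # weights in {-alpha, 0, +alpha}
        w_ste = w + (w_q - w).detach()            # straight-through gradient to w
        return nn.functional.linear(x, w_ste, self.bias)
```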
- Unified Generation, Reconstruction, and Representation: Generalized Diffusion with Adaptive Latent Encoding-Decoding [90.77521413857448]
Deep generative models are anchored in three core capabilities -- generating new instances, reconstructing inputs, and learning compact representations.
We introduce Generalized Encoding-Decoding Diffusion Probabilistic Models (EDDPMs).
EDDPMs generalize the Gaussian noising-denoising in standard diffusion by introducing parameterized encoding-decoding.
Experiments on text, proteins, and images demonstrate the flexibility to handle diverse data and tasks.
arXiv Detail & Related papers (2024-02-29T10:08:57Z)
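One plausible reading of "parameterized encoding-decoding" is to replace the identity mapping of a standard DDPM with a trainable encoder/decoder pair and run Gaussian diffusion on the learned code. The sketch below illustrates only that reading; the paper's exact formulation is not reproduced.

```python
# Purely illustrative: Gaussian diffusion in a jointly learned code
# space, with a reconstruction term keeping the decoding faithful.
import torch
import torch.nn as nn

class EncDecDiffusion(nn.Module):
    def __init__(self, encoder, decoder, denoiser, T=1000):
        super().__init__()
        self.encoder, self.decoder, self.denoiser = encoder, decoder, denoiser
        betas = torch.linspace(1e-4, 0.02, T)
        self.register_buffer('abar', torch.cumprod(1 - betas, dim=0))

    def loss(self, x):
        z = self.encoder(x)                                  # learned encoding
        t = torch.randint(0, self.abar.numel(), (z.size(0),), device=z.device)
        eps = torch.randn_like(z)
        a = self.abar[t].view(-1, *([1] * (z.dim() - 1)))
        z_t = a.sqrt() * z + (1 - a).sqrt() * eps            # noising in code space
        denoise_loss = (self.denoiser(z_t, t) - eps).pow(2).mean()
        recon_loss = (self.decoder(z) - x).pow(2).mean()     # keep decoding faithful
        return denoise_loss + recon_loss
```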
- Improving Out-of-Distribution Robustness of Classifiers via Generative Interpolation [56.620403243640396]
Deep neural networks achieve superior performance for learning from independent and identically distributed (i.i.d.) data.
However, their performance deteriorates significantly when handling out-of-distribution (OoD) data.
We develop a simple yet effective method called Generative Interpolation to fuse generative models trained from multiple domains for synthesizing diverse OoD samples.
arXiv Detail & Related papers (2023-07-23T03:53:53Z)
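As a rough illustration of fusing generators trained on multiple domains, one can draw a shared latent code and convexly mix the outputs of two domain-specific generators. This is an assumed mechanism for illustration; the paper's actual fusion may differ.

```python
# Illustrative fusion of two domain-specific generators: a shared latent
# draw and a random convex combination of their outputs, producing
# samples "between" the training domains.
import torch

@torch.no_grad()
def interpolated_ood_samples(gen_a, gen_b, n, z_dim, device='cpu'):
    z = torch.randn(n, z_dim, device=device)      # shared latent draw
    lam = torch.rand(n, 1, 1, 1, device=device)   # per-sample mixing weight
    return lam * gen_a(z) + (1 - lam) * gen_b(z)  # pixel-space convex mix
```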
- Real-World Image Variation by Aligning Diffusion Inversion Chain [53.772004619296794]
A domain gap exists between generated images and real-world images, which poses a challenge in generating high-quality variations of real-world images.
We propose a novel inference pipeline called Real-world Image Variation by ALignment (RIVAL)
Our pipeline enhances the generation quality of image variations by aligning the image generation process to the source image's inversion chain.
arXiv Detail & Related papers (2023-05-30T04:09:47Z)
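The inversion chain that RIVAL aligns to can be obtained with standard deterministic DDIM inversion, sketched below; the alignment of the generation process itself is the paper's contribution and is not reproduced here.

```python
# Standard DDIM inversion: run the deterministic sampler backwards to
# recover the latent trajectory of a real image.
import torch

@torch.no_grad()
def ddim_invert(x0, eps_model, abar):
    # x0: clean image latents; abar: alpha-bar schedule, decreasing ~1 -> ~0
    chain, x = [x0], x0
    for t in range(len(abar) - 1):
        a_t, a_next = abar[t], abar[t + 1]
        eps = eps_model(x, torch.full((x.size(0),), t, device=x.device))
        x0_pred = (x - (1 - a_t).sqrt() * eps) / a_t.sqrt()       # implied clean image
        x = a_next.sqrt() * x0_pred + (1 - a_next).sqrt() * eps   # step toward noise
        chain.append(x)
    return chain   # x_0 ... x_T trajectory used for alignment
```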
- Unifying Layout Generation with a Decoupled Diffusion Model [26.659337441975143]
Layout generation is a crucial task for reducing the burden of heavy-duty graphic design work for formatted scenes, e.g., publications, documents, and user interfaces (UIs).
We propose a layout Diffusion Generative Model (LDGM) to achieve such unification with a single decoupled diffusion model.
Our proposed LDGM can generate layouts either from scratch or conditional on arbitrary available attributes.
arXiv Detail & Related papers (2023-03-09T05:53:32Z)
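Generating either from scratch or conditioned on arbitrary available attributes is commonly realized with masking: observed attributes are re-imposed at every reverse step while missing ones are denoised. The sketch below shows only this generic conditioning pattern, not LDGM's decoupled diffusion itself.

```python
# Illustrative attribute masking for "from scratch or any-conditional"
# generation: known fields stay clamped to their observed values at each
# denoising step; missing fields are sampled.
import torch

@torch.no_grad()
def masked_sample(denoise_step, x_T, observed, mask, T):
    # mask: 1 where an attribute is observed, 0 where it must be generated
    x = torch.where(mask.bool(), observed, x_T)
    for t in reversed(range(T)):
        x = denoise_step(x, t)                       # one reverse-diffusion step
        x = torch.where(mask.bool(), observed, x)    # re-impose known attributes
    return x
```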
- LayoutDiffuse: Adapting Foundational Diffusion Models for Layout-to-Image Generation [24.694298869398033]
Our method trains efficiently and generates images with both high perceptual quality and layout alignment.
Our method significantly outperforms 10 other generative models based on GANs, VQ-VAE, and diffusion models.
arXiv Detail & Related papers (2023-02-16T14:20:25Z)
- f-DM: A Multi-stage Diffusion Model via Progressive Signal Transformation [56.04628143914542]
Diffusion models (DMs) have recently emerged as SoTA tools for generative modeling in various domains.
We propose f-DM, a generalized family of DMs which allows progressive signal transformation.
We apply f-DM in image generation tasks with a range of functions, including down-sampling, blurring, and learned transformations.
arXiv Detail & Related papers (2022-10-10T18:49:25Z)
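Progressive signal transformation can be pictured as a forward process that, besides adding noise, applies a transform such as down-sampling at stage boundaries. The sketch below is deliberately crude and only conveys the staging idea; f-DM's stage schedules and noise interpolation are more careful.

```python
# Crude sketch of a multi-stage forward process in the spirit of f-DM:
# the signal is 2x average-pooled once per completed stage while noise
# accumulates. Stage length and noise growth are illustrative only.
import torch
import torch.nn.functional as F

def fdm_forward(x0, t, stage_len=250, noise_scale=0.02):
    x = x0
    for _ in range(t // stage_len):
        x = F.avg_pool2d(x, 2)          # progressive down-sampling transform
    eps = torch.randn_like(x)
    return x + noise_scale * t * eps    # crude linear noise growth

x0 = torch.randn(1, 3, 64, 64)
xt = fdm_forward(x0, t=600)             # after two stages: (1, 3, 16, 16) + noise
```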
- DiffuseVAE: Efficient, Controllable and High-Fidelity Generation from Low-Dimensional Latents [26.17940552906923]
We present DiffuseVAE, a novel generative framework that integrates a VAE within a diffusion model framework.
We show that the proposed model can generate high-resolution samples and exhibits quality comparable to state-of-the-art models on standard benchmarks.
arXiv Detail & Related papers (2022-01-02T06:44:23Z)
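As commonly described, DiffuseVAE first decodes a low-dimensional VAE latent into a coarse sample and then refines it with a diffusion model conditioned on that reconstruction. The interfaces in the sketch are assumed for illustration, not the authors' API.

```python
# Two-stage generation sketch: VAE decoder produces a coarse sample from
# a low-dimensional latent; a diffusion model refines it, conditioned by
# channel concatenation. All interfaces here are assumptions.
import torch

@torch.no_grad()
def diffusevae_sample(vae_decoder, refine_step, z_dim, T, shape, device='cpu'):
    z = torch.randn(1, z_dim, device=device)
    coarse = vae_decoder(z)                        # stage 1: coarse VAE sample
    x = torch.randn(1, *shape, device=device)      # stage 2: diffusion refinement
    for t in reversed(range(T)):
        x = refine_step(torch.cat([x, coarse], dim=1), t)  # condition on coarse image
    return x
```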
- Deep Variational Models for Collaborative Filtering-based Recommender Systems [63.995130144110156]
Deep learning provides accurate collaborative filtering models to improve recommender system results.
Our proposed models apply the variational concept to inject stochasticity in the latent space of the deep architecture.
Results show the superiority of the proposed approach in scenarios where the variational enrichment exceeds the injected noise effect.
arXiv Detail & Related papers (2021-07-27T08:59:39Z)
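Injecting stochasticity into the latent space of a collaborative-filtering autoencoder amounts to a variational bottleneck with the reparameterization trick. Layer sizes and the surrounding architecture below are illustrative assumptions.

```python
# Minimal variational bottleneck for a collaborative-filtering
# autoencoder: the latent embedding is sampled via reparameterization,
# injecting stochasticity in the latent space.
import torch
import torch.nn as nn

class VariationalCF(nn.Module):
    def __init__(self, n_items, d_latent=64):
        super().__init__()
        self.enc = nn.Linear(n_items, 256)
        self.mu, self.logvar = nn.Linear(256, d_latent), nn.Linear(256, d_latent)
        self.dec = nn.Linear(d_latent, n_items)      # predicted interaction scores

    def forward(self, ratings):
        h = torch.relu(self.enc(ratings))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization
        return self.dec(z), mu, logvar                # decoder output + KL terms
```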
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.