Composition-aware Graphic Layout GAN for Visual-textual Presentation Designs
- URL: http://arxiv.org/abs/2205.00303v1
- Date: Sat, 30 Apr 2022 16:42:13 GMT
- Title: Composition-aware Graphic Layout GAN for Visual-textual Presentation Designs
- Authors: Min Zhou, Chenchen Xu, Ye Ma, Tiezheng Ge, Yuning Jiang and Weiwei Xu
- Abstract summary: We study the graphic layout generation problem of producing high-quality visual-textual presentation designs for given images.
We propose a deep generative model, dubbed composition-aware graphic layout GAN (CGL-GAN), to synthesize layouts based on the global and spatial visual contents of input images.
- Score: 24.29890251913182
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we study the graphic layout generation problem of producing
high-quality visual-textual presentation designs for given images. We note that
image compositions, which contain not only global semantics but also spatial
information, largely affect layout results. Hence, we propose a deep
generative model, dubbed composition-aware graphic layout GAN (CGL-GAN), to
synthesize layouts based on the global and spatial visual contents of input
images. To obtain training inputs from images that already contain manually
designed graphic layouts, previous work suggests masking design elements
(e.g., texts and embellishments), which inevitably leaves hints
of the ground truth. We study the misalignment between the training inputs
(with hint masks) and test inputs (without masks), and design a novel domain
alignment module (DAM) to narrow this gap. For training, we build a large-scale
layout dataset consisting of 60,548 advertising posters with annotated
layout information. To evaluate the generated layouts, we propose three novel
metrics based on aesthetic intuitions. Through both quantitative and
qualitative evaluations, we demonstrate that the proposed model can synthesize
high-quality graphic layouts according to image compositions.
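To make the masking issue concrete, the following is a minimal, hypothetical sketch (the function name, the extra mask channel, and the zero-fill masking are illustrative assumptions, not the paper's exact pipeline) of how training inputs can be derived from finished posters, and why the mask inevitably hints at the ground-truth layout:

```python
import numpy as np

def make_training_input(poster_rgb, layout_boxes):
    """Build a masked training input from a poster that already contains
    rendered design elements (texts, embellishments).

    poster_rgb   -- H x W x 3 uint8 array of the finished poster.
    layout_boxes -- ground-truth element boxes as (x0, y0, x1, y1) tuples.
    """
    h, w, _ = poster_rgb.shape
    mask = np.zeros((h, w, 1), dtype=poster_rgb.dtype)
    masked = poster_rgb.copy()
    for x0, y0, x1, y1 in layout_boxes:
        masked[y0:y1, x0:x1] = 0  # blank out the rendered element...
        mask[y0:y1, x0:x1] = 1    # ...but the mask now traces its box
    # Training input carries the hint mask; a real test image has no
    # design elements on it and hence an all-zero mask channel.
    return np.concatenate([masked, mask], axis=-1)  # H x W x 4
```

A generator trained on such inputs can learn to read layouts off the mask channel rather than the image composition; the DAM is introduced precisely to align the hinted training distribution with the mask-free test distribution.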
Related papers
- GLDesigner: Leveraging Multi-Modal LLMs as Designer for Enhanced Aesthetic Text Glyph Layouts [53.568057283934714]
We propose a VLM-based framework that generates content-aware text logo layouts.
We introduce two modeling techniques that reduce the computation needed to process multiple glyph images simultaneously.
To support instruction tuning of our model, we construct two extensive text logo datasets, which are 5x larger than the existing public dataset.
arXiv Detail & Related papers (2024-11-18T10:04:10Z)
- Self-supervised Photographic Image Layout Representation Learning [5.009120058742792]
We develop an autoencoder-based network architecture that compresses heterogeneous layout graphs into precise, dimensionally reduced layout representations.
We introduce the LODB dataset, which features a broader range of layout categories and richer semantics.
Our extensive experimentation on this dataset demonstrates the superior performance of our approach in the realm of photographic image layout representation learning.
arXiv Detail & Related papers (2024-03-06T14:28:53Z)
- Dense Text-to-Image Generation with Attention Modulation [49.287458275920514]
Existing text-to-image diffusion models struggle to synthesize realistic images given dense captions.
We propose DenseDiffusion, a training-free method that adapts a pre-trained text-to-image model to handle such dense captions.
We achieve visual results of similar quality to models trained specifically with layout conditions.
arXiv Detail & Related papers (2023-08-24T17:59:01Z)
- Enhancing Visually-Rich Document Understanding via Layout Structure Modeling [91.07963806829237]
We propose GraphLM, a novel document understanding model that injects layout knowledge into the model.
We evaluate our model on various benchmarks, including FUNSD, XFUND and CORD, and achieve state-of-the-art results.
arXiv Detail & Related papers (2023-08-15T13:53:52Z)
- LayoutGPT: Compositional Visual Planning and Generation with Large Language Models [98.81962282674151]
Large Language Models (LLMs) can serve as visual planners by generating layouts from text conditions.
We propose LayoutGPT, a method that composes in-context visual demonstrations in a style-sheet language.
arXiv Detail & Related papers (2023-05-24T17:56:16Z)
- PosterLayout: A New Benchmark and Approach for Content-aware Visual-Textual Presentation Layout [62.12447593298437]
Content-aware visual-textual presentation layout aims at arranging pre-defined elements in the spatial space of a given canvas.
We propose design sequence formation (DSF) that reorganizes elements in layouts to imitate the design processes of human designers.
A novel CNN-LSTM-based conditional generative adversarial network (GAN) is presented to generate proper layouts.
arXiv Detail & Related papers (2023-03-28T12:48:36Z)
- Unsupervised Domain Adaption with Pixel-level Discriminator for Image-aware Layout Generation [24.625282719753915]
This paper focuses on using a GAN-based model conditioned on image contents to generate advertising poster graphic layouts.
It combines unsupervised domain adaptation techniques to design a GAN with a novel pixel-level discriminator (PD), called PDA-GAN, which generates graphic layouts according to image contents.
Both quantitative and qualitative evaluations demonstrate that PDA-GAN achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-03-25T06:50:22Z)
- LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer [80.61492265221817]
Graphic layout designs play an essential role in visual communication.
Yet handcrafting layout designs is skill-demanding, time-consuming, and non-scalable to batch production.
Generative models have emerged to make design automation scalable, but it remains non-trivial to produce designs that comply with designers' desires.
arXiv Detail & Related papers (2022-12-19T21:57:35Z)
- Geometry Aligned Variational Transformer for Image-conditioned Layout Generation [38.747175229902396]
We propose an Image-Conditioned Variational Transformer (ICVT) that autoregressively generates various layouts in an image.
A self-attention mechanism models the contextual relationships among layout elements, while a cross-attention mechanism fuses in the visual information of the conditional image (a minimal sketch of this two-stage attention appears after this list).
We construct a large-scale advertisement poster layout design dataset with fine-grained layout and saliency map annotations.
arXiv Detail & Related papers (2022-09-02T07:19:12Z)
- Interactive Image Synthesis with Panoptic Layout Generation [14.1026819862002]
We propose Panoptic Layout Generative Adversarial Networks (PLGAN) to address this challenge.
PLGAN employs panoptic theory which distinguishes object categories between "stuff" with amorphous boundaries and "things" with well-defined shapes.
We experimentally compare our PLGAN with state-of-the-art layout-based models on the COCO-Stuff, Visual Genome, and Landscape datasets.
arXiv Detail & Related papers (2022-03-04T02:45:27Z)
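The two-stage attention described in the ICVT entry above is a standard conditional-transformer pattern. Below is a minimal PyTorch sketch (the class name, dimensions, and normalization placement are illustrative assumptions, not the authors' implementation): self-attention models the contextual relationships among layout-element tokens, and cross-attention then fuses in features of the conditional image.

```python
import torch
import torch.nn as nn

class ImageConditionedLayoutBlock(nn.Module):
    """One hypothetical ICVT-style decoder block: self-attention over
    layout-element tokens, then cross-attention into image features.
    (An autoregressive variant would also apply a causal mask to the
    self-attention; omitted here for brevity.)"""

    def __init__(self, d_model=256, n_heads=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)

    def forward(self, layout_tokens, image_feats):
        # Contextual relationships among layout elements (self-attention).
        x = layout_tokens
        x = self.norm1(x + self.self_attn(x, x, x, need_weights=False)[0])
        # Fuse visual information from the conditional image (cross-attention).
        x = self.norm2(x + self.cross_attn(x, image_feats, image_feats,
                                           need_weights=False)[0])
        return self.norm3(x + self.ffn(x))

# Usage: 10 layout-element tokens attend over 196 image-patch features.
block = ImageConditionedLayoutBlock()
tokens = torch.randn(2, 10, 256)   # (batch, elements, d_model)
feats = torch.randn(2, 196, 256)   # (batch, patches, d_model)
print(block(tokens, feats).shape)  # torch.Size([2, 10, 256])
```

The same pattern generalizes to most image-conditioned layout generators in this list: layout elements are tokens, the image supplies keys and values, and the decoder stack is repeated several times before projecting tokens back to box coordinates.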