Relation-Aware Diffusion Model for Controllable Poster Layout Generation
- URL: http://arxiv.org/abs/2306.09086v2
- Date: Thu, 11 Jan 2024 08:46:37 GMT
- Title: Relation-Aware Diffusion Model for Controllable Poster Layout Generation
- Authors: Fengheng Li, An Liu, Wei Feng, Honghe Zhu, Yaoyu Li, Zheng Zhang,
Jingjing Lv, Xin Zhu, Junjie Shen, Zhangang Lin, Jingping Shao
- Abstract summary: Poster layout is a crucial aspect of poster design.
In this study, we introduce a relation-aware diffusion model for poster layout generation.
The proposed method can generate diverse layouts based on user constraints.
- Score: 19.65249380159006
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Poster layout is a crucial aspect of poster design. Prior methods primarily
focus on the correlation between visual content and graphic elements. However,
a pleasant layout should also consider the relationship between visual and
textual contents and the relationship between elements. In this study, we
introduce a relation-aware diffusion model for poster layout generation that
incorporates these two relationships in the generation process. Firstly, we
devise a visual-textual relation-aware module that aligns the visual and
textual representations across modalities, thereby enhancing the layout's
efficacy in conveying textual information. Subsequently, we propose a geometry
relation-aware module that learns the geometry relationship between elements by
comprehensively considering contextual information. Additionally, the proposed
method can generate diverse layouts based on user constraints. To advance
research in this field, we have constructed a poster layout dataset named
CGL-Dataset V2. Our proposed method outperforms state-of-the-art methods on
CGL-Dataset V2. The data and code will be available at
https://github.com/liuan0803/RADM.
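As a rough illustration of the kind of data a diffusion-based layout generator operates on, the sketch below applies the standard DDPM forward-noising step to a toy layout. The box encoding (center x, center y, width, height, normalized to [0, 1]), the linear noise schedule, and the names `alpha_bar` and `noise_layout` are assumptions made for illustration; they are not taken from RADM, and the visual-textual and geometry relation-aware modules described in the abstract are not modelled here.

```python
import math
import random


def alpha_bar(t, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_s) over the first t steps of a linear schedule."""
    prod = 1.0
    for s in range(t):
        beta = beta_start + (beta_end - beta_start) * s / (num_steps - 1)
        prod *= 1.0 - beta
    return prod


def noise_layout(boxes, t, rng, num_steps=1000):
    """Sample x_t ~ q(x_t | x_0) independently for every box coordinate."""
    a = alpha_bar(t, num_steps)
    signal, sigma = math.sqrt(a), math.sqrt(1.0 - a)
    return [
        tuple(signal * v + sigma * rng.gauss(0.0, 1.0) for v in box)
        for box in boxes
    ]


# Hypothetical two-element layout: (cx, cy, w, h), all normalized to [0, 1].
layout = [(0.5, 0.2, 0.8, 0.1), (0.5, 0.6, 0.4, 0.3)]
rng = random.Random(0)

slightly_noisy = noise_layout(layout, t=1, rng=rng)    # close to the clean layout
mostly_noise = noise_layout(layout, t=1000, rng=rng)   # almost pure Gaussian noise
```

A generator of this family would train a network to reverse these steps, conditioned on the poster image, the text content, and any user constraints; that denoising network is where a method's own design would live and is deliberately left out of this sketch.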
Related papers
- Relation Rectification in Diffusion Model [64.84686527988809]
We introduce a novel task termed Relation Rectification, aiming to refine the model to accurately represent a given relationship it initially fails to generate.
We propose an innovative solution utilizing a Heterogeneous Graph Convolutional Network (HGCN).
The lightweight HGCN adjusts the text embeddings generated by the text encoder, ensuring the accurate reflection of the textual relation in the embedding space.
arXiv Detail & Related papers (2024-03-29T15:54:36Z)
- LayoutNUWA: Revealing the Hidden Layout Expertise of Large Language Models [84.16541551923221]
We propose a model that treats layout generation as a code generation task to enhance semantic information.
We develop a Code Instruct Tuning (CIT) approach comprising three interconnected modules.
We attain significant state-of-the-art performance on multiple datasets.
arXiv Detail & Related papers (2023-09-18T06:35:10Z)
- A Parse-Then-Place Approach for Generating Graphic Layouts from Textual Descriptions [50.469491454128246]
We use text as the guidance to create graphic layouts, i.e., Text-to-Layout, aiming to lower the design barriers.
Text-to-Layout is a challenging task, because it needs to consider the implicit, combined, and incomplete constraints from text.
We present a two-stage approach, named parse-then-place, to address this problem.
arXiv Detail & Related papers (2023-08-24T10:37:00Z)
- Enhancing Visually-Rich Document Understanding via Layout Structure Modeling [91.07963806829237]
We propose GraphLM, a novel document understanding model that injects layout knowledge into the model.
We evaluate our model on various benchmarks, including FUNSD, XFUND and CORD, and achieve state-of-the-art results.
arXiv Detail & Related papers (2023-08-15T13:53:52Z)
- PosterLayout: A New Benchmark and Approach for Content-aware Visual-Textual Presentation Layout [62.12447593298437]
Content-aware visual-textual presentation layout aims at arranging spatial space on the given canvas for pre-defined elements.
We propose design sequence formation (DSF) that reorganizes elements in layouts to imitate the design processes of human designers.
A novel CNN-LSTM-based conditional generative adversarial network (GAN) is presented to generate proper layouts.
arXiv Detail & Related papers (2023-03-28T12:48:36Z)
- Geometry Aligned Variational Transformer for Image-conditioned Layout Generation [38.747175229902396]
We propose an Image-Conditioned Variational Transformer (ICVT) that autoregressively generates various layouts in an image.
First, a self-attention mechanism is adopted to model the contextual relationship within layout elements, while a cross-attention mechanism is used to fuse the visual information of conditional images.
We construct a large-scale advertisement poster layout designing dataset with delicate layout and saliency map annotations.
arXiv Detail & Related papers (2022-09-02T07:19:12Z)
- VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations [40.721146438291335]
We propose a unified framework VSR for document layout analysis, combining vision, semantics and relations.
On three popular benchmarks, VSR outperforms previous models by large margins.
arXiv Detail & Related papers (2021-05-13T12:20:30Z)
- LAMPRET: Layout-Aware Multimodal PreTraining for Document Understanding [17.179384053140236]
Document layout comprises both structural and visual (e.g., font sizes) information that is vital but often ignored by machine learning models.
We propose a novel layout-aware multimodal hierarchical framework, LAMPreT, to model the blocks and the whole document.
We evaluate the proposed model on two layout-aware tasks -- text block filling and image suggestion.
arXiv Detail & Related papers (2021-04-16T23:27:39Z)
- Relational Message Passing for Knowledge Graph Completion [78.47976646383222]
We propose a relational message passing method for knowledge graph completion.
It passes relational messages among edges iteratively to aggregate neighborhood information.
Results show our method outperforms state-of-the-art knowledge completion methods by a large margin.
arXiv Detail & Related papers (2020-02-17T03:33:41Z)