Related papers: SEGA: A Stepwise Evolution Paradigm for Content-Aware Layout Generation with Design Prior

URL: http://arxiv.org/abs/2510.15749v1
Date: Fri, 17 Oct 2025 15:36:26 GMT
Title: SEGA: A Stepwise Evolution Paradigm for Content-Aware Layout Generation with Design Prior
Authors: Haoran Wang, Bo Zhao, Jinghui Wang, Hanzhang Wang, Huan Yang, Wei Ji, Hao Liu, Xinyan Xiao
Abstract summary: We introduce SEGA, a novel Stepwise Evolution Paradigm for Content-Aware Layout Generation. Inspired by the systematic mode of human thinking, SEGA employs a hierarchical reasoning framework with a coarse-to-fine strategy. We present GenPoster-100K, a new large-scale poster dataset with rich meta-information annotation.
Score: 30.770509440256973
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In this paper, we study the content-aware layout generation problem, which aims to automatically generate layouts that are harmonious with a given background image. Existing methods usually handle this task with a single-step reasoning framework. Because they lack a feedback-based self-correction mechanism, their failure rates increase significantly on complex element layout planning. To address this challenge, we introduce SEGA, a novel Stepwise Evolution Paradigm for Content-Aware Layout Generation. Inspired by the systematic mode of human thinking, SEGA employs a hierarchical reasoning framework with a coarse-to-fine strategy: first, a coarse-level module roughly estimates the layout planning results; then, a refining module performs fine-level reasoning over the coarse planning results. Furthermore, we incorporate layout design principles as prior knowledge into the model to enhance its layout planning ability. We also present GenPoster-100K, a new large-scale poster dataset with rich meta-information annotation. Experiments demonstrate the effectiveness of our approach, which achieves state-of-the-art results on multiple benchmark datasets. Our project page is at: https://brucew91.github.io/SEGA.github.io/

Related papers

PosterO: Structuring Layout Trees to Enable Language Models in Generalized Content-Aware Layout Generation [38.53781264480452]
PosterO is a layout-centric approach to create posters for omnifarious purposes. It structures layouts from datasets as trees in SVG language by universal shape, design intent vectorization, and hierarchical node representation. It can generate visually appealing layouts for given images, achieving new state-of-the-art performance across various benchmarks.
arXiv Detail & Related papers (2025-05-06T18:42:24Z)
PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models [10.341382572198254]
We propose a unified layout planning and image generation model, PlanGen, which can pre-plan spatial layout conditions before generating images. PlanGen integrates layout conditions into the model as context without requiring specialized encoding of local captions and bounding box coordinates. In addition, PlanGen can be seamlessly expanded to layout-guided image manipulation thanks to the well-designed modeling.
arXiv Detail & Related papers (2025-03-13T07:37:09Z)
PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM [58.67882997399021]
Our research introduces a unified framework for automated graphic layout generation. Our data-driven method employs structured text (JSON format) and visual instruction tuning to generate layouts. We develop an automated text-to-poster system that generates editable posters based on users' design intentions.
arXiv Detail & Related papers (2024-06-05T03:05:52Z)
PosterLlama: Bridging Design Ability of Langauge Model to Contents-Aware Layout Generation [6.855409699832414]
PosterLlama is a network designed for generating visually and textually coherent layouts. Our evaluations demonstrate that PosterLlama outperforms existing methods in producing authentic and content-aware layouts. It supports an unparalleled range of conditions, including but not limited to unconditional layout generation, element conditional layout generation, layout completion, among others, serving as a highly versatile user manipulation tool.
arXiv Detail & Related papers (2024-04-01T08:46:35Z)
Enhancing Visually-Rich Document Understanding via Layout Structure Modeling [91.07963806829237]
We propose GraphLayoutLM, a novel document understanding model that injects layout knowledge into the model. We evaluate our model on various benchmarks, including FUNSD, XFUND and CORD, and achieve state-of-the-art results.
arXiv Detail & Related papers (2023-08-15T13:53:52Z)
PosterLayout: A New Benchmark and Approach for Content-aware Visual-Textual Presentation Layout [62.12447593298437]
Content-aware visual-textual presentation layout aims at arranging spatial space on the given canvas for pre-defined elements. We propose design sequence formation (DSF) that reorganizes elements in layouts to imitate the design processes of human designers. A novel CNN-LSTM-based conditional generative adversarial network (GAN) is presented to generate proper layouts.
arXiv Detail & Related papers (2023-03-28T12:48:36Z)
LayoutDiffusion: Improving Graphic Layout Generation by Discrete Diffusion Probabilistic Models [50.73105631853759]
We present a novel generative model named LayoutDiffusion for automatic layout generation. It learns to reverse a mild forward process, in which layouts become increasingly chaotic with the growth of forward steps. It enables two conditional layout generation tasks in a plug-and-play manner without re-training and achieves better performance than existing methods.
arXiv Detail & Related papers (2023-03-21T04:41:02Z)
LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer [80.61492265221817]
Graphic layout designs play an essential role in visual communication. Yet handcrafting layout designs is skill-demanding, time-consuming, and non-scalable to batch production. Generative models emerge to make design automation scalable but it remains non-trivial to produce designs that comply with designers' desires.
arXiv Detail & Related papers (2022-12-19T21:57:35Z)
ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding [52.3895498789521]
We propose ERNIE-Layout, a novel document pre-training solution with layout knowledge enhancement. We first rearrange input sequences in the serialization stage, then present a correlative pre-training task, reading order prediction, to learn the proper reading order of documents. Experimental results show ERNIE-Layout achieves superior performance on various downstream tasks, setting new state-of-the-art on key information extraction and document question answering.
arXiv Detail & Related papers (2022-10-12T12:59:24Z)

This list is automatically generated from the titles and abstracts of the papers on this site.