Related papers: PosterO: Structuring Layout Trees to Enable Language Models in Generalized Content-Aware Layout Generation

URL: http://arxiv.org/abs/2505.07843v2
Date: Tue, 27 May 2025 02:41:23 GMT
Title: PosterO: Structuring Layout Trees to Enable Language Models in Generalized Content-Aware Layout Generation
Authors: HsiaoYuan Hsu, Yuxin Peng
Abstract summary: PosterO is a layout-centric approach to create posters for omnifarious purposes. It structures layouts from datasets as trees in SVG language by universal shape, design intent vectorization, and hierarchical node representation. It can generate visually appealing layouts for given images, achieving new state-of-the-art performance across various benchmarks.
Score: 38.53781264480452
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In poster design, content-aware layout generation is crucial for automatically arranging visual-textual elements on the given image. With limited training data, existing work focused on image-centric enhancement. However, this neglects the diversity of layouts and fails to cope with shape-variant elements or diverse design intents in generalized settings. To this end, we proposed a layout-centric approach that leverages layout knowledge implicit in large language models (LLMs) to create posters for omnifarious purposes, hence the name PosterO. Specifically, it structures layouts from datasets as trees in SVG language by universal shape, design intent vectorization, and hierarchical node representation. Then, it applies LLMs during inference to predict new layout trees by in-context learning with intent-aligned example selection. After layout trees are generated, we can seamlessly realize them into poster designs by editing the chat with LLMs. Extensive experimental results have demonstrated that PosterO can generate visually appealing layouts for given images, achieving new state-of-the-art performance across various benchmarks. To further explore PosterO's abilities under the generalized settings, we built PStylish7, the first dataset with multi-purpose posters and various-shaped elements, further offering a challenging test for advanced research.
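
As an illustration of the idea in the abstract (layouts written as SVG trees, then used as in-context examples for an LLM), here is a minimal Python sketch. It is not the authors' implementation: the function names `layout_to_svg` and `build_prompt`, the element dictionary keys, and the choice of plain `<rect>` nodes in a flat tree are all assumptions made for illustration. The paper's universal shapes, design intent vectorization, hierarchical nodes, and intent-aligned example selection are not reproduced here.

```python
import xml.etree.ElementTree as ET


def layout_to_svg(width, height, elements):
    """Serialize a layout as a flat SVG tree (hypothetical, simplified format).

    Each element is a dict with keys: category, x, y, w, h (assumed schema).
    """
    svg = ET.Element("svg", {"width": str(width), "height": str(height)})
    for el in elements:
        ET.SubElement(
            svg,
            "rect",
            {
                "class": el["category"],
                "x": str(el["x"]),
                "y": str(el["y"]),
                "width": str(el["w"]),
                "height": str(el["h"]),
            },
        )
    return ET.tostring(svg, encoding="unicode")


def build_prompt(example_svgs, width, height):
    """Concatenate example layouts and end with an open <svg> tag.

    The trailing open tag invites a language model to complete a new layout
    for a canvas of the given size. How examples are selected is left out.
    """
    open_tag = f'<svg width="{width}" height="{height}">'
    return "\n\n".join(list(example_svgs) + [open_tag])


if __name__ == "__main__":
    example = layout_to_svg(
        513,
        750,
        [
            {"category": "logo", "x": 20, "y": 20, "w": 100, "h": 40},
            {"category": "text", "x": 40, "y": 80, "w": 430, "h": 120},
        ],
    )
    print(build_prompt([example], 513, 750))
```

The prompt string would then be sent to whichever LLM is in use. The completion would need to be parsed and validated as SVG before it is treated as a layout.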

Related papers

SciPostGen: Bridging the Gap between Scientific Papers and Poster Layouts [17.49687801784463]
Poster layouts determine how effectively research is communicated and understood, highlighting their growing importance. To bridge this gap, we introduce SciPostGen, a large-scale dataset for understanding and generating poster layouts from scientific papers. We explore a framework, Retrieval-Augmented Poster Layout Generation, which retrieves layouts consistent with a given paper and uses them as guidance for layout generation.
arXiv Detail & Related papers (2025-11-27T14:27:33Z)
POSTA: A Go-to Framework for Customized Artistic Poster Generation [87.16343612086959]
POSTA is a modular framework for customized artistic poster generation. Background Diffusion creates a themed background based on user input. Design MLLM then generates layout and typography elements that align with and complement the background style. ArtText Diffusion applies additional stylization to key text elements.
arXiv Detail & Related papers (2025-03-19T05:22:38Z)
GLDesigner: Leveraging Multi-Modal LLMs as Designer for Enhanced Aesthetic Text Glyph Layouts [53.568057283934714]
We propose a VLM-based framework that generates content-aware text logo layouts. We introduce two model techniques to reduce the computation for processing multiple glyph images simultaneously. To support instruction-tuning of our model, we construct two extensive text logo datasets, which are 5x larger than the existing public dataset.
arXiv Detail & Related papers (2024-11-18T10:04:10Z)
PosterLLaVa: Constructing a Unified Multi-modal Layout Generator with LLM [58.67882997399021]
Our research introduces a unified framework for automated graphic layout generation. Our data-driven method employs structured text (JSON format) and visual instruction tuning to generate layouts. We develop an automated text-to-poster system that generates editable posters based on users' design intentions.
arXiv Detail & Related papers (2024-06-05T03:05:52Z)
PosterLlama: Bridging Design Ability of Langauge Model to Contents-Aware Layout Generation [6.855409699832414]
PosterLlama is a network designed for generating visually and textually coherent layouts. Our evaluations demonstrate that PosterLlama outperforms existing methods in producing authentic and content-aware layouts. It supports an unparalleled range of conditions, including but not limited to unconditional layout generation, element conditional layout generation, layout completion, among others, serving as a highly versatile user manipulation tool.
arXiv Detail & Related papers (2024-04-01T08:46:35Z)
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models [98.81962282674151]
Large Language Models (LLMs) can serve as visual planners by generating layouts from text conditions. We propose LayoutGPT, a method to compose in-context visual demonstrations in style sheet language.
arXiv Detail & Related papers (2023-05-24T17:56:16Z)
PosterLayout: A New Benchmark and Approach for Content-aware Visual-Textual Presentation Layout [62.12447593298437]
Content-aware visual-textual presentation layout aims at arranging spatial space on the given canvas for pre-defined elements. We propose design sequence formation (DSF) that reorganizes elements in layouts to imitate the design processes of human designers. A novel CNN-LSTM-based conditional generative adversarial network (GAN) is presented to generate proper layouts.
arXiv Detail & Related papers (2023-03-28T12:48:36Z)
Geometry Aligned Variational Transformer for Image-conditioned Layout Generation [38.747175229902396]
We propose an Image-Conditioned Variational Transformer (ICVT) that autoregressively generates various layouts in an image. First, a self-attention mechanism is adopted to model the contextual relationship within layout elements, while a cross-attention mechanism is used to fuse the visual information of conditional images. We construct a large-scale advertisement poster layout designing dataset with delicate layout and saliency map annotations.
arXiv Detail & Related papers (2022-09-02T07:19:12Z)
Composition-aware Graphic Layout GAN for Visual-textual Presentation Designs [24.29890251913182]
We study the graphic layout generation problem of producing high-quality visual-textual presentation designs for given images. We propose a deep generative model, dubbed as composition-aware graphic layout GAN (CGL-GAN), to synthesize layouts based on the global and spatial visual contents of input images.
arXiv Detail & Related papers (2022-04-30T16:42:13Z)

This list is automatically generated from the titles and abstracts of the papers on this site.