POSTA: A Go-to Framework for Customized Artistic Poster Generation
- URL: http://arxiv.org/abs/2503.14908v1
- Date: Wed, 19 Mar 2025 05:22:38 GMT
- Title: POSTA: A Go-to Framework for Customized Artistic Poster Generation
- Authors: Haoyu Chen, Xiaojie Xu, Wenbo Li, Jingjing Ren, Tian Ye, Songhua Liu, Ying-Cong Chen, Lei Zhu, Xinchao Wang,
- Abstract summary: POSTA is a modular framework for customized artistic poster generation.<n>Background Diffusion creates a themed background based on user input.<n>Design MLLM then generates layout and typography elements that align with and complement the background style.<n>ArtText Diffusion applies additional stylization to key text elements.
- Score: 87.16343612086959
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Poster design is a critical medium for visual communication. Prior work has explored automatic poster design using deep learning techniques, but these approaches lack text accuracy, user customization, and aesthetic appeal, limiting their applicability in artistic domains such as movies and exhibitions, where both clear content delivery and visual impact are essential. To address these limitations, we present POSTA: a modular framework powered by diffusion models and multimodal large language models (MLLMs) for customized artistic poster generation. The framework consists of three modules. Background Diffusion creates a themed background based on user input. Design MLLM then generates layout and typography elements that align with and complement the background style. Finally, to enhance the poster's aesthetic appeal, ArtText Diffusion applies additional stylization to key text elements. The final result is a visually cohesive and appealing poster, with a fully modular process that allows for complete customization. To train our models, we develop the PosterArt dataset, comprising high-quality artistic posters annotated with layout, typography, and pixel-level stylized text segmentation. Our comprehensive experimental analysis demonstrates POSTA's exceptional controllability and design diversity, outperforming existing models in both text accuracy and aesthetic quality.
Related papers
- PosterMaker: Towards High-Quality Product Poster Generation with Accurate Text Rendering [50.76106125697899]
Product posters, which integrate subject, scene, and text, are crucial promotional tools for attracting customers.
Main challenge lies in accurately rendering text, especially for complex writing systems like Chinese, which contains over 10,000 individual characters.
We develop TextRenderNet, which achieves a high text rendering accuracy of over 90%.
Based on TextRenderNet and SceneGenNet, we present PosterMaker, an end-to-end generation framework.
arXiv Detail & Related papers (2025-04-09T07:13:08Z) - Compose Your Aesthetics: Empowering Text-to-Image Models with the Principles of Art [61.28133495240179]
We propose a novel task of aesthetics alignment which seeks to align user-specified aesthetics with the T2I generation output.<n>Inspired by how artworks provide an invaluable perspective to approach aesthetics, we codify visual aesthetics using the compositional framework artists employ.<n>We demonstrate that T2I DMs can effectively offer 10 compositional controls through user-specified PoA conditions.
arXiv Detail & Related papers (2025-03-15T06:58:09Z) - GLDesigner: Leveraging Multi-Modal LLMs as Designer for Enhanced Aesthetic Text Glyph Layouts [53.568057283934714]
We propose a VLM-based framework that generates content-aware text logo layouts.
We introduce two model techniques to reduce the computation for processing multiple glyph images simultaneously.
To support instruction-tuning of out model, we construct two extensive text logo datasets, which are 5x more larger than the existing public dataset.
arXiv Detail & Related papers (2024-11-18T10:04:10Z) - GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models [7.152732507491591]
We propose an automatic poster generation framework with text rendering capabilities leveraging LLMs.<n>This framework aims to create precise poster text within a detailed contextual background.<n>We introduce a high-resolution font dataset and a poster dataset with resolutions exceeding 1024 pixels.
arXiv Detail & Related papers (2024-07-02T13:17:49Z) - VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model [28.345828491336874]
This work presents a novel image outpainting framework that is capable of customizing the results according to the requirement of users.<n>We take advantage of a Multimodal Large Language Model (MLLM) that automatically extracts and organizes the corresponding textual descriptions of the masked and unmasked part of a given image.<n>In addition, a special Cross-Attention module, namely Center-Total-Surrounding (CTS), is elaborately designed to enhance further the the interaction between specific space regions of the image and corresponding parts of the text prompts.
arXiv Detail & Related papers (2024-06-03T07:14:19Z) - PosterLlama: Bridging Design Ability of Langauge Model to Contents-Aware Layout Generation [6.855409699832414]
PosterLlama is a network designed for generating visually and textually coherent layouts.
Our evaluations demonstrate that PosterLlama outperforms existing methods in producing authentic and content-aware layouts.
It supports an unparalleled range of conditions, including but not limited to unconditional layout generation, element conditional layout generation, layout completion, among others, serving as a highly versatile user manipulation tool.
arXiv Detail & Related papers (2024-04-01T08:46:35Z) - WordArt Designer: User-Driven Artistic Typography Synthesis using Large
Language Models [43.68826200853858]
This paper introduces WordArt Designer, a user-driven framework for artistic typography synthesis.
The system incorporates four key modules: the LLM Engine, SemTypo, StyTypo, and TexTypo modules.
Notably, WordArt Designer highlights the fusion of generative AI with artistic typography.
arXiv Detail & Related papers (2023-10-20T12:44:44Z) - TextPainter: Multimodal Text Image Generation with Visual-harmony and
Text-comprehension for Poster Design [50.8682912032406]
This study introduces TextPainter, a novel multimodal approach to generate text images.
TextPainter takes the global-local background image as a hint of style and guides the text image generation with visual harmony.
We construct the PosterT80K dataset, consisting of about 80K posters annotated with sentence-level bounding boxes and text contents.
arXiv Detail & Related papers (2023-08-09T06:59:29Z) - PosterLayout: A New Benchmark and Approach for Content-aware
Visual-Textual Presentation Layout [62.12447593298437]
Content-aware visual-textual presentation layout aims at arranging spatial space on the given canvas for pre-defined elements.
We propose design sequence formation (DSF) that reorganizes elements in layouts to imitate the design processes of human designers.
A novel CNN-LSTM-based conditional generative adversarial network (GAN) is presented to generate proper layouts.
arXiv Detail & Related papers (2023-03-28T12:48:36Z) - GenText: Unsupervised Artistic Text Generation via Decoupled Font and
Texture Manipulation [30.654807125764965]
We propose a novel approach, namely GenText, to achieve general artistic text style transfer.
Specifically, our work incorporates three different stages, stylization, destylization, and font transfer.
Considering the difficult data acquisition of paired artistic text images, our model is designed under the unsupervised setting.
arXiv Detail & Related papers (2022-07-20T04:42:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.