PosterMaker: Towards High-Quality Product Poster Generation with Accurate Text Rendering
- URL: http://arxiv.org/abs/2504.06632v1
- Date: Wed, 09 Apr 2025 07:13:08 GMT
- Title: PosterMaker: Towards High-Quality Product Poster Generation with Accurate Text Rendering
- Authors: Yifan Gao, Zihang Lin, Chuanbin Liu, Min Zhou, Tiezheng Ge, Bo Zheng, Hongtao Xie
- Abstract summary: Product posters, which integrate subject, scene, and text, are crucial promotional tools for attracting customers. The main challenge lies in accurately rendering text, especially for complex writing systems like Chinese, which contains over 10,000 individual characters. We develop TextRenderNet, which achieves a high text rendering accuracy of over 90%. Based on TextRenderNet and SceneGenNet, we present PosterMaker, an end-to-end generation framework.
- Score: 50.76106125697899
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Product posters, which integrate subject, scene, and text, are crucial promotional tools for attracting customers. Creating such posters with modern image generation methods is valuable, but the main challenge lies in accurately rendering text, especially for complex writing systems like Chinese, which contains over 10,000 individual characters. In this work, we identify the key to precise text rendering as constructing a character-discriminative visual feature as a control signal. Based on this insight, we propose a robust character-wise representation as the control signal and develop TextRenderNet, which achieves a high text rendering accuracy of over 90%. Another challenge in poster generation is maintaining the fidelity of user-specified products. We address this by introducing SceneGenNet, an inpainting-based model, and propose subject fidelity feedback learning to further enhance fidelity. Based on TextRenderNet and SceneGenNet, we present PosterMaker, an end-to-end generation framework. To optimize PosterMaker efficiently, we implement a two-stage training strategy that decouples text rendering from background generation learning. Experimental results show that PosterMaker outperforms existing baselines by a remarkable margin, which demonstrates its effectiveness.
Related papers
- RepText: Rendering Visual Text via Replicating [15.476598851383919]
We present RepText, which aims to empower pre-trained monolingual text-to-image generation models with the ability to accurately render visual text in user-specified fonts.
Specifically, we adopt the setting from ControlNet and additionally integrate language-agnostic glyph and position information for the rendered text to enable generating harmonized visual text.
Our approach outperforms existing open-source methods and achieves comparable results to native multi-language closed-source models.
arXiv Detail & Related papers (2025-04-28T12:19:53Z) - POSTA: A Go-to Framework for Customized Artistic Poster Generation [87.16343612086959]
POSTA is a modular framework for customized artistic poster generation.
Background Diffusion creates a themed background based on user input.
Design MLLM then generates layout and typography elements that align with and complement the background style.
ArtText Diffusion applies additional stylization to key text elements.
arXiv Detail & Related papers (2025-03-19T05:22:38Z) - DesignDiffusion: High-Quality Text-to-Design Image Generation with Diffusion Models [115.62816053600085]
We present DesignDiffusion, a framework for synthesizing design images from textual descriptions. The proposed framework directly synthesizes textual and visual design elements from user prompts. It utilizes a distinctive character embedding derived from the visual text to enhance the input prompt.
arXiv Detail & Related papers (2025-03-03T15:22:57Z) - Bringing Characters to New Stories: Training-Free Theme-Specific Image Generation via Dynamic Visual Prompting [71.29100512700064]
We present T-Prompter, a training-free method for theme-specific image generation.
T-Prompter integrates reference images into generative models, allowing users to seamlessly specify the target theme.
Our approach enables consistent story generation, character design, realistic character generation, and style-guided image generation.
arXiv Detail & Related papers (2025-01-26T19:01:19Z) - GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models [7.152732507491591]
We propose an automatic poster generation framework with text rendering capabilities, leveraging LLMs. This framework aims to create precise poster text within a detailed contextual background. We introduce a high-resolution font dataset and a poster dataset with resolutions exceeding 1024 pixels.
arXiv Detail & Related papers (2024-07-02T13:17:49Z) - CustomText: Customized Textual Image Generation using Diffusion Models [13.239661107392324]
Textual image generation spans diverse fields like advertising, education, product packaging, social media, information visualization, and branding.
Despite recent strides in language-guided image synthesis using diffusion models, current models excel in image generation but struggle with accurate text rendering and offer limited control over font attributes.
In this paper, we aim to enhance the synthesis of high-quality images with precise text customization, thereby contributing to the advancement of image generation models.
arXiv Detail & Related papers (2024-05-21T06:43:03Z) - TextPainter: Multimodal Text Image Generation with Visual-harmony and
Text-comprehension for Poster Design [50.8682912032406]
This study introduces TextPainter, a novel multimodal approach to generate text images.
TextPainter takes the global-local background image as a hint of style and guides the text image generation with visual harmony.
We construct the PosterT80K dataset, consisting of about 80K posters annotated with sentence-level bounding boxes and text content.
arXiv Detail & Related papers (2023-08-09T06:59:29Z) - TextDiffuser: Diffusion Models as Text Painters [118.30923824681642]
We introduce TextDiffuser, focusing on generating images with visually appealing text that is coherent with backgrounds.
We contribute the first large-scale text images dataset with OCR annotations, MARIO-10M, containing 10 million image-text pairs.
We show that TextDiffuser is flexible and controllable to create high-quality text images using text prompts alone or together with text template images, and conduct text inpainting to reconstruct incomplete images with text.
arXiv Detail & Related papers (2023-05-18T10:16:19Z) - Text2Poster: Laying out Stylized Texts on Retrieved Images [32.466518932018175]
Poster generation is a significant task for a wide range of applications, which is often time-consuming and requires lots of manual editing and artistic experience.
We propose a novel data-driven framework, called Text2Poster, to automatically generate visually effective posters from textual information.
arXiv Detail & Related papers (2023-01-06T04:06:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.