DreamPoster: A Unified Framework for Image-Conditioned Generative Poster Design
- URL: http://arxiv.org/abs/2507.04218v1
- Date: Sun, 06 Jul 2025 03:06:45 GMT
- Title: DreamPoster: A Unified Framework for Image-Conditioned Generative Poster Design
- Authors: Xiwei Hu, Haokun Chen, Zhongqi Qi, Hui Zhang, Dexiang Hong, Jie Shao, Xinglong Wu
- Abstract summary: We present DreamPoster, a Text-to-Image generation framework that intelligently synthesizes high-quality posters from user-provided images and text prompts. For dataset construction, we propose a systematic data annotation pipeline that precisely annotates textual content and typographic hierarchy information. We implement a progressive training strategy that enables the model to hierarchically acquire multi-task generation capabilities while maintaining high-quality generation.
- Score: 8.913908898296626
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We present DreamPoster, a Text-to-Image generation framework that intelligently synthesizes high-quality posters from user-provided images and text prompts while maintaining content fidelity and supporting flexible resolution and layout outputs. Specifically, DreamPoster is built upon our T2I model, Seedream 3.0, to uniformly process different poster generation types. For dataset construction, we propose a systematic data annotation pipeline that precisely annotates textual content and typographic hierarchy information within poster images, while employing comprehensive methodologies to construct paired datasets comprising source materials (e.g., raw graphics/text) and their corresponding final poster outputs. Additionally, we implement a progressive training strategy that enables the model to hierarchically acquire multi-task generation capabilities while maintaining high-quality generation. Evaluations on our testing benchmarks demonstrate DreamPoster's superiority over existing methods, achieving a high usability rate of 88.55%, compared to GPT-4o (47.56%) and SeedEdit 3.0 (25.96%). DreamPoster will be available online in Jimeng and other ByteDance apps.
Related papers
- PosterCraft: Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework [26.60241017305203]
PosterCraft is a unified framework that abandons prior modular pipelines and rigid, predefined layouts. It employs a carefully designed, cascaded workflow to optimize the generation of high-aesthetic posters. PosterCraft significantly outperforms open-source baselines in rendering accuracy, layout coherence, and overall visual appeal.
arXiv Detail & Related papers (2025-06-12T14:28:12Z)
- Seedream 3.0 Technical Report [62.85849652170507]
Seedream 3.0 is a high-performance Chinese-English bilingual image generation foundation model. We develop several technical improvements to address existing challenges in Seedream 2.0. Seedream 3.0 provides native high-resolution output (up to 2K), allowing it to generate images with high visual quality.
arXiv Detail & Related papers (2025-04-15T16:19:07Z)
- PosterMaker: Towards High-Quality Product Poster Generation with Accurate Text Rendering [50.76106125697899]
Product posters, which integrate subject, scene, and text, are crucial promotional tools for attracting customers. The main challenge lies in accurately rendering text, especially for complex writing systems like Chinese, which contains over 10,000 individual characters. We develop TextRenderNet, which achieves a text rendering accuracy of over 90%. Based on TextRenderNet and SceneGenNet, we present PosterMaker, an end-to-end generation framework.
arXiv Detail & Related papers (2025-04-09T07:13:08Z)
- GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models [7.152732507491591]
We propose an automatic poster generation framework with text rendering capabilities, leveraging LLMs. This framework aims to create precise poster text within a detailed contextual background. We introduce a high-resolution font dataset and a poster dataset with resolutions exceeding 1024 pixels.
arXiv Detail & Related papers (2024-07-02T13:17:49Z)
- Planning and Rendering: Towards Product Poster Generation with Diffusion Models [21.45855580640437]
We propose a novel product poster generation framework based on diffusion models named P&R.
At the planning stage, we propose a PlanNet to generate the layout of the product and other visual components.
At the rendering stage, we propose a RenderNet to generate the background for the product while considering the generated layout.
Our method outperforms the state-of-the-art product poster generation methods on PPG30k.
arXiv Detail & Related papers (2023-12-14T11:11:50Z)
- Paragraph-to-Image Generation with Information-Enriched Diffusion Model [62.81033771780328]
ParaDiffusion is an information-enriched diffusion model for the paragraph-to-image generation task. It delves into transferring the extensive semantic comprehension capabilities of large language models to the task of image generation. The code and dataset will be released to foster community research on long-text alignment.
arXiv Detail & Related papers (2023-11-24T05:17:01Z)
- Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion [50.59261592343479]
We present Kandinsky, a novel exploration of latent diffusion architecture.
The proposed model is trained separately to map text embeddings to image embeddings of CLIP.
We also deployed a user-friendly demo system that supports diverse generative modes such as text-to-image generation, image fusion, text and image fusion, image variations generation, and text-guided inpainting/outpainting.
arXiv Detail & Related papers (2023-10-05T12:29:41Z)
- TextPainter: Multimodal Text Image Generation with Visual-harmony and Text-comprehension for Poster Design [50.8682912032406]
This study introduces TextPainter, a novel multimodal approach to generate text images.
TextPainter takes the global-local background image as a hint of style and guides the text image generation with visual harmony.
We construct the PosterT80K dataset, consisting of about 80K posters annotated with sentence-level bounding boxes and text contents.
arXiv Detail & Related papers (2023-08-09T06:59:29Z)
- Text2Poster: Laying out Stylized Texts on Retrieved Images [32.466518932018175]
Poster generation is a significant task for a wide range of applications, but it is often time-consuming and requires extensive manual editing and artistic experience. We propose a novel data-driven framework, called Text2Poster, to automatically generate visually effective posters from textual information.
arXiv Detail & Related papers (2023-01-06T04:06:23Z)
- Towards Open-World Text-Guided Face Image Generation and Manipulation [52.83401421019309]
We propose a unified framework for both face image generation and manipulation.
Our method supports open-world scenarios, including both image and text, without any re-training, fine-tuning, or post-processing.
arXiv Detail & Related papers (2021-04-18T16:56:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.