Planning and Rendering: Towards Product Poster Generation with Diffusion Models
- URL: http://arxiv.org/abs/2312.08822v2
- Date: Tue, 3 Sep 2024 07:42:44 GMT
- Title: Planning and Rendering: Towards Product Poster Generation with Diffusion Models
- Authors: Zhaochen Li, Fengheng Li, Wei Feng, Honghe Zhu, Yaoyu Li, Zheng Zhang, Jingjing Lv, Junjie Shen, Zhangang Lin, Jingping Shao, Zhenglu Yang
- Abstract summary: We propose a novel product poster generation framework based on diffusion models named P&R.
At the planning stage, we propose a PlanNet to generate the layout of the product and other visual components.
At the rendering stage, we propose a RenderNet to generate the background for the product while considering the generated layout.
Our method outperforms the state-of-the-art product poster generation methods on PPG30k.
- Score: 21.45855580640437
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Product poster generation significantly optimizes design efficiency and reduces production costs. Prevailing methods predominantly rely on image inpainting to generate clean background images for given products. Subsequently, poster layout generation methods are employed to produce corresponding layout results. However, the background images may not be suitable for accommodating textual content due to their complexity, and the fixed location of products limits the diversity of layout results. To alleviate these issues, we propose a novel product poster generation framework based on diffusion models named P&R. P&R draws inspiration from the workflow of designers in creating posters, which consists of two stages: Planning and Rendering. At the planning stage, we propose a PlanNet to generate the layout of the product and other visual components considering both the appearance features of the product and semantic features of the text, which improves the diversity and rationality of the layouts. At the rendering stage, we propose a RenderNet to generate the background for the product while considering the generated layout, where a spatial fusion module is introduced to fuse the layouts of different visual components. To foster the advancement of this field, we propose the first product poster generation dataset, PPG30k, comprising 30k exquisite product poster images along with comprehensive image and text annotations. Our method outperforms the state-of-the-art product poster generation methods on PPG30k. PPG30k will be released soon.
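A minimal sketch of the two-stage workflow the abstract describes, under assumed interfaces; `PlanNet`, `spatial_fusion`, and all shapes below are illustrative stand-ins, not the authors' released code:

```python
# Hypothetical sketch of P&R's plan-then-render workflow (all names and
# tensor shapes are assumptions for illustration).
import torch
import torch.nn as nn

class PlanNet(nn.Module):
    """Predicts a bounding box (x, y, w, h) for the product and each other
    visual component from product appearance and text features."""
    def __init__(self, feat_dim=512, num_elements=8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=8,
                                           batch_first=True)
        self.fuse = nn.TransformerEncoder(layer, num_layers=4)
        self.queries = nn.Parameter(torch.randn(num_elements, feat_dim))
        self.to_boxes = nn.Linear(feat_dim, 4)  # one box per element query

    def forward(self, product_feats, text_feats):
        # product_feats: (B, Np, D) appearance tokens; text_feats: (B, Nt, D)
        B = product_feats.size(0)
        queries = self.queries.unsqueeze(0).expand(B, -1, -1)
        fused = self.fuse(torch.cat([queries, product_feats, text_feats], dim=1))
        return self.to_boxes(fused[:, : self.queries.size(0)]).sigmoid()

def spatial_fusion(layout_boxes, size=64):
    """Rasterize each planned box into a mask channel, one plausible form of
    the spatial fusion module that conditions RenderNet on the layout."""
    B, E, _ = layout_boxes.shape
    masks = torch.zeros(B, E, size, size)
    for b in range(B):
        for e in range(E):
            x, y, w, h = (layout_boxes[b, e] * size).long().tolist()
            masks[b, e, y : y + max(h, 1), x : x + max(w, 1)] = 1.0
    return masks  # concatenated into the diffusion denoiser's input

plan_net = PlanNet()
boxes = plan_net(torch.randn(2, 16, 512), torch.randn(2, 10, 512))
print(boxes.shape, spatial_fusion(boxes).shape)
# torch.Size([2, 8, 4]) torch.Size([2, 8, 64, 64])
```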
Related papers
- TextLap: Customizing Language Models for Text-to-Layout Planning [65.02105936609021]
We call our method TextLap (text-based layout planning).
It uses a curated instruction-based layout planning dataset (InsLap) to customize Large language models (LLMs) as a graphic designer.
We demonstrate the effectiveness of TextLap and show that it outperforms strong baselines, including GPT-4 based methods, for image generation and graphical design benchmarks.
arXiv Detail & Related papers (2024-10-09T19:51:38Z)
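For concreteness, an InsLap-style instruction/response pair might look like the sketch below; the schema is an assumption for illustration, not TextLap's actual dataset format:

```python
# Hypothetical instruction-tuning example in the spirit of InsLap
# (field names and the x, y, w, h coordinate convention are assumptions).
example = {
    "instruction": "Design a 512x512 poster layout with a title, "
                   "a product image, and a call-to-action button.",
    "response": [
        {"element": "title",   "bbox": [64, 32, 384, 72]},    # x, y, w, h
        {"element": "product", "bbox": [128, 128, 256, 256]},
        {"element": "button",  "bbox": [176, 416, 160, 56]},
    ],
}
```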
- GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models [7.5791485306093245]
We propose an automatic poster generation framework with text rendering capabilities leveraging LLMs.
This framework aims to create precise poster text within a detailed contextual background.
We introduce a high-resolution font dataset and a poster dataset with resolutions exceeding 1024 pixels.
arXiv Detail & Related papers (2024-07-02T13:17:49Z)
- MaPa: Text-driven Photorealistic Material Painting for 3D Shapes [80.66880375862628]
This paper aims to generate materials for 3D meshes from text descriptions.
Unlike existing methods that synthesize texture maps, we propose to generate segment-wise procedural material graphs.
Our framework supports high-quality rendering and provides substantial flexibility in editing.
arXiv Detail & Related papers (2024-04-26T17:54:38Z)
- Desigen: A Pipeline for Controllable Design Template Generation [69.51563467689795]
Desigen is an automatic template creation pipeline which generates background images as well as layout elements over the background.
We propose two techniques to constrain the saliency distribution and reduce the attention weight in desired regions during the background generation process.
Experiments demonstrate that the proposed pipeline generates high-quality templates comparable to human designers.
arXiv Detail & Related papers (2024-03-14T04:32:28Z)
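Desigen's exact attention-constraint mechanism is not spelled out in this summary; one plausible form, sketched below, subtracts a penalty from pre-softmax attention scores toward spatial positions reserved for layout elements, so less salient content is synthesized there. The function name and penalty form are assumptions:

```python
# Assumed sketch of "reduce the attention weight in desired regions".
import torch

def suppress_region(scores, region_mask, penalty=4.0):
    """scores: (B, heads, Q, K) pre-softmax self-attention over spatial
    positions; region_mask: (B, K) is 1 where elements will be placed.
    Lowering scores toward reserved keys shrinks their post-softmax
    attention weight, steering salient content away from those regions."""
    return scores - penalty * region_mask[:, None, None, :]

scores = torch.randn(1, 8, 256, 256)   # attention over a 16x16 latent grid
mask = torch.zeros(1, 256)
mask[:, :64] = 1.0                     # reserve the top quarter of the canvas
attn = suppress_region(scores, mask).softmax(dim=-1)
print(attn.shape)                      # torch.Size([1, 8, 256, 256])
```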
- AutoPoster: A Highly Automatic and Content-aware Design System for Advertising Poster Generation [14.20790443380675]
This paper introduces AutoPoster, a highly automatic and content-aware system for generating advertising posters.
With only product images and titles as inputs, AutoPoster can automatically produce posters of varying sizes through four key stages.
We propose the first poster generation dataset that includes visual attribute annotations for over 76k posters.
arXiv Detail & Related papers (2023-08-02T11:58:43Z)
- ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models [77.03361270726944]
Current personalization methods can invert an object or concept into the textual conditioning space and compose new natural sentences for text-to-image diffusion models.
We propose a novel approach that leverages the step-by-step generation process of diffusion models, which generate images from low to high frequency information.
We apply ProSpect in various personalized attribute-aware image generation applications, such as image-guided or text-driven manipulations of materials, style, and layout.
arXiv Detail & Related papers (2023-05-25T16:32:01Z)
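ProSpect's low-to-high-frequency observation suggests a simple mechanism, sketched here under assumed interfaces: use a different prompt embedding per denoising stage, since early steps shape low-frequency structure and late steps add high-frequency detail. The stage boundaries and names are illustrative:

```python
# Assumed sketch of stage-wise prompting in the spirit of ProSpect.
import torch

def prompt_for_step(t, T, layout_emb, content_emb, material_emb):
    """Pick a stage-specific prompt embedding for timestep t, where sampling
    counts t down from T to 0."""
    if t > 2 * T // 3:       # earliest steps: low-frequency layout
        return layout_emb
    elif t > T // 3:         # middle steps: content and style
        return content_emb
    return material_emb      # final steps: high-frequency material detail

T = 50
embs = [torch.randn(77, 768) for _ in range(3)]  # CLIP-like text embeddings
for t in (49, 25, 5):
    emb = prompt_for_step(t, T, *embs)
    print(t, [emb is e for e in embs])
```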
- LayoutGPT: Compositional Visual Planning and Generation with Large Language Models [98.81962282674151]
Large Language Models (LLMs) can serve as visual planners by generating layouts from text conditions.
We propose LayoutGPT, a method to compose in-context visual demonstrations in style sheet language.
arXiv Detail & Related papers (2023-05-24T17:56:16Z)
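A hedged sketch of what an in-context visual demonstration "in style sheet language" could look like; the exact CSS-like schema LayoutGPT uses is not reproduced here, so the property names are assumptions:

```python
# Assumed sketch: serialize a layout as CSS-like rules for an LLM prompt.
def layout_to_css(elements):
    rules = []
    for name, (x, y, w, h) in elements.items():
        rules.append(f"{name} {{ left: {x}px; top: {y}px; "
                     f"width: {w}px; height: {h}px; }}")
    return "\n".join(rules)

demo = {"car": (20, 120, 150, 80), "tree": (180, 40, 60, 160)}
print(layout_to_css(demo))
# car { left: 20px; top: 120px; width: 150px; height: 80px; }
# tree { left: 180px; top: 40px; width: 60px; height: 160px; }
```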
- PosterLayout: A New Benchmark and Approach for Content-aware Visual-Textual Presentation Layout [62.12447593298437]
Content-aware visual-textual presentation layout aims at arranging pre-defined elements within the spatial extent of a given canvas.
We propose design sequence formation (DSF) that reorganizes elements in layouts to imitate the design processes of human designers.
A novel CNN-LSTM-based conditional generative adversarial network (GAN) is presented to generate proper layouts.
arXiv Detail & Related papers (2023-03-28T12:48:36Z)
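A minimal sketch of a CNN-LSTM conditional layout generator in the spirit of the entry above; the widths, element count, and heads are assumptions, and the GAN discriminator and DSF step are omitted:

```python
# Assumed sketch: CNN encodes the canvas, LSTM emits one element per step.
import torch
import torch.nn as nn

class CnnLstmLayoutGen(nn.Module):
    def __init__(self, hidden=256, num_classes=4, max_elems=8):
        super().__init__()
        self.max_elems = max_elems
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, hidden))
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes + 4)  # class logits + box

    def forward(self, image):
        ctx = self.encoder(image)                           # (B, hidden)
        steps = ctx.unsqueeze(1).repeat(1, self.max_elems, 1)
        out, _ = self.lstm(steps)
        pred = self.head(out)                               # (B, E, C + 4)
        return pred[..., :-4], pred[..., -4:].sigmoid()     # logits, boxes

gen = CnnLstmLayoutGen()
logits, boxes = gen(torch.randn(2, 3, 128, 128))
print(logits.shape, boxes.shape)  # torch.Size([2, 8, 4]) torch.Size([2, 8, 4])
```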
- Unsupervised Domain Adaption with Pixel-level Discriminator for Image-aware Layout Generation [24.625282719753915]
This paper focuses on using the GAN-based model conditioned on image contents to generate advertising poster graphic layouts.
It combines unsupervised domain adaptation techniques with a GAN equipped with a novel pixel-level discriminator (PD), called PDA-GAN, to generate graphic layouts according to image contents.
Both quantitative and qualitative evaluations demonstrate that PDA-GAN achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-03-25T06:50:22Z)
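A pixel-level discriminator can be sketched as a fully convolutional network that emits one real/fake logit per pixel; this 1x1-convolution form is a common construction and an assumption about PDA-GAN's PD, not its published architecture:

```python
# Assumed sketch of a pixel-level discriminator: dense per-pixel logits.
import torch
import torch.nn as nn

class PixelDiscriminator(nn.Module):
    def __init__(self, in_ch=3, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, width, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(width, width * 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(width * 2, 1, 1))  # one real/fake logit per pixel

    def forward(self, x):
        return self.net(x)               # (B, 1, H, W) dense logits

disc = PixelDiscriminator()
print(disc(torch.randn(2, 3, 256, 256)).shape)  # torch.Size([2, 1, 256, 256])
```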
- Composition-aware Graphic Layout GAN for Visual-textual Presentation Designs [24.29890251913182]
We study the graphic layout generation problem of producing high-quality visual-textual presentation designs for given images.
We propose a deep generative model, dubbed composition-aware graphic layout GAN (CGL-GAN), to synthesize layouts based on the global and spatial visual contents of input images.
arXiv Detail & Related papers (2022-04-30T16:42:13Z)