PAID: A Framework of Product-Centric Advertising Image Design
- URL: http://arxiv.org/abs/2501.14316v2
- Date: Wed, 12 Feb 2025 06:48:03 GMT
- Title: PAID: A Framework of Product-Centric Advertising Image Design
- Authors: Hongyu Chen, Min Zhou, Jing Jiang, Jiale Chen, Yang Lu, Bo Xiao, Tiezheng Ge, Bo Zheng
- Abstract summary: We propose a novel framework called Product-Centric Advertising Image Design (PAID). It consists of four sequential stages to highlight product foregrounds and taglines while achieving overall image aesthetics. To support the PAID framework, we create corresponding datasets with over 50,000 labeled images.
- Score: 31.08944590096747
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Creating visually appealing advertising images is often a labor-intensive and time-consuming process. Is it possible to automatically generate such images using only basic product information: a product foreground image, taglines, and a target size? Existing methods mainly focus on parts of the problem and fail to provide a comprehensive solution. To address this gap, we propose a novel multistage framework called Product-Centric Advertising Image Design (PAID). It consists of four sequential stages to highlight product foregrounds and taglines while achieving overall image aesthetics: prompt generation, layout generation, background image generation, and graphics rendering. Different expert models are designed and trained for the first three stages: First, we use a visual language model (VLM) to generate background prompts that match the products. Next, a VLM-based layout generation model arranges the placement of product foregrounds, graphic elements (taglines and decorative underlays), and various non-graphic elements (objects from the background prompt). Following this, we train an SDXL-based image generation model that can simultaneously accept prompts, layouts, and foreground controls. To support the PAID framework, we create corresponding datasets with over 50,000 labeled images. Extensive experimental results and online A/B tests demonstrate that PAID can produce more visually appealing advertising images.
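The abstract describes a four-stage pipeline in which each stage consumes the outputs of the previous one. The sketch below is a minimal, hypothetical orchestration skeleton of that flow, not the authors' code: all type and function names (BoundingBox, Layout, paid, and the per-stage functions) are assumptions, and the three expert models plus the renderer are replaced by stubs that only illustrate the data passed between stages.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class BoundingBox:
    # Element placement in pixels: left, top, width, height.
    x: int
    y: int
    w: int
    h: int

@dataclass
class Layout:
    # Placement of the product foreground, graphic elements (taglines,
    # decorative underlays), and non-graphic elements from the prompt.
    product: BoundingBox
    graphic: List[Tuple[str, BoundingBox]] = field(default_factory=list)
    non_graphic: List[Tuple[str, BoundingBox]] = field(default_factory=list)

def generate_prompt(foreground_path: str) -> str:
    """Stage 1: a VLM proposes a background prompt matching the product (stubbed)."""
    return "a bottle of lotion on a marble table, soft morning light"

def generate_layout(prompt: str, taglines: List[str],
                    size: Tuple[int, int]) -> Layout:
    """Stage 2: a VLM-based model arranges all elements (stubbed with fixed boxes)."""
    w, h = size
    return Layout(
        product=BoundingBox(w // 4, h // 3, w // 2, h // 2),
        graphic=[(t, BoundingBox(w // 8, h // 12, 3 * w // 4, h // 10))
                 for t in taglines],
        non_graphic=[("marble table", BoundingBox(0, 2 * h // 3, w, h // 3))],
    )

def generate_background(prompt: str, layout: Layout,
                        foreground_path: str, size: Tuple[int, int]):
    """Stage 3: an SDXL-based generator conditioned simultaneously on the
    prompt, the layout, and the product foreground (stub returns a placeholder)."""
    return object()

def render_graphics(image, layout: Layout):
    """Stage 4: render taglines and underlays onto the image (stub: no-op)."""
    return image

def paid(foreground_path: str, taglines: List[str], size: Tuple[int, int]):
    """Run the four sequential PAID stages end to end."""
    prompt = generate_prompt(foreground_path)
    layout = generate_layout(prompt, taglines, size)
    image = generate_background(prompt, layout, foreground_path, size)
    return render_graphics(image, layout)

ad_image = paid("lotion.png", ["Hydrate all day"], (768, 1024))
```

The point this structure captures is the design choice the abstract emphasizes: unlike pipelines that condition the background model on the prompt alone, stage 3 here receives the prompt, the layout, and the foreground controls at once, so the generated background respects the planned element placement.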
Related papers
- CTR-Driven Advertising Image Generation with Multimodal Large Language Models [53.40005544344148]
We explore the use of Multimodal Large Language Models (MLLMs) for generating advertising images by optimizing for Click-Through Rate (CTR) as the primary objective.
To further improve the CTR of generated images, we propose a novel reward model to fine-tune pre-trained MLLMs through Reinforcement Learning (RL).
Our method achieves state-of-the-art performance in both online and offline metrics.
arXiv Detail & Related papers (2025-02-05T09:06:02Z)
- Bringing Characters to New Stories: Training-Free Theme-Specific Image Generation via Dynamic Visual Prompting [71.29100512700064]
We present T-Prompter, a training-free method for theme-specific image generation.
T-Prompter integrates reference images into generative models, allowing users to seamlessly specify the target theme.
Our approach enables consistent story generation, character design, realistic character generation, and style-guided image generation.
arXiv Detail & Related papers (2025-01-26T19:01:19Z)
- Desigen: A Pipeline for Controllable Design Template Generation [69.51563467689795]
Desigen is an automatic template creation pipeline which generates background images as well as layout elements over the background.
We propose two techniques to constrain the saliency distribution and reduce the attention weight in desired regions during the background generation process.
Experiments demonstrate that the proposed pipeline generates high-quality templates comparable to human designers.
arXiv Detail & Related papers (2024-03-14T04:32:28Z)
- Chaining text-to-image and large language model: A novel approach for generating personalized e-commerce banners [8.508453886143677]
We demonstrate the use of text-to-image models for generating personalized web banners for online shoppers.
The novelty in this approach lies in converting users' interaction data to meaningful prompts without human intervention.
Our results show that the proposed approach can create high-quality personalized banners for users.
arXiv Detail & Related papers (2024-02-28T07:56:04Z)
- Planning and Rendering: Towards Product Poster Generation with Diffusion Models [21.45855580640437]
We propose a novel product poster generation framework based on diffusion models named P&R.
At the planning stage, we propose a PlanNet to generate the layout of the product and other visual components.
At the rendering stage, we propose a RenderNet to generate the background for the product while considering the generated layout.
Our method outperforms the state-of-the-art product poster generation methods on PPG30k.
arXiv Detail & Related papers (2023-12-14T11:11:50Z)
- Staging E-Commerce Products for Online Advertising using Retrieval Assisted Image Generation [11.03803158931361]
We propose a generative adversarial network (GAN) based approach to generate staged backgrounds for un-staged product images.
We show how our staging approach can enable animations of moving products leading to a video ad from a product image.
arXiv Detail & Related papers (2023-07-28T06:04:46Z)
- Generating Images with Multimodal Language Models [78.6660334861137]
We propose a method to fuse frozen text-only large language models with pre-trained image encoder and decoder models.
Our model demonstrates a wide suite of multimodal capabilities: image retrieval, novel image generation, and multimodal dialogue.
arXiv Detail & Related papers (2023-05-26T19:22:03Z)
- ProSpect: Prompt Spectrum for Attribute-Aware Personalization of Diffusion Models [77.03361270726944]
Current personalization methods can invert an object or concept into the textual conditioning space and compose new natural sentences for text-to-image diffusion models.
We propose ProSpect, a novel approach that leverages the step-by-step generation process of diffusion models, which generate images from low- to high-frequency information.
We apply ProSpect in various personalized attribute-aware image generation applications, such as image-guided or text-driven manipulations of materials, style, and layout.
arXiv Detail & Related papers (2023-05-25T16:32:01Z)
- Unsupervised Domain Adaption with Pixel-level Discriminator for Image-aware Layout Generation [24.625282719753915]
This paper focuses on using the GAN-based model conditioned on image contents to generate advertising poster graphic layouts.
It combines unsupervised domain adaptation techniques with a novel pixel-level discriminator (PD) to design a GAN, called PDA-GAN, that generates graphic layouts according to image contents.
Both quantitative and qualitative evaluations demonstrate that PDA-GAN achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-03-25T06:50:22Z)
- LayoutDETR: Detection Transformer Is a Good Multimodal Layout Designer [80.61492265221817]
Graphic layout designs play an essential role in visual communication.
Yet handcrafting layout designs is skill-demanding, time-consuming, and non-scalable to batch production.
Generative models have emerged to make design automation scalable, but it remains non-trivial to produce designs that comply with designers' desires.
arXiv Detail & Related papers (2022-12-19T21:57:35Z)
- Automatic Generation of Product-Image Sequence in E-commerce [46.06263129000091]
The Multi-modality Unified Image-sequence Classifier (MUIsC) is able to simultaneously detect all categories of rule violations through learning.
As of December 2021, our AGPIS framework has generated high-standard images for about 1.5 million products and achieves a 13.6% rejection rate.
arXiv Detail & Related papers (2022-06-26T23:38:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.