Boost CTR Prediction for New Advertisements via Modeling Visual Content
- URL: http://arxiv.org/abs/2209.11727v1
- Date: Fri, 23 Sep 2022 17:08:54 GMT
- Title: Boost CTR Prediction for New Advertisements via Modeling Visual Content
- Authors: Tan Yu, Zhipeng Jin, Jie Liu, Yi Yang, Hongliang Fei, Ping Li
- Abstract summary: We exploit the visual content in ads to boost the performance of CTR prediction models.
We learn the embedding for each visual ID based on the historical user-ad interactions accumulated in the past.
After incorporating the visual ID embedding in the CTR prediction model of Baidu online advertising, the average CTR of ads improves by 1.46%, and the total charge increases by 1.10%.
- Score: 55.11267821243347
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing advertisement click-through rate (CTR) prediction models
depend mainly on behavior ID features, which are learned from historical
user-ad interactions. However, behavior ID features that rely on historical
user behaviors cannot describe new ads with no previous interactions with
users. To overcome the limitations of behavior ID features in
modeling new ads, we exploit the visual content in ads to boost the performance
of CTR prediction models. Specifically, we map each ad into a set of visual IDs
based on its visual content. These visual IDs are further used for generating
the visual embedding for enhancing CTR prediction models. We formulate the
learning of visual IDs into a supervised quantization problem. Due to a lack of
class labels for commercial images in advertisements, we exploit image textual
descriptions as the supervision to optimize the image extractor for generating
effective visual IDs. Meanwhile, since the hard quantization is
non-differentiable, we soften the quantization operation to make it support the
end-to-end network training. After mapping each image into visual IDs, we learn
the embedding for each visual ID based on the historical user-ad interactions
accumulated in the past. Since the visual ID embedding depends only on the
visual content, it generalizes well to new ads. Meanwhile, the visual ID
embedding complements the ad behavior ID embedding. Thus, it can considerably
boost the performance of the CTR prediction models previously relying on
behavior ID features for both new ads and ads that have accumulated rich user
behaviors. After incorporating the visual ID embedding in the CTR prediction
model of Baidu online advertising, the average CTR of ads improves by 1.46%,
and the total charge increases by 1.10%.
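The abstract's central mechanism can be sketched in a few lines. Below is a minimal, illustrative NumPy sketch of the two ideas it describes: hard quantization of an image feature to a nearest codeword (the "visual ID"), and a temperature-softened softmax assignment that makes quantization differentiable for end-to-end training. The codebook, embedding table, sizes, and function names here are hypothetical stand-ins, not the paper's actual architecture; the real system learns the codebook under supervision from image textual descriptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: K codewords for D-dimensional image features,
# plus a per-visual-ID embedding table of width EMB.
K, D, EMB = 8, 4, 3
codebook = rng.normal(size=(K, D))          # assumed already learned
id_embedding = rng.normal(size=(K, EMB))    # visual-ID embedding table

def soft_quantize(z, tau=0.5):
    """Softened (differentiable) quantization of a feature vector z.

    Hard quantization (argmin over codeword distances) is non-differentiable;
    replacing it with a temperature-scaled softmax over negative distances
    lets gradients reach the codebook during training.
    """
    d2 = ((codebook - z) ** 2).sum(axis=1)    # squared distance to each codeword
    logits = -d2 / tau
    w = np.exp(logits - logits.max())         # numerically stable softmax
    w /= w.sum()                              # soft assignment weights
    return w @ codebook, w                    # soft-quantized vector, weights

def visual_id(z):
    """Hard assignment used at inference: the visual ID is the nearest codeword."""
    d2 = ((codebook - z) ** 2).sum(axis=1)
    return int(d2.argmin())

z = rng.normal(size=D)
zq, w = soft_quantize(z)
vid = visual_id(z)

# The visual-ID embedding would then be fed to the CTR model
# alongside the usual behavior ID embeddings.
visual_emb = id_embedding[vid]

# As tau -> 0, the soft assignment collapses onto the hard one.
zq_cold, w_cold = soft_quantize(z, tau=1e-3)
```

Because the visual ID depends only on image content, a brand-new ad immediately gets a meaningful embedding via `visual_id` and the lookup table, which is the cold-start benefit the abstract claims.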
Related papers
- Enhancing Large Vision Language Models with Self-Training on Image Comprehension [99.9389737339175]
We introduce Self-Training on Image Comprehension (STIC), which emphasizes a self-training approach specifically for image comprehension.
First, the model self-constructs a preference for image descriptions using unlabeled images.
To further self-improve reasoning on the extracted visual information, we let the model reuse a small portion of existing instruction-tuning data.
arXiv Detail & Related papers (2024-05-30T05:53:49Z)
- Attribute-Aware Deep Hashing with Self-Consistency for Large-Scale Fine-Grained Image Retrieval [65.43522019468976]
We propose attribute-aware hashing networks with self-consistency for generating attribute-aware hash codes.
We develop an encoder-decoder network around a reconstruction task to distill high-level attribute-specific vectors in an unsupervised manner.
Our models are equipped with a feature decorrelation constraint upon these attribute vectors to strengthen their representative abilities.
arXiv Detail & Related papers (2023-11-21T08:20:38Z)
- AdSEE: Investigating the Impact of Image Style Editing on Advertisement Attractiveness [25.531489722164178]
We propose Advertisement Style Editing and Attractiveness Enhancement (AdSEE), which explores whether semantic editing to ads images can affect or alter the popularity of online advertisements.
We introduce StyleGAN-based facial semantic editing and inversion to ad images, and train a click-rate predictor that maps GAN-based facial latent representations to click rates.
Online A/B tests performed over a period of 5 days have verified the increased click-through rates of AdSEE-edited samples as compared to a control group of original ads.
arXiv Detail & Related papers (2023-09-15T04:52:49Z)
- Improving Image Recognition by Retrieving from Web-Scale Image-Text Data [68.63453336523318]
We introduce an attention-based memory module, which learns the importance of each retrieved example from the memory.
Compared to existing approaches, our method removes the influence of the irrelevant retrieved examples, and retains those that are beneficial to the input query.
We show that it achieves state-of-the-art accuracy on the ImageNet-LT, Places-LT, and WebVision datasets.
arXiv Detail & Related papers (2023-04-11T12:12:05Z)
- Hybrid CNN Based Attention with Category Prior for User Image Behavior Modeling [13.984055924772486]
We propose a hybrid CNN-based attention module that unifies users' image behaviors with a category prior for CTR prediction.
Our approach achieves significant improvements in both online and offline experiments on a billion-scale real serving dataset.
arXiv Detail & Related papers (2022-05-05T15:31:47Z)
- Learning Graph Meta Embeddings for Cold-Start Ads in Click-Through Rate Prediction [14.709092114902159]
We propose Graph Meta Embedding (GME) models that can rapidly learn how to generate desirable initial embeddings for new ad IDs.
Experimental results on three real-world datasets show that GMEs can significantly improve prediction performance in both the cold-start and warm-up phases.
arXiv Detail & Related papers (2021-05-19T03:46:56Z)
- Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
- Multi-Channel Sequential Behavior Networks for User Modeling in Online Advertising [4.964012641964141]
This paper presents Multi-Channel Sequential Behavior Network (MC-SBN), a deep learning approach for embedding users and ads in a semantic space.
Our proposed user encoder architecture summarizes user activities from multiple input channels, such as previous search queries, visited pages, or clicked ads, into a user vector.
The results demonstrate that MC-SBN can improve the ranking of relevant ads and boost the performance of both click prediction and conversion prediction.
arXiv Detail & Related papers (2020-12-27T06:13:29Z) - Iterative Boosting Deep Neural Networks for Predicting Click-Through
Rate [15.90144113403866]
The click-through rate (CTR) reflects the ratio of clicks on a specific item to its total number of views.
XdBoost is an iterative three-stage neural network model influenced by the traditional machine learning boosting mechanism.
arXiv Detail & Related papers (2020-07-26T09:41:16Z) - Attribute-aware Identity-hard Triplet Loss for Video-based Person
Re-identification [51.110453988705395]
Video-based person re-identification (Re-ID) is an important computer vision task.
We introduce a new metric learning method called Attribute-aware Identity-hard Triplet Loss (AITL).
To achieve a complete model of video-based person Re-ID, a multi-task framework with Attribute-driven Spatio-Temporal Attention (ASTA) mechanism is also proposed.
arXiv Detail & Related papers (2020-06-13T09:15:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.