Hybrid CNN Based Attention with Category Prior for User Image Behavior
Modeling
- URL: http://arxiv.org/abs/2205.02711v1
- Date: Thu, 5 May 2022 15:31:47 GMT
- Authors: Xin Chen, Qingtao Tang, Ke Hu, Yue Xu, Shihang Qiu, Jia Cheng, Jun Lei
- Abstract summary: We propose a hybrid CNN-based attention module, unifying the user's image behaviors and the category prior, for CTR prediction.
Our approach achieves significant improvements in both online and offline experiments on a billion-scale real serving dataset.
- Score: 13.984055924772486
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: User historical behaviors have proved useful for Click-Through Rate (CTR)
prediction in online advertising systems. In Meituan, one of the largest
e-commerce platforms in China, an item is typically displayed with its image, and
whether a user clicks the item is usually influenced by that image, which
implies that a user's image behaviors are helpful for understanding the user's
visual preference and improving the accuracy of CTR prediction. Existing user image
behavior models typically use a two-stage architecture, which extracts visual
embeddings of images through off-the-shelf Convolutional Neural Networks (CNNs)
in the first stage, and then jointly trains a CTR model with those visual
embeddings and non-visual features. We find that this two-stage architecture is
sub-optimal for CTR prediction. Meanwhile, precisely labeled categories in
online ad systems contain abundant visual prior information, which can enhance
the modeling of user image behaviors. However, off-the-shelf CNNs without a
category prior may extract category-unrelated features, limiting the CNN's
expressive ability. To address these two issues, we propose a hybrid CNN-based
attention module, unifying the user's image behaviors and the category prior, for CTR
prediction. Our approach achieves significant improvements in both online and
offline experiments on a billion-scale real serving dataset.
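The abstract does not specify the module's internals; as a rough illustration only, the following NumPy sketch shows one generic way a category prior can steer attention over a CNN feature map (a category embedding acts as the query over spatial positions). All names, shapes, and projection matrices here are hypothetical and are not the authors' actual design.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def category_prior_attention(feature_map, category_emb, w_q, w_k):
    """Pool a CNN feature map with attention guided by a category embedding.

    feature_map:  (H*W, d) flattened spatial features of one behavior image
    category_emb: (d,)     embedding of the item's labeled category
    w_q, w_k:     (d, d_a) hypothetical learned projection matrices
    Returns a category-aware pooled visual embedding of shape (d,).
    """
    q = category_emb @ w_q                            # (d_a,) category query
    k = feature_map @ w_k                             # (H*W, d_a) spatial keys
    scores = softmax(k @ q / np.sqrt(k.shape[-1]))    # (H*W,) weights over positions
    return scores @ feature_map                       # weighted sum of spatial features

# Toy usage with random data.
rng = np.random.default_rng(0)
d, d_a, hw = 8, 4, 16
fmap = rng.normal(size=(hw, d))
cat = rng.normal(size=(d,))
wq = rng.normal(size=(d, d_a))
wk = rng.normal(size=(d, d_a))
pooled = category_prior_attention(fmap, cat, wq, wk)
print(pooled.shape)  # (8,)
```

In a joint (single-stage) setup, such a pooled embedding would be fed into the CTR model alongside non-visual features and trained end-to-end, rather than frozen after a separate CNN pre-training stage.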
Related papers
- An evaluation of CNN models and data augmentation techniques in hierarchical localization of mobile robots [0.0]
This work presents an evaluation of CNN models and data augmentation to carry out the hierarchical localization of a mobile robot.
In this sense, an ablation study of different state-of-the-art CNN models used as backbone is presented.
A variety of data augmentation visual effects are proposed for addressing the visual localization of the robot.
arXiv Detail & Related papers (2024-07-15T10:20:00Z) - ContentCTR: Frame-level Live Streaming Click-Through Rate Prediction
with Multimodal Transformer [31.10377461705053]
We propose a ContentCTR model that leverages multimodal transformer for frame-level CTR prediction.
We conduct extensive experiments on both real-world scenarios and public datasets, and our ContentCTR model outperforms traditional recommendation models in capturing real-time content changes.
arXiv Detail & Related papers (2023-06-26T03:04:53Z) - Boost CTR Prediction for New Advertisements via Modeling Visual Content [55.11267821243347]
We exploit the visual content in ads to boost the performance of CTR prediction models.
We learn the embedding for each visual ID based on the historical user-ad interactions accumulated in the past.
After incorporating the visual ID embedding in the CTR prediction model of Baidu online advertising, the average CTR of ads improves by 1.46%, and the total charge increases by 1.10%.
arXiv Detail & Related papers (2022-09-23T17:08:54Z) - Corrupted Image Modeling for Self-Supervised Visual Pre-Training [103.99311611776697]
We introduce Corrupted Image Modeling (CIM) for self-supervised visual pre-training.
CIM uses an auxiliary generator with a small trainable BEiT to corrupt the input image instead of using artificial mask tokens.
After pre-training, the enhancer can be used as a high-capacity visual encoder for downstream tasks.
arXiv Detail & Related papers (2022-02-07T17:59:04Z) - Masked Transformer for Neighbourhood-aware Click-Through Rate Prediction [74.52904110197004]
We propose Neighbor-Interaction based CTR prediction, which puts this task into a Heterogeneous Information Network (HIN) setting.
In order to enhance the representation of the local neighbourhood, we consider four types of topological interaction among the nodes.
We conduct comprehensive experiments on two real world datasets and the experimental results show that our proposed method outperforms state-of-the-art CTR models significantly.
arXiv Detail & Related papers (2022-01-25T12:44:23Z) - Improving Conversational Recommendation System by Pretraining on
Billions Scale of Knowledge Graph [29.093477601914355]
We propose a novel knowledge-enhanced deep cross network (K-DCN) to recommend items.
We first construct a billion-scale conversation knowledge graph (CKG) from information about users, items and conversations.
We then pretrain the CKG by introducing a knowledge graph embedding method and a graph convolutional network to encode its semantic and structural information.
In K-DCN, we fuse the user-state representation, dialogue-interaction representation and other normal feature representations via a deep cross network, which ranks the candidate items to be recommended.
arXiv Detail & Related papers (2021-04-30T10:56:41Z) - The Mind's Eye: Visualizing Class-Agnostic Features of CNNs [92.39082696657874]
We propose an approach to visually interpret CNN features given a set of images by creating corresponding images that depict the most informative features of a specific layer.
Our method uses a dual-objective activation and distance loss, without requiring a generator network nor modifications to the original model.
arXiv Detail & Related papers (2021-01-29T07:46:39Z) - Learning CNN filters from user-drawn image markers for coconut-tree
image classification [78.42152902652215]
We present a method that needs a minimal set of user-selected images to train the CNN's feature extractor.
The method learns the filters of each convolutional layer from user-drawn markers in image regions that discriminate classes.
It does not rely on optimization based on backpropagation, and we demonstrate its advantages on the binary classification of coconut-tree aerial images.
arXiv Detail & Related papers (2020-08-08T15:50:23Z) - Category-Specific CNN for Visual-aware CTR Prediction at JD.com [47.09978876513512]
We propose Category-specific CNN (CSCNN) for Click Through Rate (CTR) prediction.
CSCNN incorporates the category knowledge early, via a lightweight attention module on each convolutional layer.
This enables CSCNN to extract expressive category-specific visual patterns that benefit the CTR prediction.
arXiv Detail & Related papers (2020-06-18T07:52:27Z) - Improving Native Ads CTR Prediction by Large Scale Event Embedding and
Recurrent Networks [2.0902732379491207]
We propose a large-scale event embedding scheme that encodes each user browsing event by training a Siamese network with weak supervision on the user's consecutive events.
The CTR prediction problem is modeled as a supervised recurrent neural network, which naturally models the user history as a sequence of events.
arXiv Detail & Related papers (2018-04-24T16:50:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.