DCAP: Deep Cross Attentional Product Network for User Response
Prediction
- URL: http://arxiv.org/abs/2105.08649v1
- Date: Tue, 18 May 2021 16:27:20 GMT
- Title: DCAP: Deep Cross Attentional Product Network for User Response
Prediction
- Authors: Zekai Chen, Fangtian Zhong, Zhumin Chen, Xiao Zhang, Robert Pless,
Xiuzhen Cheng
- Abstract summary: We propose a novel architecture Deep Cross Attentional Product Network (DCAP)
DCAP keeps cross network's benefits in modeling high-order feature interactions explicitly at the vector-wise level.
Our proposed model can be easily implemented and trained in parallel.
- Score: 20.17934000984361
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: User response prediction, which aims to predict the probability that a user
will provide a predefined positive response in a given context such as clicking
on an ad or purchasing an item, is crucial to many industrial applications such
as online advertising, recommender systems, and search ranking. However, due to
the high dimensionality and super sparsity of the data collected in these
tasks, handcrafting cross features is inevitably time-consuming. Prior studies
in predicting user response leveraged the feature interactions by enhancing
feature vectors with products of features to model second-order or high-order
cross features, either explicitly or implicitly. Nevertheless, these existing
methods can be hindered by not learning sufficient cross features due to model
architecture limitations or modeling all high-order feature interactions with
equal weights. This work aims to fill this gap by proposing a novel
architecture Deep Cross Attentional Product Network (DCAP), which keeps cross
network's benefits in modeling high-order feature interactions explicitly at
the vector-wise level. Beyond that, it can differentiate the importance of
different cross features in each network layer inspired by the multi-head
attention mechanism and Product Neural Network (PNN), allowing practitioners to
perform a more in-depth analysis of user behaviors. Additionally, our proposed
model can be easily implemented and trained in parallel. We conduct comprehensive
experiments on three real-world datasets. The results have robustly
demonstrated that our proposed model DCAP achieves superior prediction
performance compared with the state-of-the-art models.
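The abstract describes DCAP as combining vector-wise cross products with multi-head attention so that different cross features can receive different importance weights. The following is a minimal NumPy sketch of that general idea only, not the authors' implementation: a single-head attention score over field embeddings weights the vector-wise products of fields. All function names, shapes, and projection matrices here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_product_layer(E, Wq, Wk):
    """Illustrative layer: attention over field embeddings weights
    vector-wise field products (single head for brevity).

    E:      (num_fields, d)  field embedding matrix
    Wq, Wk: (d, d)           query/key projections
    Returns (num_fields, d)  attention-weighted cross products.
    """
    Q, K = E @ Wq, E @ Wk
    # (F, F) matrix: learned importance of each field pair
    attn = softmax(Q @ K.T / np.sqrt(E.shape[1]))
    # Hadamard (vector-wise) product of each field with its
    # attention-weighted mixture of all fields
    return (attn @ E) * E

rng = np.random.default_rng(0)
F, d = 4, 8
E = rng.normal(size=(F, d))
out = cross_attention_product_layer(
    E, rng.normal(size=(d, d)), rng.normal(size=(d, d)))
print(out.shape)  # (4, 8)
```

Stacking such layers would yield progressively higher-order interactions, which is the role the cross network plays in the paper; a faithful reproduction would follow the multi-head formulation in the paper itself.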
Related papers
- A Click-Through Rate Prediction Method Based on Cross-Importance of Multi-Order Features [4.820576346277399]
This paper proposes a new model, FiiNet (Multiple Order Feature Interaction Importance Neural Networks).
The model first uses the selective kernel network (SKNet) to explicitly construct multi-order feature crosses.
It dynamically learns the importance of feature interaction combinations in a fine-grained manner.
arXiv Detail & Related papers (2024-05-14T16:05:57Z) - Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective to promote superior weight sparsity.
Specifically, customized Visual Prompts are mounted to upgrade neural network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z) - Cross-modal Orthogonal High-rank Augmentation for RGB-Event
Transformer-trackers [58.802352477207094]
We explore the great potential of a pre-trained vision Transformer (ViT) to bridge the vast distribution gap between two modalities.
We propose a mask modeling strategy that randomly masks a specific modality of some tokens to enforce proactive interaction between tokens from different modalities.
Experiments demonstrate that our plug-and-play training augmentation techniques can significantly boost state-of-the-art one-stream and two-stream trackers in terms of both tracking precision and success rate.
arXiv Detail & Related papers (2023-07-09T08:58:47Z) - Multi-Behavior Hypergraph-Enhanced Transformer for Sequential
Recommendation [33.97708796846252]
We introduce a new Multi-Behavior Hypergraph-enhanced Transformer framework (MBHT) to capture both short-term and long-term cross-type behavior dependencies.
Specifically, a multi-scale Transformer is equipped with low-rank self-attention to jointly encode behavior-aware sequential patterns from fine-grained and coarse-grained levels.
arXiv Detail & Related papers (2022-07-12T15:07:21Z) - Preference Enhanced Social Influence Modeling for Network-Aware Cascade
Prediction [59.221668173521884]
We propose a novel framework to promote cascade size prediction by enhancing the user preference modeling.
Our end-to-end method makes the user activating process of information diffusion more adaptive and accurate.
arXiv Detail & Related papers (2022-04-18T09:25:06Z) - Correlation-Aware Deep Tracking [83.51092789908677]
We propose a novel target-dependent feature network inspired by the self-/cross-attention scheme.
Our network deeply embeds cross-image feature correlation in multiple layers of the feature network.
Our model can be flexibly pre-trained on abundant unpaired images, leading to notably faster convergence than the existing methods.
arXiv Detail & Related papers (2022-03-03T11:53:54Z) - XCrossNet: Feature Structure-Oriented Learning for Click-Through Rate
Prediction [46.72935114485706]
We propose a novel Extreme Cross Network, abbreviated XCrossNet, which aims at learning dense and sparse feature interactions in an explicit manner.
XCrossNet as a feature structure-oriented model leads to a more expressive representation and a more precise CTR prediction.
Experimental studies on Criteo Kaggle dataset show significant improvement of XCrossNet over state-of-the-art models on both effectiveness and efficiency.
arXiv Detail & Related papers (2021-04-22T07:37:36Z) - Probabilistic Graph Attention Network with Conditional Kernels for
Pixel-Wise Prediction [158.88345945211185]
We present a novel approach that advances the state of the art on pixel-level prediction in a fundamental aspect, i.e. structured multi-scale features learning and fusion.
We propose a probabilistic graph attention network structure based on a novel Attention-Gated Conditional Random Fields (AG-CRFs) model for learning and fusing multi-scale representations in a principled manner.
arXiv Detail & Related papers (2021-01-08T04:14:29Z) - AdnFM: An Attentive DenseNet based Factorization Machine for CTR
Prediction [11.958336595818267]
We propose a novel model called Attentive DenseNet based Factorization Machines (AdnFM)
AdnFM can extract more comprehensive deep features by using all the hidden layers from a feed-forward neural network as implicit high-order features.
Experiments on two real-world datasets show that the proposed model can effectively improve the performance of Click-Through-Rate prediction.
arXiv Detail & Related papers (2020-12-20T01:00:39Z) - Learning Long-term Visual Dynamics with Region Proposal Interaction
Networks [75.06423516419862]
We build object representations that can capture inter-object and object-environment interactions over a long-range.
Thanks to the simple yet effective object representation, our approach outperforms prior methods by a significant margin.
arXiv Detail & Related papers (2020-08-05T17:48:00Z) - Feature Interaction based Neural Network for Click-Through Rate
Prediction [5.095988654970358]
We propose a Feature Interaction based Neural Network (FINN) which is able to model feature interaction via a 3-dimensional relation tensor.
We show that our deep FINN model outperforms other state-of-the-art deep models such as PNN and DeepFM.
Results also indicate that our models can effectively learn feature interactions and achieve better performance on real-world datasets.
arXiv Detail & Related papers (2020-06-07T03:53:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.