DeepLight: Deep Lightweight Feature Interactions for Accelerating CTR
Predictions in Ad Serving
- URL: http://arxiv.org/abs/2002.06987v3
- Date: Wed, 6 Jan 2021 22:13:51 GMT
- Title: DeepLight: Deep Lightweight Feature Interactions for Accelerating CTR
Predictions in Ad Serving
- Authors: Wei Deng and Junwei Pan and Tian Zhou and Deguang Kong and Aaron
Flores and Guang Lin
- Abstract summary: Click-through rate (CTR) prediction is a crucial task in online display advertising.
Embedding-based neural networks have been proposed to learn both explicit and deep feature interactions.
These sophisticated models, however, slow down prediction inference by at least hundreds of times.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Click-through rate (CTR) prediction is a crucial task in online display
advertising. The embedding-based neural networks have been proposed to learn
both explicit feature interactions through a shallow component and deep feature
interactions using a deep neural network (DNN) component. These sophisticated
models, however, slow down the prediction inference by at least hundreds of
times. To address the issue of significantly increased serving delay and high
memory usage for ad serving in production, this paper presents
\emph{DeepLight}: a framework to accelerate the CTR predictions in three
aspects: 1) accelerate the model inference via explicitly searching informative
feature interactions in the shallow component; 2) prune redundant layers and
parameters at intra-layer and inter-layer level in the DNN component; 3)
promote the sparsity of the embedding layer to preserve the most discriminant
signals. By combining the above efforts, the proposed approach accelerates the
model inference by 46X on the Criteo dataset and 27X on the Avazu dataset without
any loss of prediction accuracy. This paves the way for successfully deploying
complicated embedding-based neural networks in production for ad serving.
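The second of the three acceleration steps, pruning redundant parameters in the DNN component, can be illustrated with a generic magnitude-based pruning routine. This is a minimal sketch under simplifying assumptions, not DeepLight's actual pruning criterion or schedule (those are specified in the paper itself); the function name and the 90% sparsity target are illustrative.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction of a weight matrix.

    A hedged sketch of intra-layer pruning in the spirit of DeepLight's
    second step; the paper's exact criterion and schedule may differ.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)        # number of weights to zero out
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

rng = np.random.default_rng(0)
layer = rng.standard_normal((256, 128))
sparse_layer = magnitude_prune(layer, 0.9)
```

After pruning, roughly 90% of the entries are exactly zero, so the layer can be stored and multiplied in a sparse format at inference time.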
Related papers
- Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective to promote superior weight sparsity.
Specifically, customized Visual Prompts are mounted to upgrade neural network sparsification in the proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
- Efficient Graph Neural Network Inference at Large Scale [54.89457550773165]
Graph neural networks (GNNs) have demonstrated excellent performance in a wide range of applications.
Existing scalable GNNs leverage linear propagation to preprocess the features and accelerate the training and inference procedure.
We propose a novel adaptive propagation order approach that generates the personalized propagation order for each node based on its topological information.
arXiv Detail & Related papers (2022-11-01T14:38:18Z)
- Deep Multi-Representation Model for Click-Through Rate Prediction [6.155158115218501]
Click-Through Rate prediction (CTR) is a crucial task in recommender systems.
We propose the Deep Multi-Representation model (DeepMR) that jointly trains a mixture of two powerful feature representation learning components.
Experiments on three real-world datasets show that the proposed model significantly outperforms all state-of-the-art models in the task of click-through rate prediction.
arXiv Detail & Related papers (2022-10-18T09:37:11Z)
- NAF: Neural Attenuation Fields for Sparse-View CBCT Reconstruction [79.13750275141139]
This paper proposes a novel and fast self-supervised solution for sparse-view CBCT reconstruction.
The desired attenuation coefficients are represented as a continuous function of 3D spatial coordinates, parameterized by a fully-connected deep neural network.
A learning-based encoder entailing hash coding is adopted to help the network capture high-frequency details.
arXiv Detail & Related papers (2022-09-29T04:06:00Z)
- A novel attention-based network for fast salient object detection [14.246237737452105]
Among current salient object detection networks, the most popular approach uses a U-shaped structure.
We propose a new deep convolution network architecture with three contributions.
Results demonstrate that the proposed method can compress the model to nearly 1/3 of its original size without losing accuracy.
arXiv Detail & Related papers (2021-12-20T12:30:20Z)
- AdnFM: An Attentive DenseNet based Factorization Machine for CTR Prediction [11.958336595818267]
We propose a novel model called Attentive DenseNet based Factorization Machines (AdnFM).
AdnFM can extract more comprehensive deep features by using all the hidden layers from a feed-forward neural network as implicit high-order features.
Experiments on two real-world datasets show that the proposed model can effectively improve the performance of Click-Through-Rate prediction.
arXiv Detail & Related papers (2020-12-20T01:00:39Z)
- PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object Detection [57.49788100647103]
LiDAR-based 3D object detection is an important task for autonomous driving.
Current approaches suffer from sparse and partial point clouds of distant and occluded objects.
In this paper, we propose a novel two-stage approach, namely PC-RGNN, dealing with such challenges by two specific solutions.
arXiv Detail & Related papers (2020-12-18T18:06:43Z)
- Iterative Boosting Deep Neural Networks for Predicting Click-Through Rate [15.90144113403866]
The click-through rate (CTR) reflects the ratio of clicks on a specific item to its total number of views.
XdBoost is an iterative three-stage neural network model influenced by the traditional machine learning boosting mechanism.
arXiv Detail & Related papers (2020-07-26T09:41:16Z)
- Feature Interaction based Neural Network for Click-Through Rate Prediction [5.095988654970358]
We propose a Feature Interaction based Neural Network (FINN) which is able to model feature interaction via a 3-dimensional relation tensor.
We show that our deep FINN model outperforms other state-of-the-art deep models such as PNN and DeepFM.
It also indicates that our models can effectively learn the feature interactions and achieve better performance on real-world datasets.
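FINN and its baselines (PNN, DeepFM) all build on explicit pairwise feature interactions. For context, the classic factorization-machine second-order term that these models generalize can be computed in O(nk) time via a standard algebraic identity; the sketch below shows that baseline term only, not FINN's 3-dimensional relation tensor, and the example values are made up.

```python
import numpy as np

def fm_pairwise(x: np.ndarray, V: np.ndarray) -> float:
    """Factorization-machine second-order term:
    sum over i < j of <v_i, v_j> * x_i * x_j,
    computed in O(n*k) using 0.5 * (||sum_i v_i x_i||^2 - sum_i ||v_i x_i||^2).
    x: feature vector of length n; V: (n, k) matrix of latent factors."""
    xv = x[:, None] * V            # (n, k): each row is v_i * x_i
    s = xv.sum(axis=0)             # sum_i v_i * x_i
    return float(0.5 * (s @ s - (xv * xv).sum()))

# Toy example: only features 0 and 1 are active.
x = np.array([1.0, 2.0, 0.0])
V = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0]])
# Only the (0, 1) pair contributes: <v0, v1> * x0 * x1 = 0.5 * 1 * 2 = 1.0
```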
arXiv Detail & Related papers (2020-06-07T03:53:24Z)
- Resolution Adaptive Networks for Efficient Inference [53.04907454606711]
We propose a novel Resolution Adaptive Network (RANet), which is inspired by the intuition that low-resolution representations are sufficient for classifying "easy" inputs.
In RANet, the input images are first routed to a lightweight sub-network that efficiently extracts low-resolution representations.
High-resolution paths in the network maintain the capability to recognize the "hard" samples.
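The routing logic described above (cheap low-resolution path first, expensive high-resolution path only for "hard" inputs) can be sketched as a simple early-exit dispatcher. This is an illustrative sketch, not RANet's architecture: the model arguments are placeholder callables returning class probabilities, and the downsampling and threshold are assumptions.

```python
import numpy as np

def adaptive_infer(image, cheap_model, full_model, threshold=0.9):
    """Early-exit routing in the spirit of RANet: classify a downsampled
    input with a lightweight model and only run the full-resolution path
    when the cheap prediction is not confident enough."""
    low_res = image[::4, ::4]                  # naive 4x downsampling
    probs = cheap_model(low_res)
    if probs.max() >= threshold:               # confident: exit early
        return int(probs.argmax()), "low-res exit"
    return int(full_model(image).argmax()), "full-res path"
```

Easy inputs never touch the expensive model, which is where the average-case inference savings come from.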
arXiv Detail & Related papers (2020-03-16T16:54:36Z)
- Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, a truncated max-product Belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs).
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.