GateNet: Gating-Enhanced Deep Network for Click-Through Rate Prediction
- URL: http://arxiv.org/abs/2007.03519v1
- Date: Mon, 6 Jul 2020 12:45:46 GMT
- Title: GateNet: Gating-Enhanced Deep Network for Click-Through Rate Prediction
- Authors: Tongwen Huang, Qingyun She, Zhiqiang Wang, Junlin Zhang
- Abstract summary: In recent years, many neural network based CTR models have been proposed and achieved success.
We propose a novel model named GateNet which introduces either the feature embedding gate or the hidden gate to the embedding layer or hidden layers of DNN CTR models, respectively.
- Score: 3.201333208812837
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Advertising and feed ranking are essential to many Internet companies such as
Facebook. Among many real-world advertising and feed ranking systems, click
through rate (CTR) prediction plays a central role. In recent years, many
neural network based CTR models have been proposed and achieved success such as
Factorization-Machine Supported Neural Networks, DeepFM and xDeepFM. Many of
them contain two commonly used components: embedding layer and MLP hidden
layers. On the other side, gating mechanism is also widely applied in many
research fields such as computer vision(CV) and natural language
processing(NLP). Some research has proved that gating mechanism improves the
trainability of non-convex deep neural networks. Inspired by these
observations, we propose a novel model named GateNet which introduces either
the feature embedding gate or the hidden gate to the embedding layer or hidden
layers of DNN CTR models, respectively. The feature embedding gate provides a
learnable feature gating module to select salient latent information from the
feature-level. The hidden gate helps the model to implicitly capture the
high-order interaction more effectively. Extensive experiments conducted on
three real-world datasets demonstrate its effectiveness to boost the
performance of various state-of-the-art models such as FM, DeepFM and xDeepFM
on all datasets.
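As a rough illustration, the sketch below gives one plausible PyTorch reading of the two gates described in the abstract. It is not the authors' reference implementation: the bit-wise (per-element, sigmoid) gates, the per-field gate matrices, and all shapes are assumptions made here for concreteness.

```python
# Hedged sketch of GateNet-style gates, reconstructed from the abstract alone.
import torch
import torch.nn as nn


class FeatureEmbeddingGate(nn.Module):
    """Learns a gate per feature field and re-weights that field's embedding."""

    def __init__(self, num_fields: int, embed_dim: int):
        super().__init__()
        # One learnable projection per field, mapping an embedding to a gate
        # vector of the same size (a bit-wise gate is assumed here).
        self.gates = nn.Parameter(0.01 * torch.randn(num_fields, embed_dim, embed_dim))

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (batch, num_fields, embed_dim)
        gate_logits = torch.einsum("bfe,fed->bfd", embeddings, self.gates)
        # The sigmoid gate selects salient latent information at the feature level.
        return embeddings * torch.sigmoid(gate_logits)


class GatedHiddenLayer(nn.Module):
    """An MLP hidden layer whose output is modulated by an element-wise gate."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        self.gate = nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The gate implicitly re-weights the hidden activations.
        return torch.relu(self.linear(x)) * torch.sigmoid(self.gate(x))
```

In a DeepFM-style DNN tower, FeatureEmbeddingGate would sit directly after the embedding lookup, and GatedHiddenLayer would stand in for each plain Linear-plus-ReLU hidden layer.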
Related papers
- Multilinear Operator Networks [60.7432588386185]
Polynomial Networks are a class of models that do not require activation functions.
We propose MONet, which relies solely on multilinear operators.
arXiv Detail & Related papers (2024-01-31T16:52:19Z)
- Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective to promote superior weight sparsity.
Specifically, customized Visual Prompts are mounted to upgrade neural network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
- SENetV2: Aggregated dense layer for channelwise and global representations [0.0]
We introduce a novel aggregated multilayer perceptron, a multi-branch dense layer, within the Squeeze residual module.
This fusion enhances the network's ability to capture channel-wise patterns and global knowledge.
We conduct extensive experiments on benchmark datasets to validate the model and compare it with established architectures.
arXiv Detail & Related papers (2023-11-17T14:10:57Z)
- Improving the Robustness of Deep Convolutional Neural Networks Through Feature Learning [23.5067878531607]
Deep convolutional neural network (DCNN for short) models are vulnerable to adversarial examples with small perturbations.
Adversarial training (AT for short) is a widely used approach to enhance the robustness of DCNN models by data augmentation.
This paper proposes a shallow binary feature module (SBFM for short) which can be integrated into any popular backbone.
arXiv Detail & Related papers (2023-03-11T15:22:29Z)
- Interpretability of an Interaction Network for identifying $H \rightarrow b\bar{b}$ jets [4.553120911976256]
In recent times, AI models based on deep neural networks have become increasingly popular in high-energy physics applications.
We explore the interpretability of AI models by examining an Interaction Network (IN) model designed to identify boosted $H \rightarrow b\bar{b}$ jets.
We additionally illustrate the activity of hidden layers within the IN model as Neural Activation Pattern (NAP) diagrams.
arXiv Detail & Related papers (2022-11-23T08:38:52Z)
- Deep Multi-Representation Model for Click-Through Rate Prediction [6.155158115218501]
Click-Through Rate (CTR) prediction is a crucial task in recommender systems.
We propose the Deep Multi-Representation model (DeepMR) that jointly trains a mixture of two powerful feature representation learning components.
Experiments on three real-world datasets show that the proposed model significantly outperforms all state-of-the-art models in the task of click-through rate prediction.
arXiv Detail & Related papers (2022-10-18T09:37:11Z)
- Neural Attentive Circuits [93.95502541529115]
We introduce a general-purpose yet modular neural architecture called Neural Attentive Circuits (NACs).
NACs learn the parameterization and sparse connectivity of neural modules without using domain knowledge.
NACs achieve an 8x speedup at inference time while losing less than 3% performance.
arXiv Detail & Related papers (2022-10-14T18:00:07Z)
- A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP [121.35904748477421]
Convolutional neural networks (CNN) are the dominant deep neural network (DNN) architecture for computer vision.
Transformer- and multi-layer perceptron (MLP)-based models, such as the Vision Transformer and MLP-Mixer, have started to lead new trends.
In this paper, we conduct empirical studies on these DNN structures and try to understand their respective pros and cons.
arXiv Detail & Related papers (2021-08-30T06:09:02Z)
- Global Filter Networks for Image Classification [90.81352483076323]
We present a conceptually simple yet computationally efficient architecture that learns long-term spatial dependencies in the frequency domain with log-linear complexity.
Our results demonstrate that GFNet can be a very competitive alternative to transformer-style models and CNNs in efficiency, generalization ability and robustness.
arXiv Detail & Related papers (2021-07-01T17:58:16Z)
- AdnFM: An Attentive DenseNet based Factorization Machine for CTR Prediction [11.958336595818267]
We propose a novel model called Attentive DenseNet based Factorization Machines (AdnFM).
AdnFM can extract more comprehensive deep features by using all the hidden layers from a feed-forward neural network as implicit high-order features.
Experiments on two real-world datasets show that the proposed model can effectively improve the performance of Click-Through-Rate prediction.
arXiv Detail & Related papers (2020-12-20T01:00:39Z)
- Refined Gate: A Simple and Effective Gating Mechanism for Recurrent Units [68.30422112784355]
We propose a new gating mechanism within general gated recurrent neural networks to address gate undertraining (see the sketch after this list).
The proposed gates directly short-connect the extracted input features to the outputs of the vanilla gates.
We verify the proposed gating mechanism on three popular types of gated RNNs including LSTM, GRU and MGU.
arXiv Detail & Related papers (2020-02-26T07:51:38Z)
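The following is a deliberately loose sketch of the short-connection idea from the Refined Gate entry above, applied to a single GRU-style update gate. The exact combination rule in that paper may differ; adding the extracted input feature to the vanilla gate's output and re-squashing it is just one plausible reading of the summary.

```python
# Hedged sketch: a "refined" update gate that short-connects an input feature
# to the output of a vanilla sigmoid gate. Details here are assumptions.
import torch
import torch.nn as nn


class RefinedUpdateGate(nn.Module):
    def __init__(self, input_dim: int, hidden_dim: int):
        super().__init__()
        self.vanilla = nn.Linear(input_dim + hidden_dim, hidden_dim)  # vanilla gate
        self.shortcut = nn.Linear(input_dim, hidden_dim)  # extracted input feature

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.vanilla(torch.cat([x, h], dim=-1)))
        # Short-connect the input feature to the vanilla gate's output, then
        # squash again so the refined gate stays in (0, 1).
        return torch.sigmoid(g + self.shortcut(x))
```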
This list is automatically generated from the titles and abstracts of the papers on this site.