Related papers: DCNv3: Towards Next Generation Deep Cross Network for CTR Prediction

URL: http://arxiv.org/abs/2407.13349v6
Date: Fri, 09 Aug 2024 06:31:56 GMT
Title: DCNv3: Towards Next Generation Deep Cross Network for CTR Prediction
Authors: Honghao Li, Yiwen Zhang, Yi Zhang, Hanwei Li, Lei Sang, Jieming Zhu,
Abstract summary: This paper proposes the next generation deep cross network: Deep Cross Network v3 (DCNv3), along with its two sub-networks: Linear Cross Network (LCN) and Exponential Cross Network (ECN) for CTR prediction. Comprehensive experiments on six datasets demonstrate the effectiveness, efficiency, and interpretability of DCNv3.
Score: 17.19859591493946
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep & Cross Network and its derivative models have become an important paradigm for click-through rate (CTR) prediction due to their effective balance between computational cost and performance. However, these models face four major limitations: (1) the performance of existing explicit feature interaction methods is often weaker than that of implicit deep neural network (DNN), undermining their necessity; (2) many models fail to adaptively filter noise while increasing the order of feature interactions; (3) the fusion methods of most models cannot provide suitable supervision signals for their different sub-networks; (4) while most models claim to capture high-order feature interactions, they often do so implicitly and non-interpretably through DNN, which limits the trustworthiness of the model's predictions. To address the identified limitations, this paper proposes the next generation deep cross network: Deep Cross Network v3 (DCNv3), along with its two sub-networks: Linear Cross Network (LCN) and Exponential Cross Network (ECN) for CTR prediction. DCNv3 ensures interpretability in feature interaction modeling while linearly and exponentially increasing the order of feature interactions to achieve genuine Deep Crossing rather than just Deep & Cross. Additionally, we employ a Self-Mask operation to filter noise and reduce the number of parameters in the Cross Network by half. In the fusion layer, we use a simple yet effective multi-loss trade-off and calculation method, called Tri-BCE, to provide appropriate supervision signals. Comprehensive experiments on six datasets demonstrate the effectiveness, efficiency, and interpretability of DCNv3. The code, running logs, and detailed hyperparameter configurations are available at: https://github.com/salmon1802/DCNv3.

Related papers

Efficient Federated Learning Tiny Language Models for Mobile Network Feature Prediction [13.32608465848856]
In telecommunications, Autonomous Networks (ANs) automatically adjust configurations based on specific requirements (e.g. bandwidth, available resources). Here, Federated Learning (FL) allows multiple AN cells, each equipped with Neural Networks (NNs), to collaboratively train models while preserving data privacy. We investigate NNCodec, an implementation of the ISO/IEC Neural Network Coding (NNC) standard, within a novel FL framework that integrates tiny language models (TLMs). Our experimental results on the Berlin V2X dataset demonstrate that NNCodec achieves transparent compression while reducing communication overhead to below 1%.
arXiv Detail & Related papers (2025-04-02T17:54:06Z)
Exploring Neural Network Pruning with Screening Methods [3.443622476405787]
Modern deep learning models have tens of millions of parameters which makes the inference processes resource-intensive. This paper proposes and evaluates a network pruning framework that eliminates non-essential parameters. The proposed framework produces competitive lean networks compared to the original networks.
arXiv Detail & Related papers (2025-02-11T02:31:04Z)
Neuromorphic Wireless Split Computing with Multi-Level Spikes [69.73249913506042]
Neuromorphic computing uses spiking neural networks (SNNs) to perform inference tasks. Embedding a small payload within each spike exchanged between spiking neurons can enhance inference accuracy without increasing energy consumption. Split computing, where an SNN is partitioned across two devices, is a promising solution. This paper presents the first comprehensive study of a neuromorphic wireless split computing architecture that employs multi-level SNNs.
arXiv Detail & Related papers (2024-11-07T14:08:35Z)
Learning Load Balancing with GNN in MPTCP-Enabled Heterogeneous Networks [13.178956651532213]
We propose a graph neural network (GNN)-based model to tackle the LB problem for MPTCP-enabled HetNets. Compared to the conventional deep neural network (DNN), the proposed GNN-based model exhibits two key strengths.
arXiv Detail & Related papers (2024-10-22T15:49:53Z)
4D ASR: Joint Beam Search Integrating CTC, Attention, Transducer, and Mask Predict Decoders [53.297697898510194]
We propose a joint modeling scheme where four decoders share the same encoder -- we refer to this as 4D modeling. To efficiently train the 4D model, we introduce a two-stage training strategy that stabilizes multitask learning. In addition, we propose three novel one-pass beam search algorithms by combining three decoders.
arXiv Detail & Related papers (2024-06-05T05:18:20Z)
BDC-Occ: Binarized Deep Convolution Unit For Binarized Occupancy Network [55.21288428359509]
Existing 3D occupancy networks demand significant hardware resources, hindering the deployment of edge devices. We propose a novel binarized deep convolution (BDC) unit that effectively enhances performance while increasing the number of binarized convolutional layers. Our BDC-Occ model is created by applying the proposed BDC unit to binarize the existing 3D occupancy networks.
arXiv Detail & Related papers (2024-05-27T10:44:05Z)
Leveraging Low-Rank and Sparse Recurrent Connectivity for Robust Closed-Loop Control [63.310780486820796]
We show how a parameterization of recurrent connectivity influences robustness in closed-loop settings. We find that closed-form continuous-time neural networks (CfCs) with fewer parameters can outperform their full-rank, fully-connected counterparts.
arXiv Detail & Related papers (2023-10-05T21:44:18Z)
Graph Neural Networks for Power Allocation in Wireless Networks with Full Duplex Nodes [10.150768420975155]
Due to mutual interference between users, power allocation problems in wireless networks are often non-trivial. Graph neural networks (GNNs) have recently emerged as a promising approach to tackling these problems, one that exploits the underlying topology of wireless networks.
arXiv Detail & Related papers (2023-03-27T10:59:09Z)
NAF: Neural Attenuation Fields for Sparse-View CBCT Reconstruction [79.13750275141139]
This paper proposes a novel and fast self-supervised solution for sparse-view CBCT reconstruction. The desired attenuation coefficients are represented as a continuous function of 3D spatial coordinates, parameterized by a fully-connected deep neural network. A learning-based encoder entailing hash coding is adopted to help the network capture high-frequency details.
arXiv Detail & Related papers (2022-09-29T04:06:00Z)
SVNet: Where SO(3) Equivariance Meets Binarization on Point Cloud Representation [65.4396959244269]
The paper tackles the challenge by designing a general framework to construct 3D learning architectures. The proposed approach can be applied to general backbones like PointNet and DGCNN. Experiments on ModelNet40, ShapeNet, and the real-world dataset ScanObjectNN, demonstrated that the method achieves a great trade-off between efficiency, rotation, and accuracy.
arXiv Detail & Related papers (2022-09-13T12:12:19Z)
Utterance Weighted Multi-Dilation Temporal Convolutional Networks for Monaural Speech Dereverberation [26.94528951545861]
A weighted multi-dilation depthwise-separable convolution is proposed to replace standard depthwise-separable convolutions in temporal convolutional networks (TCNs). It is shown that this weighted multi-dilation temporal convolutional network (WD-TCN) consistently outperforms the TCN across various model configurations.
arXiv Detail & Related papers (2022-05-17T15:56:31Z)
An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical computation restrict execution efficiency, especially for Internet of Things (IoT) devices. We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the exit point and compressing bits by soft policy iterations. Based on the latency and accuracy aware reward design, such a computation can adapt well to complex environments like dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
Hybrid SNN-ANN: Energy-Efficient Classification and Object Detection for Event-Based Vision [64.71260357476602]
Event-based vision sensors encode local pixel-wise brightness changes in streams of events rather than image frames. Recent progress in object recognition from event-based sensors has come from conversions of deep neural networks. We propose a hybrid architecture for end-to-end training of deep neural networks for event-based pattern recognition and object detection.
arXiv Detail & Related papers (2021-12-06T23:45:58Z)
Sequence-to-Sequence Load Disaggregation Using Multi-Scale Residual Neural Network [4.094944573107066]
Non-Intrusive Load Monitoring (NILM) has received more and more attention as a cost-effective way to monitor electricity. Deep neural networks have shown great potential in the field of load disaggregation.
arXiv Detail & Related papers (2020-09-25T17:41:28Z)
Fully Dynamic Inference with Deep Neural Networks [19.833242253397206]
Two compact networks, called Layer-Net (L-Net) and Channel-Net (C-Net), predict on a per-instance basis which layers or filters/channels are redundant and therefore should be skipped. On the CIFAR-10 dataset, LC-Net results in up to 11.9× fewer floating-point operations (FLOPs) and up to 3.3% higher accuracy compared to other dynamic inference methods. On the ImageNet dataset, LC-Net achieves up to 1.4× fewer FLOPs and up to 4.6% higher Top-1 accuracy than the other methods.
arXiv Detail & Related papers (2020-07-29T23:17:48Z)
DiffRNN: Differential Verification of Recurrent Neural Networks [3.4423518864863154]
Recurrent neural networks (RNNs) have become popular in a variety of applications such as image processing, data classification, speech recognition, and as controllers in autonomous systems. We propose DIFFRNN, the first differential verification method for RNNs to certify the equivalence of two structurally similar neural networks. We demonstrate the practical efficacy of our technique on a variety of benchmarks and show that DIFFRNN outperforms state-of-the-art verification tools such as POPQORN.
arXiv Detail & Related papers (2020-07-20T14:14:35Z)
Pre-Trained Models for Heterogeneous Information Networks [57.78194356302626]
We propose a self-supervised pre-training and fine-tuning framework, PF-HIN, to capture the features of a heterogeneous information network. PF-HIN consistently and significantly outperforms state-of-the-art alternatives on each of these tasks, on four datasets.
arXiv Detail & Related papers (2020-07-07T03:36:28Z)
Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs). In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit. We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency. We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
Binarized Graph Neural Network [65.20589262811677]
We develop a binarized graph neural network to learn the binary representations of the nodes with binary network parameters. Our proposed method can be seamlessly integrated into the existing GNN-based embedding approaches. Experiments indicate that the proposed binarized graph neural network, namely BGN, is orders of magnitude more efficient in terms of both time and space.
arXiv Detail & Related papers (2020-04-19T09:43:14Z)
DeepLight: Deep Lightweight Feature Interactions for Accelerating CTR Predictions in Ad Serving [15.637357991632241]
Click-through rate (CTR) prediction is a crucial task in online display advertising. Embedding-based neural networks have been proposed to learn explicit feature interactions. These sophisticated models, however, slow down the prediction inference by at least hundreds of times.
arXiv Detail & Related papers (2020-02-17T14:51:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.