Related papers: DCN^2: Interplay of Implicit Collision Weights and Explicit Cross Layers for Large-Scale Recommendation

URL: http://arxiv.org/abs/2506.21624v1
Date: Tue, 24 Jun 2025 06:44:42 GMT
Title: DCN^2: Interplay of Implicit Collision Weights and Explicit Cross Layers for Large-Scale Recommendation
Authors: Blaž Škrlj, Yonatan Karni, Grega Gašperšič, Blaž Mramor, Yulia Stolin, Martin Jakomin, Jasna Urbančič, Yuval Dishi, Natalia Silberstein, Ophir Friedler, Assaf Klein
Abstract summary: We introduce three significant algorithmic improvements to the DCNv2 architecture, detailing their formulation and behavior at scale. The enhanced architecture, which we refer to as DCN^2, is actively used in a live recommender system, processing over 0.5 billion predictions per second across diverse use cases. These improvements effectively address key limitations observed in DCNv2, including information loss in Cross layers, implicit management of collisions through learnable lookup-level weights, and explicit modeling of pairwise similarities with a custom layer that emulates FFMs' behavior.
Score: 1.1027313935007121
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The Deep and Cross architecture (DCNv2) is a robust production baseline and is integral to numerous real-life recommender systems. Its inherent efficiency and ability to model interactions often result in models that are both simpler and highly competitive compared to more computationally demanding alternatives, such as Deep FFMs. In this work, we introduce three significant algorithmic improvements to the DCNv2 architecture, detailing their formulation and behavior at scale. The enhanced architecture, which we refer to as DCN^2, is actively used in a live recommender system, processing over 0.5 billion predictions per second across diverse use cases, where it outperformed DCNv2 both offline and online (A/B tests). These improvements effectively address key limitations observed in DCNv2, including information loss in Cross layers, implicit management of collisions through learnable lookup-level weights, and explicit modeling of pairwise similarities with a custom layer that emulates FFMs' behavior. The superior performance of DCN^2 is also demonstrated on four publicly available benchmark data sets.

Related papers

MoCa: Modality-aware Continual Pre-training Makes Better Bidirectional Multimodal Embeddings [75.0617088717528]
MoCa is a framework for transforming pre-trained VLM backbones into effective bidirectional embedding models. MoCa consistently improves performance across MMEB and ViDoRe-v2 benchmarks, achieving new state-of-the-art results.
arXiv Detail & Related papers (2025-06-29T06:41:00Z)
Factorized Implicit Global Convolution for Automotive Computational Fluid Dynamics Prediction [52.32698071488864]
We propose Factorized Implicit Global Convolution (FIGConv), a novel architecture that efficiently solves CFD problems for very large 3D meshes. FIGConv achieves quadratic complexity $O(N^2)$, a significant improvement over existing 3D neural CFD models. We validate our approach on the industry-standard Ahmed body dataset and the large-scale DrivAerNet dataset.
arXiv Detail & Related papers (2025-02-06T18:57:57Z)
VICON: Vision In-Context Operator Networks for Multi-Physics Fluid Dynamics Prediction [21.061630022134203]
In-Context Operator Networks (ICONs) learn operators across diverse partial differential equations using few-shot, in-context learning. Existing ICONs process each spatial point as an individual token, severely limiting computational efficiency when handling dense data in higher spatial dimensions. We propose Vision In-Context Operator Networks (VICON), which integrates vision transformer architectures to efficiently process 2D data through patch-wise operations.
arXiv Detail & Related papers (2024-11-25T03:25:17Z)
POMONAG: Pareto-Optimal Many-Objective Neural Architecture Generator [4.09225917049674]
Transferable NAS has emerged, generalizing the search process from dataset-dependent to task-dependent. This paper introduces POMONAG, extending DiffusionNAG via a many-objective diffusion process. Results were validated on two search spaces -- NAS201 and MobileNetV3 -- and evaluated across 15 image classification datasets.
arXiv Detail & Related papers (2024-09-30T16:05:29Z)
FCN: Fusing Exponential and Linear Cross Network for Click-Through Rate Prediction [17.19859591493946]
This paper proposes a novel model, called Fusing Cross Network (FCN), along with two sub-networks: Linear Cross Network (LCN) and Exponential Cross Network (ECN). FCN explicitly captures feature interactions with both linear and exponential growth, eliminating the need to rely on implicit DNN. We evaluate the effectiveness, efficiency, and interpretability of FCN on six benchmark datasets.
arXiv Detail & Related papers (2024-07-18T09:49:13Z)
BDC-Occ: Binarized Deep Convolution Unit For Binarized Occupancy Network [55.21288428359509]
Existing 3D occupancy networks demand significant hardware resources, hindering deployment on edge devices. We propose a novel binarized deep convolution (BDC) unit that effectively enhances performance while increasing the number of binarized convolutional layers. Our BDC-Occ model is created by applying the proposed BDC unit to binarize the existing 3D occupancy networks.
arXiv Detail & Related papers (2024-05-27T10:44:05Z)
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications [108.44482683870888]
We introduce Deformable Convolution v4 (DCNv4), a highly efficient and effective operator designed for a broad spectrum of vision applications. DCNv4 addresses the limitations of its predecessor, DCNv3, with two key enhancements. It demonstrates exceptional performance across various tasks, including image classification, instance and semantic segmentation, and notably, image generation.
arXiv Detail & Related papers (2024-01-11T14:53:24Z)
AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation [80.33846577924363]
We present All-Pairs Multi-Field Transforms (AMT), a new network architecture for video frame interpolation. It is based on two essential designs. First, we build bidirectional volumes for all pairs of pixels, and use the predicted bilateral flows to retrieve correlations. Second, we derive multiple groups of fine-grained flow fields from one pair of updated coarse flows for performing backward warping on the input frames separately.
arXiv Detail & Related papers (2023-04-19T16:18:47Z)
Neural Attentive Circuits [93.95502541529115]
We introduce a general-purpose, yet modular neural architecture called Neural Attentive Circuits (NACs). NACs learn the parameterization and a sparse connectivity of neural modules without using domain knowledge. NACs achieve an 8x speedup at inference time while losing less than 3% performance.
arXiv Detail & Related papers (2022-10-14T18:00:07Z)
DCN V2: Improved Deep & Cross Network and Practical Lessons for Web-scale Learning to Rank Systems [15.398542784403604]
Deep & Cross Network (DCN) was proposed to automatically and efficiently learn bounded-degree predictive feature interactions. We propose an improved framework DCN-V2 to make DCN more practical in large-scale industrial settings.
arXiv Detail & Related papers (2020-08-19T20:33:02Z)
NSGANetV2: Evolutionary Multi-Objective Surrogate-Assisted Neural Architecture Search [22.848528877480796]
We propose an efficient NAS algorithm for generating task-specific models that are competitive under multiple competing objectives. It comprises two surrogates, one at the architecture level to improve sample efficiency and one at the weights level, through a supernet, to improve gradient descent training efficiency. We demonstrate the effectiveness and versatility of the proposed method on six diverse non-standard datasets.
arXiv Detail & Related papers (2020-07-20T18:30:11Z)
Searching Central Difference Convolutional Networks for Face Anti-Spoofing [68.77468465774267]
Face anti-spoofing (FAS) plays a vital role in face recognition systems. Most state-of-the-art FAS methods rely on stacked convolutions and expert-designed networks. Here we propose a novel frame-level FAS method based on Central Difference Convolution (CDC).
arXiv Detail & Related papers (2020-03-09T12:48:37Z)

This list is automatically generated from the titles and abstracts of the papers on this site.