Hierarchical Structured Neural Network for Retrieval
- URL: http://arxiv.org/abs/2408.06653v2
- Date: Fri, 25 Oct 2024 20:11:56 GMT
- Title: Hierarchical Structured Neural Network for Retrieval
- Authors: Kaushik Rangadurai, Siyang Yuan, Minhui Huang, Yiqun Liu, Golnaz Ghasemiesfeh, Yunchen Pu, Xinfeng Xie, Xingfeng He, Fangzhou Xu, Andrew Cui, Vidhoon Viswanathan, Yan Dong, Liang Xiong, Lin Yang, Liang Wang, Jiyan Yang, Chonglin Sun,
- Abstract summary: This paper presents Hierarchical Structured Neural Network (HSNN), a deployed jointly optimized hierarchical clustering and neural network model.
HSNN has been successfully deployed into the Ads Recommendation system and is currently handling major portion of the traffic.
The paper shares our experience in developing this system, dealing with challenges like freshness, volatility, cold start recommendations, cluster collapse and lessons deploying the model in a large scale retrieval production system.
- Score: 16.364477888946478
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Embedding Based Retrieval (EBR) is a crucial component of the retrieval stage in (Ads) Recommendation System that utilizes Two Tower or Siamese Networks to learn embeddings for both users and items (ads). It then employs an Approximate Nearest Neighbor Search (ANN) to efficiently retrieve the most relevant ads for a specific user. Despite the recent rise to popularity in the industry, they have a couple of limitations. Firstly, Two Tower model architecture uses a single dot product interaction which despite their efficiency fail to capture the data distribution in practice. Secondly, the centroid representation and cluster assignment, which are components of ANN, occur after the training process has been completed. As a result, they do not take into account the optimization criteria used for retrieval model. In this paper, we present Hierarchical Structured Neural Network (HSNN), a deployed jointly optimized hierarchical clustering and neural network model that can take advantage of sophisticated interactions and model architectures that are more common in the ranking stages while maintaining a sub-linear inference cost. We achieve 6.5% improvement in offline evaluation and also demonstrate 1.22% online gains through A/B experiments. HSNN has been successfully deployed into the Ads Recommendation system and is currently handling major portion of the traffic. The paper shares our experience in developing this system, dealing with challenges like freshness, volatility, cold start recommendations, cluster collapse and lessons deploying the model in a large scale retrieval production system.
Related papers
- Rethinking Large-scale Pre-ranking System: Entire-chain Cross-domain
Models [0.0]
Existing pre-ranking approaches mainly endure sample selection bias problem owing to ignoring the entire-chain data dependence.
We propose Entire-chain Cross-domain Models (ECM), which leverage samples from the whole cascaded stages to effectively alleviate SSB problem.
We also propose a fine-grained neural structure named ECMM to further improve the pre-ranking accuracy.
arXiv Detail & Related papers (2023-10-12T05:14:42Z) - Neural Attentive Circuits [93.95502541529115]
We introduce a general purpose, yet modular neural architecture called Neural Attentive Circuits (NACs)
NACs learn the parameterization and a sparse connectivity of neural modules without using domain knowledge.
NACs achieve an 8x speedup at inference time while losing less than 3% performance.
arXiv Detail & Related papers (2022-10-14T18:00:07Z) - NASRec: Weight Sharing Neural Architecture Search for Recommender
Systems [40.54254555949057]
We propose NASRec, a paradigm that trains a single supernet and efficiently produces abundant models/sub-architectures by weight sharing.
Our results on three Click-Through Rates (CTR) prediction benchmarks show that NASRec can outperform both manually designed models and existing NAS methods.
arXiv Detail & Related papers (2022-07-14T20:15:11Z) - Binary Early-Exit Network for Adaptive Inference on Low-Resource Devices [3.591566487849146]
Binary neural networks (BNNs) tackle the issue with extreme compression and speed-up gains compared to real-valued models.
We propose a simple but effective method to accelerate inference through unifying BNNs with an early-exiting strategy.
Our approach allows simple instances to exit early based on a decision threshold and utilizes output layers added to different intermediate layers to avoid executing the entire binary model.
arXiv Detail & Related papers (2022-06-17T22:11:11Z) - Learning Connectivity-Maximizing Network Configurations [123.01665966032014]
We propose a supervised learning approach with a convolutional neural network (CNN) that learns to place communication agents from an expert.
We demonstrate the performance of our CNN on canonical line and ring topologies, 105k randomly generated test cases, and larger teams not seen during training.
After training, our system produces connected configurations 2 orders of magnitude faster than the optimization-based scheme for teams of 10-20 agents.
arXiv Detail & Related papers (2021-12-14T18:59:01Z) - Learning-To-Ensemble by Contextual Rank Aggregation in E-Commerce [8.067201256886733]
We propose a new Learning-To-Ensemble framework RAEGO, which replaces the ensemble model with a contextual Rank Aggregator.
RA-EGO has been deployed in our online system and has improved the revenue significantly.
arXiv Detail & Related papers (2021-07-19T03:24:06Z) - CorDEL: A Contrastive Deep Learning Approach for Entity Linkage [70.82533554253335]
Entity linkage (EL) is a critical problem in data cleaning and integration.
With the ever-increasing growth of new data, deep learning (DL) based approaches have been proposed to alleviate the high cost of EL associated with the traditional models.
We argue that the twin-network architecture is sub-optimal to EL, leading to inherent drawbacks of existing models.
arXiv Detail & Related papers (2020-09-15T16:33:05Z) - Deep Retrieval: Learning A Retrievable Structure for Large-Scale
Recommendations [21.68175843347951]
We present Deep Retrieval (DR), to learn a retrievable structure directly with user-item interaction data.
DR is among the first non-ANN algorithms successfully deployed at the scale of hundreds of millions of items for industrial recommendation systems.
arXiv Detail & Related papers (2020-07-12T06:23:51Z) - DC-NAS: Divide-and-Conquer Neural Architecture Search [108.57785531758076]
We present a divide-and-conquer (DC) approach to effectively and efficiently search deep neural architectures.
We achieve a $75.1%$ top-1 accuracy on the ImageNet dataset, which is higher than that of state-of-the-art methods using the same search space.
arXiv Detail & Related papers (2020-05-29T09:02:16Z) - Binarizing MobileNet via Evolution-based Searching [66.94247681870125]
We propose a use of evolutionary search to facilitate the construction and training scheme when binarizing MobileNet.
Inspired by one-shot architecture search frameworks, we manipulate the idea of group convolution to design efficient 1-Bit Convolutional Neural Networks (CNNs)
Our objective is to come up with a tiny yet efficient binary neural architecture by exploring the best candidates of the group convolution.
arXiv Detail & Related papers (2020-05-13T13:25:51Z) - Learning to Hash with Graph Neural Networks for Recommender Systems [103.82479899868191]
Graph representation learning has attracted much attention in supporting high quality candidate search at scale.
Despite its effectiveness in learning embedding vectors for objects in the user-item interaction network, the computational costs to infer users' preferences in continuous embedding space are tremendous.
We propose a simple yet effective discrete representation learning framework to jointly learn continuous and discrete codes.
arXiv Detail & Related papers (2020-03-04T06:59:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.