Deep Retrieval: Learning A Retrievable Structure for Large-Scale
Recommendations
- URL: http://arxiv.org/abs/2007.07203v2
- Date: Tue, 18 May 2021 05:45:30 GMT
- Title: Deep Retrieval: Learning A Retrievable Structure for Large-Scale
Recommendations
- Authors: Weihao Gao, Xiangjun Fan, Chong Wang, Jiankai Sun, Kai Jia, Wenzhi
Xiao, Ruofan Ding, Xingyan Bin, Hui Yang, Xiaobing Liu
- Abstract summary: We present Deep Retrieval (DR), which learns a retrievable structure directly from user-item interaction data.
DR is among the first non-ANN algorithms successfully deployed at the scale of hundreds of millions of items for industrial recommendation systems.
- Score: 21.68175843347951
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the core problems in large-scale recommendations is to retrieve top
relevant candidates accurately and efficiently, preferably in sub-linear time.
Previous approaches are mostly based on a two-step procedure: first learn an
inner-product model, and then use some approximate nearest neighbor (ANN)
search algorithm to find top candidates. In this paper, we present Deep
Retrieval (DR), which learns a retrievable structure directly from user-item
interaction data (e.g., clicks) without resorting to the Euclidean space
assumption made by ANN algorithms. DR's structure encodes all candidate items into a
discrete latent space. Those latent codes for the candidates are model
parameters and are learnt together with the other neural network parameters to maximize
the same objective function. With the model learnt, a beam search over the
structure is performed to retrieve the top candidates for reranking.
Empirically, we first demonstrate that DR, with sub-linear computational
complexity, can achieve almost the same accuracy as the brute-force baseline on
two public datasets. Moreover, we show that, in a live production
recommendation system, a deployed DR approach significantly outperforms a
well-tuned ANN baseline in terms of engagement metrics. To the best of our
knowledge, DR is among the first non-ANN algorithms successfully deployed at
the scale of hundreds of millions of items for industrial recommendation
systems.
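The retrieval procedure described in the abstract (a layered discrete structure plus beam search, followed by reranking) can be illustrated with a small amount of code. The sketch below is a minimal toy, assuming a structure of D layers with K nodes per layer, a stand-in per-layer scoring model, and a toy item-to-path index; the names, sizes, and scoring function are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

# Toy sizes: depth of the structure, nodes per layer, beam width.
D, K, BEAM = 3, 8, 4
EMB_DIM = 16
rng = np.random.default_rng(0)

# Stand-in for the learned model: one linear scorer per layer. In DR these
# scores come from a neural network trained jointly with the item-to-path
# assignments; here they are random and only serve to make the sketch run.
W = [rng.normal(size=(EMB_DIM, K)) for _ in range(D)]

def layer_log_probs(user_emb, depth, prefix):
    """Log-probabilities over the K nodes at `depth` given the path prefix."""
    logits = user_emb @ W[depth]
    # A real model would also condition on `prefix`; omitted for brevity.
    return logits - np.logaddexp.reduce(logits)

def beam_search(user_emb, beam=BEAM):
    """Return the top-`beam` paths (tuples of node ids) and their log-scores."""
    beams = [((), 0.0)]
    for depth in range(D):
        expanded = []
        for prefix, score in beams:
            lp = layer_log_probs(user_emb, depth, prefix)
            for node in range(K):
                expanded.append((prefix + (node,), score + lp[node]))
        expanded.sort(key=lambda x: x[1], reverse=True)
        beams = expanded[:beam]
    return beams

# Toy item-to-path index: each item was assigned one latent path (its discrete
# code) during training; retrieval inverts this mapping for the paths found by
# beam search and hands the resulting candidates to a reranker.
item_paths = {item: tuple(int(c) for c in rng.integers(0, K, size=D))
              for item in range(1000)}
path_to_items = {}
for item, path in item_paths.items():
    path_to_items.setdefault(path, []).append(item)

user_emb = rng.normal(size=EMB_DIM)
candidates = []
for path, _ in beam_search(user_emb):
    candidates.extend(path_to_items.get(path, []))
print(f"retrieved {len(candidates)} candidates for reranking")
```

The point of the sketch is the complexity argument: beam search visits a number of nodes bounded by the depth times the beam width times K, independent of catalogue size, which is how a structure of this kind retrieves in sub-linear time relative to the number of items.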
Related papers
- Discovering Data Structures: Nearest Neighbor Search and Beyond [18.774836778996544]
We propose a general framework for end-to-end learning of data structures.
Our framework adapts to the underlying data distribution and provides fine-grained control over query and space complexity.
We first apply this framework to the problem of nearest neighbor search.
arXiv Detail & Related papers (2024-11-05T16:50:54Z)
- Approximate Nearest Neighbour Search on Dynamic Datasets: An Investigation [20.409659920455955]
Approximate k-Nearest Neighbour (ANN) methods are often used for mining information and aiding machine learning on large scale high-dimensional datasets.
For applications with static datasets, runtime constraints and dataset properties can be used to empirically select an ANN method with suitable operating characteristics.
Traditional evaluation approaches do not consider the computational costs of updating the index structure, nor the rate and size of index updates.
arXiv Detail & Related papers (2024-04-30T06:21:44Z)
- A Weighted K-Center Algorithm for Data Subset Selection [70.49696246526199]
Subset selection is a fundamental problem that can play a key role in identifying smaller portions of the training data.
We develop a novel factor 3-approximation algorithm to compute subsets based on the weighted sum of both k-center and uncertainty sampling objective functions.
arXiv Detail & Related papers (2023-12-17T04:41:07Z)
- Learning To Dive In Branch And Bound [95.13209326119153]
We propose L2Dive to learn specific diving heuristics with graph neural networks.
We train generative models to predict variable assignments and leverage the duality of linear programs to make diving decisions.
arXiv Detail & Related papers (2023-01-24T12:01:45Z)
- On the Generalizability and Predictability of Recommender Systems [33.46314108814183]
We give the first large-scale study of recommender system approaches.
We create Reczilla, a meta-learning approach to recommender systems.
arXiv Detail & Related papers (2022-06-23T17:51:42Z)
- Approximate Nearest Neighbor Search under Neural Similarity Metric for Large-Scale Recommendation [20.42993976179691]
We propose a novel method to extend ANN search to arbitrary matching functions.
Our main idea is to perform a greedy walk with a matching function in a similarity graph constructed from all items; a minimal sketch of such a walk appears after this list.
The proposed method has been fully deployed in the Taobao display advertising platform and brings a considerable advertising revenue increase.
arXiv Detail & Related papers (2022-02-14T07:55:57Z)
- Learning to Hash Robustly, with Guarantees [79.68057056103014]
In this paper, we design an NNS algorithm for the Hamming space that has worst-case guarantees essentially matching that of theoretical algorithms.
We evaluate the algorithm's ability to optimize for a given dataset both theoretically and practically.
Our algorithm achieves 1.8x and 2.1x better recall on the worst-performing queries on the MNIST and ImageNet datasets, respectively.
arXiv Detail & Related papers (2021-08-11T20:21:30Z)
- Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models.
The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning.
We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal heuristic for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
- Online Model Selection for Reinforcement Learning with Function Approximation [50.008542459050155]
We present a meta-algorithm that adapts to the optimal complexity with $\tilde{O}(L^{5/6} T^{2/3})$ regret.
We also show that the meta-algorithm automatically admits significantly improved instance-dependent regret bounds.
arXiv Detail & Related papers (2020-11-19T10:00:54Z)
- Exploration in two-stage recommender systems [79.50534282841618]
Two-stage recommender systems are widely adopted in industry due to their scalability and maintainability.
A key challenge of this setup is that optimal performance of each stage in isolation does not imply optimal global performance.
We propose a method of synchronising the exploration strategies between the ranker and the nominators.
arXiv Detail & Related papers (2020-09-01T16:52:51Z)
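For the greedy-walk entry above (Approximate Nearest Neighbor Search under Neural Similarity Metric for Large-Scale Recommendation), here is a minimal sketch of walking a similarity graph under an arbitrary matching function. The graph construction, the `match` function, and all sizes are illustrative assumptions, not the cited paper's implementation; the sketch only shows the control flow of a greedy walk that stops at a local optimum of the matching score.

```python
import numpy as np

rng = np.random.default_rng(1)
n_items, dim, k_neighbors = 500, 16, 8
items = rng.normal(size=(n_items, dim))

# Build a crude k-NN similarity graph by brute-force inner product (toy only).
sims = items @ items.T
np.fill_diagonal(sims, -np.inf)   # exclude self-edges
graph = {i: [int(j) for j in np.argsort(-sims[i])[:k_neighbors]]
         for i in range(n_items)}

def match(user, item_vec):
    """Stand-in matching function (a dot product here); the point is that it
    could be replaced by any learned scoring model."""
    return float(user @ item_vec)

def greedy_walk(user, entry=0, max_steps=100):
    """Move to the best-scoring neighbour until no neighbour improves the score."""
    current, best = entry, match(user, items[entry])
    for _ in range(max_steps):
        score, nxt = max((match(user, items[j]), j) for j in graph[current])
        if score <= best:          # local optimum under the matching function
            return current
        current, best = nxt, score
    return current

user = rng.normal(size=dim)
print("retrieved item:", greedy_walk(user))
```

In practice one would keep the best few nodes seen along the walk, or run several walks from different entry points, rather than return a single local optimum; the single-walk version above keeps the sketch short.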