Approximate Nearest Neighbor Search under Neural Similarity Metric for
Large-Scale Recommendation
- URL: http://arxiv.org/abs/2202.10226v1
- Date: Mon, 14 Feb 2022 07:55:57 GMT
- Title: Approximate Nearest Neighbor Search under Neural Similarity Metric for
Large-Scale Recommendation
- Authors: Rihan Chen, Bin Liu, Han Zhu, Yaoxuan Wang, Qi Li, Buting Ma, Qingbo
Hua, Jun Jiang, Yunlong Xu, Hongbo Deng, Bo Zheng
- Abstract summary: We propose a novel method to extend ANN search to arbitrary matching functions.
Our main idea is to perform a greedy walk with a matching function in a similarity graph constructed from all items.
The proposed method has been fully deployed in the Taobao display advertising platform and brings a considerable advertising revenue increase.
- Score: 20.42993976179691
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model-based methods for recommender systems have been studied extensively for
years. Modern recommender systems usually resort to 1) representation learning
models which define user-item preference as the distance between their
embedding representations, and 2) embedding-based Approximate Nearest Neighbor
(ANN) search to tackle the efficiency problem introduced by large-scale corpus.
While providing efficient retrieval, the embedding-based retrieval pattern also
limits the model capacity since the form of user-item preference measure is
restricted to the distance between their embedding representations. However,
for other more precise user-item preference measures, e.g., preference scores
directly derived from a deep neural network, they are computationally
intractable because of the lack of an efficient retrieval method, and an
exhaustive search for all user-item pairs is impractical. In this paper, we
propose a novel method to extend ANN search to arbitrary matching functions,
e.g., a deep neural network. Our main idea is to perform a greedy walk with a
matching function in a similarity graph constructed from all items. To solve
the problem that the similarity measures of graph construction and user-item
matching function are heterogeneous, we propose a pluggable adversarial
training task to ensure the graph search with arbitrary matching function can
achieve fairly high precision. Experimental results in both open source and
industry datasets demonstrate the effectiveness of our method. The proposed
method has been fully deployed in the Taobao display advertising platform and
brings a considerable advertising revenue increase. We also summarize our
detailed experiences in deployment in this paper.
Related papers
- Retrieval with Learned Similarities [2.729516456192901]
State-of-the-art retrieval algorithms have migrated to learned similarities.
We show that Mixture-of-Logits (MoL) can be realized empirically to achieve superior performance on diverse retrieval scenarios.
arXiv Detail & Related papers (2024-07-22T08:19:34Z) - Beyond Two-Tower Matching: Learning Sparse Retrievable
Cross-Interactions for Recommendation [80.19762472699814]
Two-tower models are a prevalent matching framework for recommendation, which have been widely deployed in industrial applications.
It suffers two main challenges, including limited feature interaction capability and reduced accuracy in online serving.
We propose a new matching paradigm named SparCode, which supports not only sophisticated feature interactions but also efficient retrieval.
arXiv Detail & Related papers (2023-11-30T03:13:36Z) - Compactness Score: A Fast Filter Method for Unsupervised Feature
Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named as, Compactness Score (CSUFS) to select desired features.
Our proposed algorithm seems to be more accurate and efficient compared with existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z) - Embracing Structure in Data for Billion-Scale Semantic Product Search [14.962039276966319]
We present principled approaches to train and deploy dyadic neural embedding models at the billion scale.
We show that exploiting the natural structure of real-world datasets helps address both challenges efficiently.
arXiv Detail & Related papers (2021-10-12T16:14:13Z) - A concise method for feature selection via normalized frequencies [0.0]
In this paper, a concise method is proposed for universal feature selection.
The proposed method uses a fusion of the filter method and the wrapper method, rather than a combination of them.
The evaluation results show that the proposed method outperformed several state-of-the-art related works in terms of accuracy, precision, recall, F-score and AUC.
arXiv Detail & Related papers (2021-06-10T15:29:54Z) - Integrating Semantics and Neighborhood Information with Graph-Driven
Generative Models for Document Retrieval [51.823187647843945]
In this paper, we encode the neighborhood information with a graph-induced Gaussian distribution, and propose to integrate the two types of information with a graph-driven generative model.
Under the approximation, we prove that the training objective can be decomposed into terms involving only singleton or pairwise documents, enabling the model to be trained as efficiently as uncorrelated ones.
arXiv Detail & Related papers (2021-05-27T11:29:03Z) - IRLI: Iterative Re-partitioning for Learning to Index [104.72641345738425]
Methods have to trade between obtaining high accuracy while maintaining load balance and scalability in distributed settings.
We propose a novel approach called IRLI, which iteratively partitions the items by learning the relevant buckets directly from the query-item relevance data.
We mathematically show that IRLI retrieves the correct item with high probability under very natural assumptions and provides superior load balancing.
arXiv Detail & Related papers (2021-03-17T23:13:25Z) - Diverse Knowledge Distillation for End-to-End Person Search [81.4926655119318]
Person search aims to localize and identify a specific person from a gallery of images.
Recent methods can be categorized into two groups, i.e., two-step and end-to-end approaches.
We propose a simple yet strong end-to-end network with diverse knowledge distillation to break the bottleneck.
arXiv Detail & Related papers (2020-12-21T09:04:27Z) - Adversarial Examples for $k$-Nearest Neighbor Classifiers Based on
Higher-Order Voronoi Diagrams [69.4411417775822]
Adversarial examples are a widely studied phenomenon in machine learning models.
We propose an algorithm for evaluating the adversarial robustness of $k$-nearest neighbor classification.
arXiv Detail & Related papers (2020-11-19T08:49:10Z) - NASE: Learning Knowledge Graph Embedding for Link Prediction via Neural
Architecture Search [9.634626241415916]
Link prediction is the task of predicting missing connections between entities in the knowledge graph (KG)
Previous work has tried to use Automated Machine Learning (AutoML) to search for the best model for a given dataset.
We propose a novel Neural Architecture Search (NAS) framework for the link prediction task.
arXiv Detail & Related papers (2020-08-18T03:34:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.