Related papers: Approximate Nearest Neighbor Search under Neural Similarity Metric for Large-Scale Recommendation

Approximate Nearest Neighbor Search under Neural Similarity Metric for Large-Scale Recommendation

URL: http://arxiv.org/abs/2202.10226v1
Date: Mon, 14 Feb 2022 07:55:57 GMT
Title: Approximate Nearest Neighbor Search under Neural Similarity Metric for Large-Scale Recommendation
Authors: Rihan Chen, Bin Liu, Han Zhu, Yaoxuan Wang, Qi Li, Buting Ma, Qingbo Hua, Jun Jiang, Yunlong Xu, Hongbo Deng, Bo Zheng
Abstract summary: We propose a novel method to extend ANN search to arbitrary matching functions. Our main idea is to perform a greedy walk with a matching function in a similarity graph constructed from all items. The proposed method has been fully deployed in the Taobao display advertising platform and brings a considerable advertising revenue increase.
Score: 20.42993976179691
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Model-based methods for recommender systems have been studied extensively for years. Modern recommender systems usually resort to 1) representation learning models which define user-item preference as the distance between their embedding representations, and 2) embedding-based Approximate Nearest Neighbor (ANN) search to tackle the efficiency problem introduced by large-scale corpus. While providing efficient retrieval, the embedding-based retrieval pattern also limits the model capacity since the form of user-item preference measure is restricted to the distance between their embedding representations. However, for other more precise user-item preference measures, e.g., preference scores directly derived from a deep neural network, they are computationally intractable because of the lack of an efficient retrieval method, and an exhaustive search for all user-item pairs is impractical. In this paper, we propose a novel method to extend ANN search to arbitrary matching functions, e.g., a deep neural network. Our main idea is to perform a greedy walk with a matching function in a similarity graph constructed from all items. To solve the problem that the similarity measures of graph construction and user-item matching function are heterogeneous, we propose a pluggable adversarial training task to ensure the graph search with arbitrary matching function can achieve fairly high precision. Experimental results in both open source and industry datasets demonstrate the effectiveness of our method. The proposed method has been fully deployed in the Taobao display advertising platform and brings a considerable advertising revenue increase. We also summarize our detailed experiences in deployment in this paper.

Related papers

Retrieval with Learned Similarities [2.729516456192901]
State-of-the-art retrieval algorithms have migrated to learned similarities. We show that Mixture-of-Logits (MoL) can be realized empirically to achieve superior performance on diverse retrieval scenarios.
arXiv Detail & Related papers (2024-07-22T08:19:34Z)
Beyond Two-Tower Matching: Learning Sparse Retrievable Cross-Interactions for Recommendation [80.19762472699814]
Two-tower models are a prevalent matching framework for recommendation, which have been widely deployed in industrial applications. It suffers two main challenges, including limited feature interaction capability and reduced accuracy in online serving. We propose a new matching paradigm named SparCode, which supports not only sophisticated feature interactions but also efficient retrieval.
arXiv Detail & Related papers (2023-11-30T03:13:36Z)
Compactness Score: A Fast Filter Method for Unsupervised Feature Selection [66.84571085643928]
We propose a fast unsupervised feature selection method, named as, Compactness Score (CSUFS) to select desired features. Our proposed algorithm seems to be more accurate and efficient compared with existing algorithms.
arXiv Detail & Related papers (2022-01-31T13:01:37Z)
Embracing Structure in Data for Billion-Scale Semantic Product Search [14.962039276966319]
We present principled approaches to train and deploy dyadic neural embedding models at the billion scale. We show that exploiting the natural structure of real-world datasets helps address both challenges efficiently.
arXiv Detail & Related papers (2021-10-12T16:14:13Z)
A concise method for feature selection via normalized frequencies [0.0]
In this paper, a concise method is proposed for universal feature selection. The proposed method uses a fusion of the filter method and the wrapper method, rather than a combination of them. The evaluation results show that the proposed method outperformed several state-of-the-art related works in terms of accuracy, precision, recall, F-score and AUC.
arXiv Detail & Related papers (2021-06-10T15:29:54Z)
Integrating Semantics and Neighborhood Information with Graph-Driven Generative Models for Document Retrieval [51.823187647843945]
In this paper, we encode the neighborhood information with a graph-induced Gaussian distribution, and propose to integrate the two types of information with a graph-driven generative model. Under the approximation, we prove that the training objective can be decomposed into terms involving only singleton or pairwise documents, enabling the model to be trained as efficiently as uncorrelated ones.
arXiv Detail & Related papers (2021-05-27T11:29:03Z)
IRLI: Iterative Re-partitioning for Learning to Index [104.72641345738425]
Methods have to trade between obtaining high accuracy while maintaining load balance and scalability in distributed settings. We propose a novel approach called IRLI, which iteratively partitions the items by learning the relevant buckets directly from the query-item relevance data. We mathematically show that IRLI retrieves the correct item with high probability under very natural assumptions and provides superior load balancing.
arXiv Detail & Related papers (2021-03-17T23:13:25Z)
Diverse Knowledge Distillation for End-to-End Person Search [81.4926655119318]
Person search aims to localize and identify a specific person from a gallery of images. Recent methods can be categorized into two groups, i.e., two-step and end-to-end approaches. We propose a simple yet strong end-to-end network with diverse knowledge distillation to break the bottleneck.
arXiv Detail & Related papers (2020-12-21T09:04:27Z)
Adversarial Examples for $k$-Nearest Neighbor Classifiers Based on Higher-Order Voronoi Diagrams [69.4411417775822]
Adversarial examples are a widely studied phenomenon in machine learning models. We propose an algorithm for evaluating the adversarial robustness of $k$-nearest neighbor classification.
arXiv Detail & Related papers (2020-11-19T08:49:10Z)
NASE: Learning Knowledge Graph Embedding for Link Prediction via Neural Architecture Search [9.634626241415916]
Link prediction is the task of predicting missing connections between entities in the knowledge graph (KG) Previous work has tried to use Automated Machine Learning (AutoML) to search for the best model for a given dataset. We propose a novel Neural Architecture Search (NAS) framework for the link prediction task.
arXiv Detail & Related papers (2020-08-18T03:34:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.