Related papers: Revisiting Neural Retrieval on Accelerators

Revisiting Neural Retrieval on Accelerators

URL: http://arxiv.org/abs/2306.04039v1
Date: Tue, 6 Jun 2023 22:08:42 GMT
Title: Revisiting Neural Retrieval on Accelerators
Authors: Jiaqi Zhai, Zhaojie Gong, Yueming Wang, Xiao Sun, Zheng Yan, Fu Li, Xing Liu
Abstract summary: A key component of retrieval is to model (user, item) similarity. Despite its popularity, dot products cannot capture complex user-item interactions, which are multifaceted and likely high rank. We propose textitmixture of logits (MoL), which models (user, item) similarity as an adaptive composition of elementary similarity functions.
Score: 20.415728886298915
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Retrieval finds a small number of relevant candidates from a large corpus for information retrieval and recommendation applications. A key component of retrieval is to model (user, item) similarity, which is commonly represented as the dot product of two learned embeddings. This formulation permits efficient inference, commonly known as Maximum Inner Product Search (MIPS). Despite its popularity, dot products cannot capture complex user-item interactions, which are multifaceted and likely high rank. We hence examine non-dot-product retrieval settings on accelerators, and propose \textit{mixture of logits} (MoL), which models (user, item) similarity as an adaptive composition of elementary similarity functions. This new formulation is expressive, capable of modeling high rank (user, item) interactions, and further generalizes to the long tail. When combined with a hierarchical retrieval strategy, \textit{h-indexer}, we are able to scale up MoL to 100M corpus on a single GPU with latency comparable to MIPS baselines. On public datasets, our approach leads to uplifts of up to 77.3\% in hit rate (HR). Experiments on a large recommendation surface at Meta showed strong metric gains and reduced popularity bias, validating the proposed approach's performance and improved generalization.

Related papers

OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment [9.99840965933561]
We propose OneRec, which replaces the cascaded learning framework with a unified generative model. OneRec includes: 1) an encoder-decoder structure, which encodes the user's historical behavior sequences and gradually decodes the videos that the user may be interested in.
arXiv Detail & Related papers (2025-02-26T09:25:10Z)
Unifying Generative and Dense Retrieval for Sequential Recommendation [37.402860622707244]
We propose LIGER, a hybrid model that combines the strengths of sequential dense retrieval and generative retrieval. LIGER integrates sequential dense retrieval into generative retrieval, mitigating performance differences and enhancing cold-start item recommendation. This hybrid approach provides insights into the trade-offs between these approaches and demonstrates improvements in efficiency and effectiveness for recommendation systems in small-scale benchmarks.
arXiv Detail & Related papers (2024-11-27T23:36:59Z)
Improved Diversity-Promoting Collaborative Metric Learning for Recommendation [127.08043409083687]
Collaborative Metric Learning (CML) has recently emerged as a popular method in recommendation systems. This paper focuses on a challenging scenario where a user has multiple categories of interests. We propose a novel method called textitDiversity-Promoting Collaborative Metric Learning (DPCML)
arXiv Detail & Related papers (2024-09-02T07:44:48Z)
Retrieval with Learned Similarities [2.729516456192901]
State-of-the-art retrieval algorithms have migrated to learned similarities. We show that Mixture-of-Logits (MoL) can be realized empirically to achieve superior performance on diverse retrieval scenarios.
arXiv Detail & Related papers (2024-07-22T08:19:34Z)
ACE: A Generative Cross-Modal Retrieval Framework with Coarse-To-Fine Semantic Modeling [53.97609687516371]
We propose a pioneering generAtive Cross-modal rEtrieval framework (ACE) for end-to-end cross-modal retrieval. ACE achieves state-of-the-art performance in cross-modal retrieval and outperforms the strong baselines on Recall@1 by 15.27% on average.
arXiv Detail & Related papers (2024-06-25T12:47:04Z)
Optimizing Novelty of Top-k Recommendations using Large Language Models and Reinforcement Learning [16.287067991245962]
In real-world systems, an important consideration for a new model is novelty of its top-k recommendations. We propose a reinforcement learning (RL) formulation where large language models provide feedback for the novel items. We evaluate the proposed algorithm on improving novelty for a query-ad recommendation task on a large-scale search engine.
arXiv Detail & Related papers (2024-06-20T10:20:02Z)
MMGRec: Multimodal Generative Recommendation with Transformer Model [81.61896141495144]
MMGRec aims to introduce a generative paradigm into multimodal recommendation. We first devise a hierarchical quantization method Graph CF-RQVAE to assign Rec-ID for each item from its multimodal information. We then train a Transformer-based recommender to generate the Rec-IDs of user-preferred items based on historical interaction sequences.
arXiv Detail & Related papers (2024-04-25T12:11:27Z)
Learnable Pillar-based Re-ranking for Image-Text Retrieval [119.9979224297237]
Image-text retrieval aims to bridge the modality gap and retrieve cross-modal content based on semantic similarities. Re-ranking, a popular post-processing practice, has revealed the superiority of capturing neighbor relations in single-modality retrieval tasks. We propose a novel learnable pillar-based re-ranking paradigm for image-text retrieval.
arXiv Detail & Related papers (2023-04-25T04:33:27Z)
kNN-Embed: Locally Smoothed Embedding Mixtures For Multi-interest Candidate Retrieval [7.681386867564213]
kNN-Embed represents each user as a smoothed mixture over learned item clusters that represent distinct "interests" of the user. We experimentally compare kNN-Embed to standard ANN candidate retrieval, and show significant improvements in overall recall and improved diversity across three datasets.
arXiv Detail & Related papers (2022-05-12T16:42:24Z)
Learning Personalized Item-to-Item Recommendation Metric via Implicit Feedback [24.37151414523712]
This paper studies the item-to-item recommendation problem in recommender systems from a new perspective of metric learning via implicit feedback. We develop and investigate a personalizable deep metric model that captures both the internal contents of items and how they were interacted with by users.
arXiv Detail & Related papers (2022-03-18T18:08:57Z)
Approximate Nearest Neighbor Search under Neural Similarity Metric for Large-Scale Recommendation [20.42993976179691]
We propose a novel method to extend ANN search to arbitrary matching functions. Our main idea is to perform a greedy walk with a matching function in a similarity graph constructed from all items. The proposed method has been fully deployed in the Taobao display advertising platform and brings a considerable advertising revenue increase.
arXiv Detail & Related papers (2022-02-14T07:55:57Z)
IRLI: Iterative Re-partitioning for Learning to Index [104.72641345738425]
Methods have to trade between obtaining high accuracy while maintaining load balance and scalability in distributed settings. We propose a novel approach called IRLI, which iteratively partitions the items by learning the relevant buckets directly from the query-item relevance data. We mathematically show that IRLI retrieves the correct item with high probability under very natural assumptions and provides superior load balancing.
arXiv Detail & Related papers (2021-03-17T23:13:25Z)
Sparse-Interest Network for Sequential Recommendation [78.83064567614656]
We propose a novel textbfSparse textbfInterest textbfNEtwork (SINE) for sequential recommendation. Our sparse-interest module can adaptively infer a sparse set of concepts for each user from the large concept pool. SINE can achieve substantial improvement over state-of-the-art methods.
arXiv Detail & Related papers (2021-02-18T11:03:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.