Beyond Two-Tower Matching: Learning Sparse Retrievable
Cross-Interactions for Recommendation
- URL: http://arxiv.org/abs/2311.18213v1
- Date: Thu, 30 Nov 2023 03:13:36 GMT
- Title: Beyond Two-Tower Matching: Learning Sparse Retrievable
Cross-Interactions for Recommendation
- Authors: Liangcai Su, Fan Yan, Jieming Zhu, Xi Xiao, Haoyi Duan, Zhou Zhao,
Zhenhua Dong, Ruiming Tang
- Abstract summary: Two-tower models are a prevalent matching framework for recommendation, which have been widely deployed in industrial applications.
However, they suffer from two main challenges: limited feature interaction capability and reduced accuracy in online serving.
We propose a new matching paradigm named SparCode, which supports not only sophisticated feature interactions but also efficient retrieval.
- Score: 80.19762472699814
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Two-tower models are a prevalent matching framework for recommendation, which
have been widely deployed in industrial applications. The success of two-tower
matching is attributed to its efficiency in retrieval among a large number of
items, since the item tower can be precomputed and used for fast Approximate
Nearest Neighbor (ANN) search. However, it suffers from two main challenges,
including limited feature interaction capability and reduced accuracy in online
serving. Existing approaches attempt to design novel late interactions instead
of dot products, but they still fail to support complex feature interactions or
lose retrieval efficiency. To address these challenges, we propose a new
matching paradigm named SparCode, which supports not only sophisticated feature
interactions but also efficient retrieval. Specifically, SparCode introduces an
all-to-all interaction module to model fine-grained query-item interactions.
Besides, we design a discrete code-based sparse inverted index jointly trained
with the model to achieve effective and efficient model inference. Extensive
experiments have been conducted on open benchmark datasets to demonstrate the
superiority of our framework. The results show that SparCode significantly
improves the accuracy of candidate item matching while retaining the same level
of retrieval efficiency as two-tower models. Our source code will be
available at MindSpore/models.
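
To make the contrast in the abstract concrete, the following is a minimal toy sketch, not the authors' implementation. It compares a two-tower dot-product lookup with a code-based sparse inverted index. Every name in it (`interaction_score`, `build_sparse_index`, `retrieve`, the tiny codebook and item vectors) is a hypothetical stand-in. The `tanh` scorer is only a placeholder for a learned interaction module. The abstract describes the index as jointly trained with the model, whereas here the codebook is fixed by hand.

```python
import math
from collections import defaultdict


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def two_tower_top_k(query_vec, item_vecs, k):
    """Baseline: rank items by a dot product with a single query embedding."""
    scores = {item: dot(query_vec, vec) for item, vec in item_vecs.items()}
    return sorted(scores, key=lambda i: (-scores[i], i))[:k]


def nearest_code(vec, codebook):
    """Map a query-side vector to the index of its closest codeword (squared L2)."""
    return min(
        range(len(codebook)),
        key=lambda c: sum((a - b) ** 2 for a, b in zip(vec, codebook[c])),
    )


def interaction_score(code_vec, item_vec):
    """Hypothetical placeholder for a learned code-item interaction scorer."""
    return math.tanh(dot(code_vec, item_vec))


def build_sparse_index(codebook, item_vecs, keep):
    """Offline step: for each code, cache scores for its `keep` best items only."""
    index = {}
    for c, code_vec in enumerate(codebook):
        scored = [(interaction_score(code_vec, v), item) for item, v in item_vecs.items()]
        scored.sort(key=lambda p: (-p[0], p[1]))
        index[c] = scored[:keep]
    return index


def retrieve(query_tokens, codebook, index, k):
    """Online step: turn query tokens into codes and sum the cached scores."""
    totals = defaultdict(float)
    for tok in query_tokens:
        for score, item in index[nearest_code(tok, codebook)]:
            totals[item] += score
    return sorted(totals, key=lambda i: (-totals[i], i))[:k]


# Toy data (assumed for illustration only).
codebook = [[1.0, 0.0], [0.0, 1.0]]
item_vecs = {"a": [0.9, 0.1], "b": [0.1, 0.9], "c": [0.5, 0.5], "d": [-0.5, -0.5]}
index = build_sparse_index(codebook, item_vecs, keep=2)

print(two_tower_top_k([1.0, 0.0], item_vecs, k=2))
print(retrieve([[0.8, 0.2], [0.3, 0.7]], codebook, index, k=2))
```

In this sketch the online path never evaluates the scorer: it performs only code lookups and additions over short posting lists. That is one way a non-dot-product interaction could stay cheap at serving time.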
Related papers
- List-aware Reranking-Truncation Joint Model for Search and
Retrieval-augmented Generation [80.12531449946655]
We propose a Reranking-Truncation joint model (GenRT) that can perform reranking and truncation concurrently.
GenRT integrates reranking and truncation via a generative paradigm based on an encoder-decoder architecture.
Our method achieves SOTA performance on both reranking and truncation tasks for web search and retrieval-augmented LLMs.
arXiv Detail & Related papers (2024-02-05T06:52:53Z)
- Efficient and Joint Hyperparameter and Architecture Search for
Collaborative Filtering [31.25094171513831]
We propose a two-stage search algorithm for Collaborative Filtering models.
In the first stage, we leverage knowledge from subsampled datasets to reduce evaluation costs.
In the second stage, we efficiently fine-tune top candidate models on the whole dataset.
arXiv Detail & Related papers (2023-07-12T10:56:25Z)
- Single-Stage Visual Relationship Learning using Conditional Queries [60.90880759475021]
TraCQ is a new formulation for scene graph generation that avoids the multi-task learning problem and the entity pair distribution.
We also employ DETR-based encoder-decoder conditional queries to significantly reduce the entity label space.
Experimental results show that TraCQ not only outperforms existing single-stage scene graph generation methods but also beats many state-of-the-art two-stage methods on the Visual Genome dataset.
arXiv Detail & Related papers (2023-06-09T06:02:01Z)
- ECO-TR: Efficient Correspondences Finding Via Coarse-to-Fine Refinement [80.94378602238432]
We propose an efficient structure named Correspondence Efficient Transformer (ECO-TR) by finding correspondences in a coarse-to-fine manner.
To achieve this, multiple transformer blocks are connected stage by stage to gradually refine the predicted coordinates.
Experiments on various sparse and dense matching tasks demonstrate the superiority of our method in both efficiency and effectiveness over existing state-of-the-art methods.
arXiv Detail & Related papers (2022-09-25T13:05:33Z)
- Approximate Nearest Neighbor Search under Neural Similarity Metric for
Large-Scale Recommendation [20.42993976179691]
We propose a novel method to extend ANN search to arbitrary matching functions.
Our main idea is to perform a greedy walk with a matching function in a similarity graph constructed from all items.
The proposed method has been fully deployed in the Taobao display advertising platform and brings a considerable advertising revenue increase.
arXiv Detail & Related papers (2022-02-14T07:55:57Z)
- Building an Efficient and Effective Retrieval-based Dialogue System via
Mutual Learning [27.04857039060308]
We propose to combine the best of both worlds to build a retrieval system.
We employ a fast bi-encoder to replace the traditional feature-based pre-retrieval model.
We train the pre-retrieval model and the re-ranking model at the same time via mutual learning.
arXiv Detail & Related papers (2021-10-01T01:32:33Z)
- Retrieve Fast, Rerank Smart: Cooperative and Joint Approaches for
Improved Cross-Modal Retrieval [80.35589927511667]
Current state-of-the-art approaches to cross-modal retrieval process text and visual input jointly, relying on Transformer-based architectures with cross-attention mechanisms that attend over all words and objects in an image.
We propose a novel fine-tuning framework which turns any pretrained text-image multi-modal model into an efficient retrieval model.
Our experiments on a series of standard cross-modal retrieval benchmarks in monolingual, multilingual, and zero-shot setups demonstrate improved accuracy and huge efficiency benefits over state-of-the-art cross-encoders.
arXiv Detail & Related papers (2021-03-22T15:08:06Z)
- Learning to Match Jobs with Resumes from Sparse Interaction Data using
Multi-View Co-Teaching Network [83.64416937454801]
Job-resume interaction data is sparse and noisy, which affects the performance of job-resume match algorithms.
We propose a novel multi-view co-teaching network from sparse interaction data for job-resume matching.
Our model is able to outperform state-of-the-art methods for job-resume matching.
arXiv Detail & Related papers (2020-09-25T03:09:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.