Zero-Shot Heterogeneous Transfer Learning from Recommender Systems to
Cold-Start Search Retrieval
- URL: http://arxiv.org/abs/2008.02930v2
- Date: Wed, 19 Aug 2020 00:16:40 GMT
- Title: Zero-Shot Heterogeneous Transfer Learning from Recommender Systems to
Cold-Start Search Retrieval
- Authors: Tao Wu, Ellie Ka-In Chio, Heng-Tze Cheng, Yu Du, Steffen Rendle, Dima
Kuzmin, Ritesh Agarwal, Li Zhang, John Anderson, Sarvjeet Singh, Tushar
Chandra, Ed H. Chi, Wen Li, Ankit Kumar, Xiang Ma, Alex Soares, Nitin Jindal,
Pei Cao
- Abstract summary: We propose a new Zero-Shot Heterogeneous Transfer Learning framework that transfers learned knowledge from the recommender system component to improve the search component of a content platform.
We conduct online and offline experiments on one of the world's largest search and recommender systems from Google, and present the results and lessons learned.
- Score: 30.95373255143698
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many recent advances in neural information retrieval models, which predict
top-K items given a query, learn directly from a large training set of (query,
item) pairs. However, they often fall short when there are many
previously unseen (query, item) combinations, commonly referred to as the cold
start problem. Furthermore, the search system can be biased towards items that
were frequently shown for a query in the past, also known as the 'rich get richer'
(a.k.a. feedback loop) problem. In light of these problems, we observed that
most online content platforms have both a search and a recommender system that,
while having heterogeneous input spaces, can be connected through their common
output item space and a shared semantic representation. In this paper, we
propose a new Zero-Shot Heterogeneous Transfer Learning framework that
transfers learned knowledge from the recommender system component to improve
the search component of a content platform. First, it learns representations of
items and their natural-language features by predicting (item, item)
correlation graphs derived from the recommender system as an auxiliary task.
Then, the learned representations are transferred to solve the target search
retrieval task, performing query-to-item prediction without having seen any
(query, item) pairs in training. We conduct online and offline experiments on
one of the world's largest search and recommender systems from Google, and
present the results and lessons learned. We demonstrate that the proposed
approach achieves high performance on offline search retrieval tasks and,
more importantly, achieves significant improvements on relevance and user
interactions over the highly-optimized production system in online experiments.
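The two-stage idea in the abstract can be illustrated with a minimal sketch. This is a toy under stated assumptions, not the authors' model: the catalog `ITEMS`, the edge set `EDGES`, the bag-of-tokens encoder, the logistic loss that treats every non-edge as a negative, and all hyperparameters are hypothetical choices made for illustration. Token vectors are trained only to predict (item, item) edges; queries are then embedded with the same token vectors, so no (query, item) pair is ever seen in training.

```python
import math
import random

# Hypothetical toy catalog: item id -> natural-language title.
ITEMS = {
    0: "acoustic guitar lesson",
    1: "rock concert live",
    2: "easy pasta recipe",
    3: "home bread baking",
}
# Hypothetical (item, item) correlation edges taken from a recommender.
EDGES = {(0, 1), (2, 3)}
DIM = 8  # embedding size (assumption)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def embed(text, tok_vecs):
    """Mean of the known token vectors; zero vector if no token is known."""
    vecs = [tok_vecs[t] for t in text.lower().split() if t in tok_vecs]
    if not vecs:
        return [0.0] * DIM
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def train(items, edges, epochs=300, lr=0.5, seed=0):
    """Auxiliary task: learn token vectors by predicting item-item edges."""
    rng = random.Random(seed)
    vocab = sorted({t for title in items.values() for t in title.split()})
    tok_vecs = {t: [rng.uniform(-0.1, 0.1) for _ in range(DIM)] for t in vocab}
    ids = sorted(items)
    for _ in range(epochs):
        for i in ids:
            for j in ids:
                if i >= j:
                    continue
                # Assumption: every pair without an edge is a negative.
                label = 1.0 if (i, j) in edges or (j, i) in edges else 0.0
                ei, ej = embed(items[i], tok_vecs), embed(items[j], tok_vecs)
                # Gradient of the logistic log-likelihood w.r.t. the score.
                g = label - 1.0 / (1.0 + math.exp(-dot(ei, ej)))
                for src, other in ((i, ej), (j, ei)):
                    toks = items[src].split()
                    for t in toks:
                        v = tok_vecs[t]
                        for d in range(DIM):
                            v[d] += lr * g * other[d] / len(toks)
    return tok_vecs


def retrieve(query, items, tok_vecs, k=2):
    """Target task: rank items for a query using only transferred vectors."""
    q = embed(query, tok_vecs)
    ranked = sorted(
        items, key=lambda i: dot(q, embed(items[i], tok_vecs)), reverse=True
    )
    return ranked[:k]


TOK_VECS = train(ITEMS, EDGES)
print(retrieve("guitar", ITEMS, TOK_VECS))
```

In this toy setup the query "guitar" shares no token with "rock concert live", yet that item is retrieved alongside the guitar lesson because the recommender edge pulled the two items' representations together during training.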
Related papers
- BayesCNS: A Unified Bayesian Approach to Address Cold Start and Non-Stationarity in Search Systems at Scale [1.1634177851893535]
BayesCNS is designed to handle cold start and non-stationary distribution shifts in search systems at scale.
BayesCNS achieves this by estimating prior distributions for user-item interactions, which are continuously updated with new user interactions gathered online.
This online learning procedure is guided by a ranker model, enabling efficient exploration of relevant items using contextual information.
arXiv Detail & Related papers (2024-10-03T01:14:30Z)
- Improving Content Recommendation: Knowledge Graph-Based Semantic Contrastive Learning for Diversity and Cold-Start Users [5.224122150536595]
We propose a hybrid multi-task learning approach, training on user-item and item-item interactions.
Our approach allows the model to better understand the relationships between entities within the knowledge graph by utilizing semantic information from text.
arXiv Detail & Related papers (2024-03-27T15:11:00Z)
- End-to-end Knowledge Retrieval with Multi-modal Queries [50.01264794081951]
ReMuQ requires a system to retrieve knowledge from a large corpus by integrating contents from both text and image queries.
We introduce a retriever model, "ReViz", that can directly process input text and images to retrieve relevant knowledge in an end-to-end fashion.
We demonstrate superior performance in retrieval on two datasets under zero-shot settings.
arXiv Detail & Related papers (2023-06-01T08:04:12Z)
- Multi-Grained Knowledge Retrieval for End-to-End Task-Oriented Dialog [42.088274728084265]
Retrieving proper domain knowledge from an external database lies at the heart of end-to-end task-oriented dialog systems.
Most existing systems blend knowledge retrieval with response generation and optimize them with direct supervision from reference responses.
We propose to decouple knowledge retrieval from response generation and introduce a multi-grained knowledge retriever.
arXiv Detail & Related papers (2023-05-17T12:12:46Z)
- Exploring Effective Factors for Improving Visual In-Context Learning [56.14208975380607]
In-Context Learning (ICL) aims to understand a new task via a few demonstrations (a.k.a. prompts) and to predict new inputs without tuning the models.
This paper shows that prompt selection and prompt fusion are two major factors that have a direct impact on the inference performance of visual context learning.
We propose a simple framework, prompt-SelF, for visual in-context learning.
arXiv Detail & Related papers (2023-04-10T17:59:04Z)
- Incorporating Relevance Feedback for Information-Seeking Retrieval using
Few-Shot Document Re-Ranking [56.80065604034095]
We introduce a kNN approach that re-ranks documents based on their similarity with the query and the documents the user considers relevant.
To evaluate our different integration strategies, we transform four existing information retrieval datasets into the relevance feedback scenario.
arXiv Detail & Related papers (2022-10-19T16:19:37Z)
- CorpusBrain: Pre-train a Generative Retrieval Model for
Knowledge-Intensive Language Tasks [62.22920673080208]
A single-step generative model can dramatically simplify the search process and be optimized in an end-to-end manner.
We name the pre-trained generative retrieval model CorpusBrain, as all information about the corpus is encoded in its parameters without the need to construct an additional index.
arXiv Detail & Related papers (2022-08-16T10:22:49Z)
- Deep Reinforcement Agent for Efficient Instant Search [14.086339486783018]
We propose to address the load issue by identifying tokens that are semantically more salient towards retrieving relevant documents.
We train a reinforcement agent that interacts directly with the search engine and learns to predict the word's importance.
A novel evaluation framework is presented to study the trade-off between the number of triggered searches and the system's performance.
arXiv Detail & Related papers (2022-03-17T22:47:15Z)
- Sequential Search with Off-Policy Reinforcement Learning [48.88165680363482]
We propose a highly scalable hybrid learning model that consists of an RNN learning framework and an attention model.
As a novel optimization step, we fit multiple short user sequences in a single RNN pass within a training batch, by solving a greedy knapsack problem on the fly.
We also explore the use of off-policy reinforcement learning in multi-session personalized search ranking.
arXiv Detail & Related papers (2022-02-01T06:52:40Z)
- Exposing Query Identification for Search Transparency [69.06545074617685]
We explore the feasibility of approximate exposing query identification (EQI) as a retrieval task by reversing the role of queries and documents in two classes of search systems.
We derive an evaluation metric to measure the quality of a ranking of exposing queries, and conduct an empirical analysis focusing on various practical aspects of approximate EQI.
arXiv Detail & Related papers (2021-10-14T20:19:27Z)
- Connecting Images through Time and Sources: Introducing Low-data,
Heterogeneous Instance Retrieval [3.6526118822907594]
We show that it is not trivial to pick features responding well to a panel of variations and semantic content.
Introducing a new enhanced version of the Alegoria benchmark, we compare descriptors using the detailed annotations.
arXiv Detail & Related papers (2021-03-19T10:54:51Z)