Uni-Retriever: Towards Learning The Unified Embedding Based Retriever in
Bing Sponsored Search
- URL: http://arxiv.org/abs/2202.06212v1
- Date: Sun, 13 Feb 2022 05:20:44 GMT
- Title: Uni-Retriever: Towards Learning The Unified Embedding Based Retriever in
Bing Sponsored Search
- Authors: Jianjin Zhang, Zheng Liu, Weihao Han, Shitao Xiao, Ruicheng Zheng,
Yingxia Shao, Hao Sun, Hanqing Zhu, Premkumar Srinivasan, Denvy Deng, Qi
Zhang, Xing Xie
- Abstract summary: We present a novel representation learning framework Uni-Retriever developed for Bing Search.
On one hand, the capability of making high-relevance retrieval is established by distilling knowledge from the "relevance teacher model".
On the other hand, the capability of making high-CTR retrieval is optimized by learning to discriminate user's clicked ads from the entire corpus.
- Score: 26.765315779943265
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Embedding based retrieval (EBR) is a fundamental building block in many web
applications. However, EBR in sponsored search is distinguished from other
generic scenarios and technically challenging due to the need to serve
multiple retrieval purposes: first, it has to retrieve high-relevance ads,
which may exactly serve the user's search intent; second, it needs to retrieve
high-CTR ads so as to maximize the overall user clicks. In this paper, we
present a novel representation learning framework Uni-Retriever developed for
Bing Search, which unifies two different training modes, knowledge distillation
and contrastive learning, to realize both required objectives. On one hand, the
capability of making high-relevance retrieval is established by distilling
knowledge from the "relevance teacher model". On the other hand, the
capability of making high-CTR retrieval is optimized by learning to
discriminate user's clicked ads from the entire corpus. The two training modes
are jointly performed as a multi-objective learning process, such that the ads
of high relevance and CTR can be favored by the generated embeddings. Besides
the learning strategy, we also elaborate our solution for the EBR serving
pipeline built upon a substantially optimized DiskANN, where massive-scale EBR
can be performed with competitive time and memory efficiency and with high
quality. We conduct comprehensive offline and online experiments to evaluate
the proposed techniques, whose findings may provide useful insights for the
future development of EBR systems. Uni-Retriever has been mainstreamed as the
major retrieval path in Bing's production thanks to its notable improvements in
representation and EBR serving quality.
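To make the unified objective concrete, below is a minimal PyTorch-style sketch of how a distillation term against teacher relevance scores can be combined with an in-batch contrastive term over clicked ads. It is an illustration under stated assumptions, not the paper's implementation: the function name, tensor shapes, the KL-based distillation loss, and the alpha/tau hyperparameters are all assumptions made for exposition.

```python
# Minimal sketch (not the authors' code) of a joint distillation + contrastive
# objective in the spirit of Uni-Retriever. All names and shapes are assumed.
import torch
import torch.nn.functional as F

def uni_retriever_loss(query_emb, ad_emb, teacher_scores, alpha=1.0, tau=0.05):
    """Illustrative multi-objective loss.

    query_emb:      [B, d] query embeddings from the student bi-encoder
    ad_emb:         [B, d] embeddings of the clicked ad for each query
    teacher_scores: [B, B] relevance scores from the teacher model for every
                    (query, ad) pair in the batch
    alpha:          weight balancing the two objectives (assumed hyperparameter)
    tau:            softmax temperature (assumed hyperparameter)
    """
    # Similarity between every query and every ad in the batch
    # (in-batch negatives).
    sim = query_emb @ ad_emb.t() / tau                     # [B, B]

    # 1) Knowledge distillation: align the student's similarity distribution
    #    with the relevance teacher's score distribution (KL divergence).
    kd_loss = F.kl_div(F.log_softmax(sim, dim=1),
                       F.softmax(teacher_scores, dim=1),
                       reduction="batchmean")

    # 2) Contrastive learning: discriminate each query's clicked ad (the
    #    diagonal) from all other ads in the batch (InfoNCE via cross-entropy).
    labels = torch.arange(sim.size(0), device=sim.device)
    ctr_loss = F.cross_entropy(sim, labels)

    # Multi-objective combination favoring both high relevance and high CTR.
    return kd_loss + alpha * ctr_loss
```

In this sketch both terms share one batch similarity matrix, mirroring the paper's point that a single embedding space must favor both high-relevance and high-CTR ads; the actual teacher signal, negative sampling, and loss weighting in Uni-Retriever may differ.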
Related papers
- Learning to Rank for Multiple Retrieval-Augmented Models through Iterative Utility Maximization [21.115495457454365]
This paper investigates the design of a unified search engine to serve multiple retrieval-augmented generation (RAG) agents.
We introduce an iterative approach where the search engine generates retrieval results for these RAG agents and gathers feedback on the quality of the retrieved documents during an offline phase.
We adapt this approach to an online setting, allowing the search engine to refine its behavior based on real-time feedback from individual agents.
arXiv Detail & Related papers (2024-10-13T17:53:50Z)
- Retrieval Meets Reasoning: Even High-school Textbook Knowledge Benefits Multimodal Reasoning [49.3242278912771]
We introduce a novel multimodal RAG framework named RMR (Retrieval Meets Reasoning).
The RMR framework employs a bi-modal retrieval module to identify the most relevant question-answer pairs.
It significantly boosts the performance of various vision-language models across a spectrum of benchmark datasets.
arXiv Detail & Related papers (2024-05-31T14:23:49Z)
- Retrieval-Oriented Knowledge for Click-Through Rate Prediction [29.55757862617378]
Click-through rate (CTR) prediction is crucial for personalized online services.
The retrieval-oriented knowledge framework bypasses the real retrieval process.
It features a knowledge base that preserves and imitates the retrieved & aggregated representations.
arXiv Detail & Related papers (2024-04-28T20:21:03Z)
- Deep Boosting Learning: A Brand-new Cooperative Approach for Image-Text Matching [53.05954114863596]
We propose a brand-new Deep Boosting Learning (DBL) algorithm for image-text matching.
An anchor branch is first trained to provide insights into the data properties.
A target branch is concurrently tasked with more adaptive margin constraints to further enlarge the relative distance between matched and unmatched samples.
arXiv Detail & Related papers (2024-04-28T08:44:28Z)
- REAR: A Relevance-Aware Retrieval-Augmented Framework for Open-Domain Question Answering [122.62012375722124]
In existing methods, large language models (LLMs) cannot precisely assess the relevance of retrieved documents.
We propose REAR, a RElevance-Aware Retrieval-augmented approach for open-domain question answering (QA).
arXiv Detail & Related papers (2024-02-27T13:22:51Z)
- Learning to Retrieve for Job Matching [22.007634436648427]
We discuss applying learning-to-retrieve technology to enhance LinkedIn's job search and recommendation systems.
We leverage confirmed hire data to construct a graph that evaluates a seeker's qualification for a job, and utilize learned links for retrieval.
In addition to a solution based on a conventional inverted index, we developed an on-GPU solution capable of supporting both KNN and term matching efficiently.
arXiv Detail & Related papers (2024-02-21T00:05:25Z)
- Generative Multi-Modal Knowledge Retrieval with Large Language Models [75.70313858231833]
We propose an innovative end-to-end generative framework for multi-modal knowledge retrieval.
Our framework takes advantage of the fact that large language models (LLMs) can effectively serve as virtual knowledge bases.
We demonstrate significant improvements ranging from 3.0% to 14.6% across all evaluation metrics when compared to strong baselines.
arXiv Detail & Related papers (2024-01-16T08:44:29Z)
- Que2Engage: Embedding-based Retrieval for Relevant and Engaging Products at Facebook Marketplace [15.054431410052851]
We present Que2Engage, a search EBR system built towards bridging the gap between retrieval and ranking for end-to-end optimizations.
We show the effectiveness of our approach via a multitask evaluation framework and thorough baseline comparisons and ablation studies.
arXiv Detail & Related papers (2023-02-21T23:10:16Z)
- Retrieval Augmentation for Commonsense Reasoning: A Unified Approach [64.63071051375289]
We propose a unified framework of retrieval-augmented commonsense reasoning (called RACo).
Our proposed RACo can significantly outperform other knowledge-enhanced method counterparts.
arXiv Detail & Related papers (2022-10-23T23:49:08Z)
- CorpusBrain: Pre-train a Generative Retrieval Model for Knowledge-Intensive Language Tasks [62.22920673080208]
A single-step generative model can dramatically simplify the search process and be optimized in an end-to-end manner.
We name the pre-trained generative retrieval model CorpusBrain, as all information about the corpus is encoded in its parameters without the need to construct an additional index.
arXiv Detail & Related papers (2022-08-16T10:22:49Z)
- Building an Efficient and Effective Retrieval-based Dialogue System via Mutual Learning [27.04857039060308]
We propose to combine the best of both worlds to build a retrieval system.
We employ a fast bi-encoder to replace the traditional feature-based pre-retrieval model.
We train the pre-retrieval model and the re-ranking model at the same time via mutual learning.
arXiv Detail & Related papers (2021-10-01T01:32:33Z)