Improving Retrieval in Sponsored Search by Leveraging Query Context Signals
- URL: http://arxiv.org/abs/2407.14346v2
- Date: Fri, 18 Oct 2024 13:59:54 GMT
- Title: Improving Retrieval in Sponsored Search by Leveraging Query Context Signals
- Authors: Akash Kumar Mohankumar, Gururaj K, Gagan Madan, Amit Singh,
- Abstract summary: We propose an approach to enhance query understanding by augmenting queries with rich contextual signals.
We use web search titles and snippets to ground queries in real-world information and utilize GPT-4 to generate query rewrites and explanations.
Our context-aware approach substantially outperforms context-free models.
- Score: 6.152499434499752
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurately retrieving relevant bid keywords for user queries is critical in Sponsored Search but remains challenging, particularly for short, ambiguous queries. Existing dense and generative retrieval models often fail to capture nuanced user intent in these cases. To address this, we propose an approach to enhance query understanding by augmenting queries with rich contextual signals derived from web search results and large language models, stored in an online cache. Specifically, we use web search titles and snippets to ground queries in real-world information and utilize GPT-4 to generate query rewrites and explanations that clarify user intent. These signals are efficiently integrated through a Fusion-in-Decoder based Unity architecture, enabling both dense and generative retrieval with serving costs on par with traditional context-free models. To address scenarios where context is unavailable in the cache, we introduce context glancing, a curriculum learning strategy that improves model robustness and performance even without contextual signals during inference. Extensive offline experiments demonstrate that our context-aware approach substantially outperforms context-free models. Furthermore, online A/B testing on a prominent search engine across 160+ countries shows significant improvements in user engagement and revenue.
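As a rough illustration of the cache-plus-fallback idea in the abstract, the sketch below builds Fusion-in-Decoder style encoder inputs (one per context passage, each paired with the query) and falls back to the bare query on a cache miss. Everything named here is an assumption for illustration, not taken from the paper: the dict-based `ContextCache`, the `query: ... context: ...` input template, and the `training_inputs` helper, which simply hides context with a fixed probability to mimic cache misses during training. The abstract does not specify how context glancing schedules its curriculum, so this is not a reproduction of that strategy.

```python
import random
from typing import Dict, List

# Hypothetical in-memory stand-in for the online context cache:
# maps a query to its cached context strings (web titles/snippets, rewrites).
ContextCache = Dict[str, List[str]]


def build_fid_inputs(query: str, cache: ContextCache) -> List[str]:
    """Return one encoder input per cached context passage.

    In a Fusion-in-Decoder setup each (query, passage) pair is encoded
    independently and the encoder outputs are fused in the decoder.
    On a cache miss the query is used alone (context-free input).
    """
    passages = cache.get(query, [])
    if not passages:
        return [query]
    # The input template below is an assumption, not the paper's format.
    return [f"query: {query} context: {p}" for p in passages]


def training_inputs(
    query: str, cache: ContextCache, drop_prob: float, rng: random.Random
) -> List[str]:
    """Hide the context with probability drop_prob (illustrative assumption).

    This mimics cache misses at training time so the model also sees
    context-free inputs; the real curriculum may differ.
    """
    if rng.random() < drop_prob:
        return [query]
    return build_fid_inputs(query, cache)
```

With a cache such as `{"jaguar": ["Jaguar cars official site"]}`, `build_fid_inputs("jaguar", cache)` yields one context-augmented input, while an uncached query yields only the query itself.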
Related papers
- Leveraging Inter-Chunk Interactions for Enhanced Retrieval in Large Language Model-Based Question Answering [12.60063463163226]
IIER captures the internal connections between document chunks by considering three types of interactions: structural, keyword, and semantic.
It identifies multiple seed nodes based on the target question and iteratively searches for relevant chunks to gather supporting evidence.
It refines the context and reasoning chain, aiding the large language model in reasoning and answer generation.
arXiv Detail & Related papers (2024-08-06T02:39:55Z)
- Generative Retrieval with Preference Optimization for E-commerce Search [16.78829577915103]
We develop an innovative framework for E-commerce search, called generative retrieval with preference optimization.
We employ multi-span identifiers to represent raw item titles and transform the task of generating titles from queries into the task of generating multi-span identifiers from queries.
Our experiments show that this framework achieves competitive performance on a real-world dataset, and online A/B tests demonstrate the superiority and effectiveness in improving conversion gains.
arXiv Detail & Related papers (2024-07-29T09:31:19Z)
- ACE: A Generative Cross-Modal Retrieval Framework with Coarse-To-Fine Semantic Modeling [53.97609687516371]
We propose a pioneering generAtive Cross-modal rEtrieval framework (ACE) for end-to-end cross-modal retrieval.
ACE achieves state-of-the-art performance in cross-modal retrieval and outperforms the strong baselines on Recall@1 by 15.27% on average.
arXiv Detail & Related papers (2024-06-25T12:47:04Z)
- Redefining Information Retrieval of Structured Database via Large Language Models [10.117751707641416]
This paper introduces a novel retrieval augmentation framework called ChatLR.
It primarily employs the powerful semantic understanding ability of Large Language Models (LLMs) as retrievers to achieve precise and concise information retrieval.
Experimental results demonstrate the effectiveness of ChatLR in addressing user queries, achieving an overall information retrieval accuracy exceeding 98.8%.
arXiv Detail & Related papers (2024-05-09T02:37:53Z)
- Enhancing Retrieval Processes for Language Generation with Augmented Queries [0.0]
This research focuses on addressing this issue through Retrieval-Augmented Generation (RAG), a technique that guides models to give accurate responses based on real facts.
To overcome scalability issues, the study explores connecting user queries with sophisticated language models such as BERT and Orca2.
The empirical results indicate a significant improvement in the initial language model's performance under RAG.
arXiv Detail & Related papers (2024-02-06T13:19:53Z)
- Building Interpretable and Reliable Open Information Retriever for New Domains Overnight [67.03842581848299]
Information retrieval is a critical component for many downstream tasks such as open-domain question answering (QA).
We propose an information retrieval pipeline that uses entity/event linking model and query decomposition model to focus more accurately on different information units of the query.
We show that, while being more interpretable and reliable, our proposed pipeline significantly improves passage coverages and denotation accuracies across five IR and QA benchmarks.
arXiv Detail & Related papers (2023-08-09T07:47:17Z)
- CAPSTONE: Curriculum Sampling for Dense Retrieval with Document Expansion [68.19934563919192]
We propose a curriculum sampling strategy that utilizes pseudo queries during training and progressively enhances the relevance between the generated query and the real query.
Experimental results on both in-domain and out-of-domain datasets demonstrate that our approach outperforms previous dense retrieval models.
arXiv Detail & Related papers (2022-12-18T15:57:46Z)
- Context-Aware Query Rewriting for Improving Users' Search Experience on E-commerce Websites [47.04727122209316]
E-commerce queries are often short and ambiguous.
Users tend to enter multiple searches, which we call context, before purchasing.
We propose an end-to-end context-aware query rewriting model.
arXiv Detail & Related papers (2022-09-15T19:46:01Z)
- Graph Enhanced BERT for Query Understanding [55.90334539898102]
Query understanding plays a key role in exploring users' search intents and helping users locate the information they most want.
In recent years, pre-trained language models (PLMs) have advanced various natural language processing tasks.
We propose a novel graph-enhanced pre-training framework, GE-BERT, which can leverage both query content and the query graph.
arXiv Detail & Related papers (2022-04-03T16:50:30Z)
- Exposing Query Identification for Search Transparency [69.06545074617685]
We explore the feasibility of approximate exposing query identification (EQI) as a retrieval task by reversing the role of queries and documents in two classes of search systems.
We derive an evaluation metric to measure the quality of a ranking of exposing queries and conduct an empirical analysis focusing on various practical aspects of approximate EQI.
arXiv Detail & Related papers (2021-10-14T20:19:27Z)
- Tree-Augmented Cross-Modal Encoding for Complex-Query Video Retrieval [98.62404433761432]
The rapid growth of user-generated videos on the Internet has intensified the need for text-based video retrieval systems.
Traditional methods mainly favor the concept-based paradigm on retrieval with simple queries.
We propose a Tree-augmented Cross-modal Encoding method by jointly learning the linguistic structure of queries and the temporal representation of videos.
arXiv Detail & Related papers (2020-07-06T02:50:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.