Diversity driven Query Rewriting in Search Advertising
- URL: http://arxiv.org/abs/2106.03816v1
- Date: Mon, 7 Jun 2021 17:30:45 GMT
- Title: Diversity driven Query Rewriting in Search Advertising
- Authors: Akash Kumar Mohankumar, Nikit Begwani, Amit Singh
- Abstract summary: Generative retrieval models have been shown to be effective at generating query rewrites.
We introduce CLOVER, a framework to generate both high-quality and diverse rewrites.
We empirically show the effectiveness of our proposed approach through offline experiments on search queries across geographies spanning three major languages.
- Score: 1.5289756643078838
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieving keywords (bidwords) with the same intent as a query, referred to as
close variant keywords, is of prime importance for effective targeted search
advertising. For head and torso search queries, sponsored search engines use a
huge repository of same-intent queries and keywords, mined ahead of time.
Online, this repository is used to rewrite the query and then look up the
rewrite in a repository of bid keywords, contributing significant revenue.
Recently, generative retrieval models have been shown to be effective at the
task of generating such query rewrites. We observe two main limitations of such
generative models. First, rewrites generated by these models exhibit low
lexical diversity and hence fail to retrieve relevant keywords that have
diverse linguistic variations. Second, there is a misalignment between the
training objective (the likelihood of the training data) and what we desire
(improved quality and coverage of rewrites). In this work, we introduce
CLOVER, a framework to generate both high-quality and diverse rewrites by
optimizing for human assessment of rewrite quality using our diversity-driven
reinforcement learning algorithm. We use an evaluation model, trained to
predict human judgments, as the reward function to finetune the generation
policy. We empirically show the effectiveness of our proposed approach through
offline experiments on search queries across geographies spanning three major
languages. We also perform online A/B experiments on Bing, a large commercial
search engine, which show (i) better user engagement, with an average increase
in clicks of 12.83% accompanied by an average defect reduction of 13.97%, and
(ii) a revenue improvement of 21.29%.
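In code terms, the recipe in the abstract (finetune the generation policy against an evaluation model that predicts human judgments, while explicitly rewarding diversity) can be illustrated with a small policy-gradient loop. The sketch below is a minimal illustration under our own assumptions, not the paper's implementation: the REINFORCE objective, the unigram-overlap diversity bonus, and every function name here are hypothetical.

```python
# Minimal sketch (not the authors' code) of diversity-driven RL finetuning:
# a generator samples several rewrites per query, a learned quality model
# scores each one, and a lexical-diversity bonus discourages near-duplicate
# samples. The 1-gram overlap penalty and all names are illustrative; the
# paper's exact reward shaping is not specified in the abstract.
from typing import Callable, List

def ngram_overlap(a: str, b: str) -> float:
    """Jaccard overlap of unigram sets between two rewrites (crude proxy)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / max(1, len(sa | sb))

def diversity_bonus(rewrite: str, others: List[str]) -> float:
    """Reward rewrites that are lexically distant from sibling samples."""
    if not others:
        return 0.0
    return 1.0 - max(ngram_overlap(rewrite, o) for o in others)

def reinforce_step(
    query: str,
    sample_rewrites: Callable[[str, int], List[str]],  # generation policy
    quality_score: Callable[[str, str], float],        # reward model ~ human judgment
    log_prob: Callable[[str, str], float],             # log p(rewrite | query)
    n_samples: int = 8,
    diversity_weight: float = 0.5,
) -> float:
    """One REINFORCE update signal: returns the scalar surrogate loss.

    reward(r_i) = quality(query, r_i)
                + diversity_weight * diversity_bonus(r_i, other samples)
    A mean-reward baseline reduces gradient variance.
    """
    rewrites = sample_rewrites(query, n_samples)
    rewards = [
        quality_score(query, r)
        + diversity_weight * diversity_bonus(r, rewrites[:i] + rewrites[i + 1:])
        for i, r in enumerate(rewrites)
    ]
    baseline = sum(rewards) / len(rewards)
    # Surrogate loss: minimize -(reward - baseline) * log p(rewrite | query).
    return -sum((rw - baseline) * log_prob(query, r)
                for r, rw in zip(rewrites, rewards))
```

The key design point this sketch captures is that the reward is not the training-data likelihood: quality comes from a model trained to predict human judgments, and the diversity term is computed across the batch of sampled rewrites rather than per rewrite in isolation.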
Related papers
- Tree Search for Language Model Agents [69.43007235771383]
We propose an inference-time search algorithm for LM agents to perform exploration and multi-step planning in interactive web environments.
Our approach is a form of best-first tree search that operates within the actual environment space.
It is the first tree search algorithm for LM agents that shows effectiveness on realistic web tasks.
arXiv Detail & Related papers (2024-07-01T17:07:55Z)
- Think-then-Act: A Dual-Angle Evaluated Retrieval-Augmented Generation [3.2134014920850364]
Large language models (LLMs) often face challenges such as temporal misalignment and generating hallucinatory content.
We propose a dual-angle evaluated retrieval-augmented generation framework, 'Think-then-Act'.
arXiv Detail & Related papers (2024-06-18T20:51:34Z)
- Generative Query Reformulation Using Ensemble Prompting, Document Fusion, and Relevance Feedback [8.661419320202787]
GenQREnsemble and GenQRFusion leverage paraphrases of a zero-shot instruction to generate multiple sets of keywords to improve retrieval performance.
We demonstrate that an ensemble of query reformulations can improve retrieval effectiveness by up to 18% on nDCG@10 in pre-retrieval settings and by 9% in post-retrieval settings.
arXiv Detail & Related papers (2024-05-27T21:03:26Z)
- Adapting Dual-encoder Vision-language Models for Paraphrased Retrieval [55.90407811819347]
We consider the task of paraphrased text-to-image retrieval where a model aims to return similar results given a pair of paraphrased queries.
We train a dual-encoder model starting from a language model pretrained on a large text corpus.
Compared to public dual-encoder models such as CLIP and OpenCLIP, the model trained with our best adaptation strategy achieves a significantly higher ranking similarity for paraphrased queries.
arXiv Detail & Related papers (2024-05-06T06:30:17Z)
- Retrieval is Accurate Generation [99.24267226311157]
We introduce a novel method that selects context-aware phrases from a collection of supporting documents.
Our model achieves the best performance and the lowest latency among several retrieval-augmented baselines.
arXiv Detail & Related papers (2024-02-27T14:16:19Z)
- Fine-tuning Language Models for Factuality [96.5203774943198]
Large pre-trained language models (LLMs) are now widely used, sometimes even as a replacement for traditional search engines.
Yet language models are prone to making convincing but factually inaccurate claims, often referred to as 'hallucinations'.
In this work, we fine-tune language models to be more factual, without human labeling.
arXiv Detail & Related papers (2023-11-14T18:59:15Z)
- Unified Embedding Based Personalized Retrieval in Etsy Search [0.206242362470764]
We propose learning a unified embedding model incorporating graph, transformer and term-based embeddings end to end.
Our personalized retrieval model significantly improves the overall search experience, as measured by a 5.58% increase in search purchase rate and a 2.63% increase in site-wide conversion rate.
arXiv Detail & Related papers (2023-06-07T23:24:50Z)
- Enriching Relation Extraction with OpenIE [70.52564277675056]
Relation extraction (RE) is a sub-discipline of information extraction (IE).
In this work, we explore how recent approaches for open information extraction (OpenIE) may help to improve the task of RE.
Our experiments over two annotated corpora, KnowledgeNet and FewRel, demonstrate the improved accuracy of our enriched models.
arXiv Detail & Related papers (2022-12-19T11:26:23Z)
- Query Expansion Using Contextual Clue Sampling with Language Models [69.51976926838232]
We propose a combination of an effective filtering strategy and fusion of the retrieved documents based on the generation probability of each context.
Our lexical matching based approach achieves a similar top-5/top-20 retrieval accuracy and higher top-100 accuracy compared with the well-established dense retrieval model DPR.
For end-to-end QA, the reader model also benefits from our method and achieves the highest Exact-Match score against several competitive baselines.
arXiv Detail & Related papers (2022-10-13T15:18:04Z)
- Unified Generative & Dense Retrieval for Query Rewriting in Sponsored Search [6.181557214852772]
We compare two paradigms for online query rewriting: Generative (NLG) and Dense Retrieval (DR) methods.
We propose CLOVER-Unity, a novel approach that unifies generative and dense retrieval methods in one single model; a toy sketch of such a shared-trunk design appears after this list.
arXiv Detail & Related papers (2022-09-13T10:19:23Z)
- Leveraging Cognitive Search Patterns to Enhance Automated Natural Language Retrieval Performance [0.0]
We highlight cognitive reformulation patterns that mimic user search behaviour.
We formalize the application of these patterns by considering a query conceptual representation.
A genetic algorithm-based weighting process allows placing emphasis on terms according to their conceptual role-type.
arXiv Detail & Related papers (2020-04-21T14:13:33Z)
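As promised in the CLOVER-Unity entry above, here is a purely hypothetical sketch of one way to unify generative (NLG) and dense retrieval (DR) in a single model. The summary gives no architectural details, so the shared encoder trunk, layer sizes, mean pooling, and all names below are our assumptions, not the paper's design.

```python
# Hypothetical shared-trunk model: one encoder feeds both a dense-retrieval
# head (pooled query embedding for ANN lookup over keywords) and a generative
# head (decoder producing rewrite tokens). All hyperparameters are toy values.
import torch
import torch.nn as nn

class UnifiedRewriter(nn.Module):
    def __init__(self, vocab_size: int = 32000, d_model: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)  # shared trunk
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=2)  # generative head
        self.lm_head = nn.Linear(d_model, vocab_size)                  # token logits

    def encode(self, query_ids: torch.Tensor) -> torch.Tensor:
        """Dense-retrieval view: mean-pooled query vector, compared against
        precomputed keyword embeddings via nearest-neighbour search."""
        return self.encoder(self.embed(query_ids)).mean(dim=1)

    def rewrite_logits(self, query_ids: torch.Tensor,
                       rewrite_ids: torch.Tensor) -> torch.Tensor:
        """Generative view: next-token logits for a rewrite, given the query."""
        memory = self.encoder(self.embed(query_ids))
        mask = nn.Transformer.generate_square_subsequent_mask(rewrite_ids.size(1))
        out = self.decoder(self.embed(rewrite_ids), memory, tgt_mask=mask)
        return self.lm_head(out)

model = UnifiedRewriter()
query = torch.randint(0, 32000, (1, 8))        # toy token ids, no real tokenizer
prefix = torch.randint(0, 32000, (1, 4))
q_vec = model.encode(query)                    # DR path: embed, then ANN-search keywords
logits = model.rewrite_logits(query, prefix)   # NLG path: decode rewrite tokens
```

The appeal of sharing the trunk is operational: one forward pass through the encoder serves both retrieval paradigms, so the online system pays for a single model while keeping the complementary coverage of generated and retrieved rewrites.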