ProphetNet-Ads: A Looking Ahead Strategy for Generative Retrieval Models in Sponsored Search Engine
- URL: http://arxiv.org/abs/2010.10789v1
- Date: Wed, 21 Oct 2020 07:03:20 GMT
- Title: ProphetNet-Ads: A Looking Ahead Strategy for Generative Retrieval Models in Sponsored Search Engine
- Authors: Weizhen Qi, Yeyun Gong, Yu Yan, Jian Jiao, Bo Shao, Ruofei Zhang, Houqiang Li, Nan Duan, Ming Zhou
- Abstract summary: Generative retrieval models generate outputs token by token along a path of the target library's prefix tree (Trie).
We analyze these problems and propose a looking ahead strategy for generative retrieval models named ProphetNet-Ads.
Compared with the recently proposed Trie-based LSTM generative retrieval model, our single-model result and integrated result improve recall by 15.58% and 18.8%, respectively, with beam size 5.
- Score: 123.65646903493614
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In a sponsored search engine, generative retrieval models have
recently been proposed to mine relevant advertisement keywords for users'
input queries.
Generative retrieval models generate outputs token by token on a path of the
target library prefix tree (Trie), which guarantees all of the generated
outputs are legal and covered by the target library. In actual use, we found
several typical problems caused by Trie-constrained searching length. In this
paper, we analyze these problems and propose a looking ahead strategy for
generative retrieval models named ProphetNet-Ads. ProphetNet-Ads improves the
retrieval ability by directly optimizing the Trie-constrained searching space.
We build a dataset from a real-world sponsored search engine and carry out
experiments to analyze different generative retrieval models. Compared with
the recently proposed Trie-based LSTM generative retrieval model, our
single-model result and integrated result improve recall by 15.58% and 18.8%,
respectively, with beam size 5. Case studies further demonstrate clearly how
ProphetNet-Ads alleviates these problems.
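The decoding scheme the abstract describes can be sketched compactly. Below is a minimal, hypothetical Python illustration, not the authors' implementation: hypotheses are extended only along Trie edges, so every output is a legal keyword of the target library, and a precomputed shortest-completion length stands in for the paper's look-ahead, which in ProphetNet-Ads actually relies on ProphetNet's future-token prediction. The scoring function, integer token IDs, and length budget here are placeholders.

```python
import heapq
import math


class TrieNode:
    """Node of the target keyword library's prefix tree (Trie)."""

    def __init__(self):
        self.children = {}               # token -> TrieNode
        self.is_end = False              # True if a complete keyword ends here
        self.min_suffix_len = math.inf   # fewest tokens from here to any keyword end

    def insert(self, tokens):
        node = self
        for depth, tok in enumerate(tokens):
            # Record the shortest completion passing through the current node.
            node.min_suffix_len = min(node.min_suffix_len, len(tokens) - depth)
            node = node.children.setdefault(tok, TrieNode())
        node.is_end = True
        node.min_suffix_len = 0


def constrained_beam_search(score_fn, root, beam_size=5, max_len=16):
    """Generate token by token along Trie paths only, so every finished
    hypothesis is a legal keyword of the target library. The look-ahead
    check prunes branches that cannot finish within the length budget."""
    beams = [(0.0, [], root)]            # (log-prob, token sequence, Trie node)
    finished = []
    for step in range(max_len):
        candidates = []
        for logp, seq, node in beams:
            for tok, child in node.children.items():
                # Looking ahead: skip branches whose shortest remaining
                # completion no longer fits; plain Trie decoding omits this.
                if child.min_suffix_len > max_len - (step + 1):
                    continue
                candidates.append((logp + score_fn(seq, tok), seq + [tok], child))
        if not candidates:
            break
        beams = heapq.nlargest(beam_size, candidates, key=lambda c: c[0])
        finished.extend((lp, seq) for lp, seq, node in beams if node.is_end)
    return heapq.nlargest(beam_size, finished, key=lambda f: f[0])


# Hypothetical usage with toy token IDs standing in for a subword vocabulary:
root = TrieNode()
for keyword in ([7, 3, 9], [7, 3, 9, 4], [7, 5, 2, 8]):
    root.insert(keyword)
top = constrained_beam_search(lambda seq, tok: -0.1 * tok, root, beam_size=5)
```

Precomputing min_suffix_len at insertion time makes the look-ahead check O(1) per expansion; per the abstract, ProphetNet-Ads goes beyond this reachability check by directly optimizing the Trie-constrained searching space with the model's look-ahead predictions.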
Related papers
- Learning to Rank for Multiple Retrieval-Augmented Models through Iterative Utility Maximization [21.115495457454365]
This paper investigates the design of a unified search engine to serve multiple retrieval-augmented generation (RAG) agents.
We introduce an iterative approach where the search engine generates retrieval results for these RAG agents and gathers feedback on the quality of the retrieved documents during an offline phase.
We adapt this approach to an online setting, allowing the search engine to refine its behavior based on real-time feedback from individual agents.
arXiv Detail & Related papers (2024-10-13T17:53:50Z)
- Generative Retrieval with Preference Optimization for E-commerce Search [16.78829577915103]
We develop an innovative framework for E-commerce search, called generative retrieval with preference optimization.
We employ multi-span identifiers to represent raw item titles and transform the task of generating titles from queries into the task of generating multi-span identifiers from queries.
Our experiments show that this framework achieves competitive performance on a real-world dataset, and online A/B tests demonstrate its superiority and effectiveness in improving conversion gains.
arXiv Detail & Related papers (2024-07-29T09:31:19Z)
- Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z)
- Large Search Model: Redefining Search Stack in the Era of LLMs [63.503320030117145]
We introduce a novel conceptual framework called the large search model, which redefines the conventional search stack by unifying search tasks with one large language model (LLM).
All tasks are formulated as autoregressive text generation problems, allowing for the customization of tasks through the use of natural language prompts.
This proposed framework capitalizes on the strong language understanding and reasoning capabilities of LLMs, offering the potential to enhance search result quality while simultaneously simplifying the existing cumbersome search stack.
arXiv Detail & Related papers (2023-10-23T05:52:09Z)
- On the Robustness of Generative Retrieval Models: An Out-of-Distribution Perspective [65.16259505602807]
We study the robustness of generative retrieval models compared with dense retrieval models.
The empirical results indicate that the OOD robustness of generative retrieval models requires enhancement.
arXiv Detail & Related papers (2023-06-22T09:18:52Z)
- Improving Content Retrievability in Search with Controllable Query Generation [5.450798147045502]
Machine-learned search engines have a high retrievability bias, where the majority of the queries return the same entities.
We propose CtrlQGen, a method that generates queries for a chosen underlying intent (narrow or broad).
Our results on datasets from the domains of music, podcasts, and books reveal that we can significantly decrease the retrievability bias of a dense retrieval model.
arXiv Detail & Related papers (2023-03-21T07:46:57Z)
- CorpusBrain: Pre-train a Generative Retrieval Model for Knowledge-Intensive Language Tasks [62.22920673080208]
A single-step generative model can dramatically simplify the search process and be optimized in an end-to-end manner.
We name the pre-trained generative retrieval model CorpusBrain, as all information about the corpus is encoded in its parameters without the need to construct an additional index.
arXiv Detail & Related papers (2022-08-16T10:22:49Z)
- Autoregressive Search Engines: Generating Substrings as Document Identifiers [53.0729058170278]
Autoregressive language models are emerging as the de facto standard for generating answers.
Previous work has explored ways to partition the search space into hierarchical structures.
In this work we propose an alternative that does not force any structure on the search space: using all n-grams in a passage as its possible identifiers.
arXiv Detail & Related papers (2022-04-22T10:45:01Z)
- Enabling arbitrary translation objectives with Adaptive Tree Search [23.40984370716434]
We introduce an adaptive tree search algorithm that can find high-scoring outputs under translation models that make no assumptions about the form or structure of the search objective.
Our algorithm has different biases than beam search, which enables a new analysis of the role of decoding bias in autoregressive models.
arXiv Detail & Related papers (2022-02-23T11:48:26Z)
- NASE: Learning Knowledge Graph Embedding for Link Prediction via Neural Architecture Search [9.634626241415916]
Link prediction is the task of predicting missing connections between entities in a knowledge graph (KG).
Previous work has tried to use Automated Machine Learning (AutoML) to search for the best model for a given dataset.
We propose a novel Neural Architecture Search (NAS) framework for the link prediction task.
arXiv Detail & Related papers (2020-08-18T03:34:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.