ProphetNet-Ads: A Looking Ahead Strategy for Generative Retrieval Models
in Sponsored Search Engine
- URL: http://arxiv.org/abs/2010.10789v1
- Date: Wed, 21 Oct 2020 07:03:20 GMT
- Title: ProphetNet-Ads: A Looking Ahead Strategy for Generative Retrieval Models
in Sponsored Search Engine
- Authors: Weizhen Qi, Yeyun Gong, Yu Yan, Jian Jiao, Bo Shao, Ruofei Zhang,
Houqiang Li, Nan Duan, Ming Zhou
- Abstract summary: Generative retrieval models generate outputs token by token on a path of the target library prefix tree (Trie).
We analyze these problems and propose a looking ahead strategy for generative retrieval models named ProphetNet-Ads.
Compared with the recently proposed Trie-based LSTM generative retrieval model, our single-model result and integrated result improve recall by 15.58% and 18.8% respectively with beam size 5.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In a sponsored search engine, generative retrieval models have recently
been proposed to mine relevant advertisement keywords for users' input queries.
Generative retrieval models generate outputs token by token on a path of the
target library prefix tree (Trie), which guarantees all of the generated
outputs are legal and covered by the target library. In actual use, we found
several typical problems caused by Trie-constrained searching length. In this
paper, we analyze these problems and propose a looking ahead strategy for
generative retrieval models named ProphetNet-Ads. ProphetNet-Ads improves the
retrieval ability by directly optimizing the Trie-constrained searching space.
We build a dataset from a real-world sponsored search engine and carry out
experiments to analyze different generative retrieval models. Compared with
the recently proposed Trie-based LSTM generative retrieval model, our single-model
result and integrated result improve recall by 15.58% and 18.8%
respectively with beam size 5. Case studies further demonstrate clearly how
ProphetNet-Ads alleviates these problems.
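The Trie-constrained decoding and looking-ahead idea described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: `Trie`, `beam_search`, and the stand-in `score_fn` are hypothetical names, and ProphetNet-Ads' actual look-ahead strategy differs in detail.

```python
# Sketch: beam search constrained to a keyword Trie, with a simple
# one-step look-ahead bonus. `score_fn` is a stand-in for a trained
# model's token scores; the real system scores with ProphetNet.

END = "</s>"  # marker closing a complete keyword in the Trie


class Trie:
    def __init__(self):
        self.children = {}

    def insert(self, tokens):
        node = self
        for tok in tokens + [END]:
            node = node.children.setdefault(tok, Trie())

    def allowed(self, prefix):
        """Tokens that may legally follow `prefix` on the Trie."""
        node = self
        for tok in prefix:
            node = node.children.get(tok)
            if node is None:
                return []
        return list(node.children)


def beam_search(trie, score_fn, beam_size=5, max_len=8):
    """Beam search restricted to Trie paths: every finished beam is a
    keyword from the target library, so no illegal output is possible."""
    beams = [([], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for tok in trie.allowed(prefix):
                if tok == END:
                    finished.append((prefix, score))
                    continue
                # Looking ahead: also credit the best continuation one step
                # deeper, so a locally weak token that leads to a strong
                # future token is not pruned too early by the narrow beam.
                nxt_prefix = prefix + [tok]
                ahead = max(
                    (score_fn(nxt_prefix + [nxt])
                     for nxt in trie.allowed(nxt_prefix) if nxt != END),
                    default=0.0,
                )
                candidates.append(
                    (nxt_prefix, score + score_fn(nxt_prefix) + ahead)
                )
        beams = sorted(candidates, key=lambda b: -b[1])[:beam_size]
        if not beams:
            break
    return [" ".join(p) for p, _ in sorted(finished, key=lambda b: -b[1])]
```

Because each step only considers `trie.allowed(prefix)`, every output is guaranteed to be a keyword covered by the target library, which is the Trie constraint the abstract describes.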
Related papers
- Enhanced Facet Generation with LLM Editing
In information retrieval, facet identification of a user query is an important task.
Previous studies can enhance facet prediction by leveraging retrieved documents and related queries obtained through a search engine.
However, there are challenges in extending this approach to other applications when a search engine operates as part of the model.
arXiv Detail & Related papers (2024-03-25T00:43:44Z)
- Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z)
- Large Search Model: Redefining Search Stack in the Era of LLMs
We introduce a novel conceptual framework called the large search model, which redefines the conventional search stack by unifying search tasks with one large language model (LLM).
All tasks are formulated as autoregressive text generation problems, allowing for the customization of tasks through the use of natural language prompts.
This proposed framework capitalizes on the strong language understanding and reasoning capabilities of LLMs, offering the potential to enhance search result quality while simultaneously simplifying the existing cumbersome search stack.
arXiv Detail & Related papers (2023-10-23T05:52:09Z)
- On the Robustness of Generative Retrieval Models: An Out-of-Distribution Perspective
We study the robustness of generative retrieval models compared with dense retrieval models from an out-of-distribution (OOD) perspective.
The empirical results indicate that the OOD robustness of generative retrieval models requires enhancement.
arXiv Detail & Related papers (2023-06-22T09:18:52Z)
- Improving Content Retrievability in Search with Controllable Query Generation
Machine-learned search engines have a high retrievability bias, where the majority of the queries return the same entities.
We propose CtrlQGen, a method that generates queries for a chosen underlying intent, either narrow or broad.
Our results on datasets from the domains of music, podcasts, and books reveal that we can significantly decrease the retrievability bias of a dense retrieval model.
arXiv Detail & Related papers (2023-03-21T07:46:57Z)
- Content-Based Search for Deep Generative Models
We introduce the task of content-based model search: given a query and a large set of generative models, find the models that best match the query.
As each generative model produces a distribution of images, we formulate the search task as an optimization problem to select the model with the highest probability of generating content similar to the query.
We demonstrate that our method outperforms several baselines on Generative Model Zoo, a new benchmark we create for the model retrieval task.
arXiv Detail & Related papers (2022-10-06T17:59:51Z)
- CorpusBrain: Pre-train a Generative Retrieval Model for Knowledge-Intensive Language Tasks
A single-step generative model can dramatically simplify the search process and be optimized in an end-to-end manner.
We name the pre-trained generative retrieval model CorpusBrain, as all information about the corpus is encoded in its parameters without the need to construct an additional index.
arXiv Detail & Related papers (2022-08-16T10:22:49Z)
- Autoregressive Search Engines: Generating Substrings as Document Identifiers
Autoregressive language models are emerging as the de facto standard for generating answers.
Previous work has explored ways to partition the search space into hierarchical structures.
In this work we propose an alternative that does not force any structure on the search space: using all n-grams in a passage as its possible identifiers.
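The "all n-grams as identifiers" idea can be sketched with a plain inverted map from n-grams to passage ids. The helper names `ngram_index` and `retrieve` are hypothetical, and the actual system uses a compressed substring index rather than an explicit dictionary; this only illustrates the mapping.

```python
# Sketch: every n-gram occurring in a passage is a valid identifier for
# it, so a generated n-gram maps directly back to matching passages.
from collections import defaultdict


def ngram_index(passages, max_n=3):
    """Map every n-gram (n <= max_n) of each passage to its passage ids."""
    index = defaultdict(set)
    for pid, text in passages.items():
        toks = text.split()
        for n in range(1, max_n + 1):
            for i in range(len(toks) - n + 1):
                index[" ".join(toks[i:i + n])].add(pid)
    return index


def retrieve(index, generated_ngram):
    """Passages whose content contains the generated identifier n-gram."""
    return sorted(index.get(generated_ngram, set()))
```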
arXiv Detail & Related papers (2022-04-22T10:45:01Z)
- Enabling arbitrary translation objectives with Adaptive Tree Search
We introduce an adaptive tree search algorithm that can find high-scoring outputs under translation models that make no assumptions about the form or structure of the search objective.
Our algorithm has different biases from beam search, which enables a new analysis of the role of decoding bias in autoregressive models.
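The contrast with beam search can be illustrated with a minimal best-first tree search: instead of keeping a fixed number of prefixes per depth, it always expands the globally highest-scoring open prefix. This is only a generic sketch of that bias difference, not the paper's adaptive tree search algorithm; all names here are illustrative.

```python
# Sketch: best-first search over a tree of output prefixes. Unlike beam
# search's fixed-width pruning per depth, the node expanded next is the
# highest-scoring open prefix anywhere in the tree.
import heapq


def best_first_search(expand, score, start, is_complete, max_expansions=100):
    """`expand(prefix)` yields child prefixes; returns the first complete
    output reached when expanding nodes in order of decreasing score."""
    heap = [(-score(start), start)]  # max-heap via negated scores
    for _ in range(max_expansions):
        if not heap:
            return None
        _, prefix = heapq.heappop(heap)
        if is_complete(prefix):
            return prefix
        for child in expand(prefix):
            heapq.heappush(heap, (-score(child), child))
    return None
```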
arXiv Detail & Related papers (2022-02-23T11:48:26Z)
- NASE: Learning Knowledge Graph Embedding for Link Prediction via Neural Architecture Search
Link prediction is the task of predicting missing connections between entities in a knowledge graph (KG).
Previous work has tried to use Automated Machine Learning (AutoML) to search for the best model for a given dataset.
We propose a novel Neural Architecture Search (NAS) framework for the link prediction task.
arXiv Detail & Related papers (2020-08-18T03:34:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.