GQE-PRF: Generative Query Expansion with Pseudo-Relevance Feedback
- URL: http://arxiv.org/abs/2108.06010v1
- Date: Fri, 13 Aug 2021 01:09:02 GMT
- Title: GQE-PRF: Generative Query Expansion with Pseudo-Relevance Feedback
- Authors: Minghui Huang, Dong Wang, Shuang Liu, Meizhen Ding
- Abstract summary: We propose a novel approach which effectively integrates text generation models into PRF-based query expansion.
Our approach generates augmented query terms via neural text generation models conditioned on both the initial query and pseudo-relevance feedback.
We evaluate the performance of our approach on information retrieval tasks using two benchmark datasets.
- Score: 8.142861977776256
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Query expansion with pseudo-relevance feedback (PRF) is a powerful approach
to enhancing effectiveness in information retrieval. Recently, with the rapid
advance of deep learning techniques, neural text generation has achieved
promising success in many natural language tasks. To leverage the strength of
text generation for information retrieval, in this article, we propose a novel
approach which effectively integrates text generation models into PRF-based
query expansion. In particular, our approach generates augmented query terms
via neural text generation models conditioned on both the initial query and
pseudo-relevance feedback. Moreover, in order to train the generative model, we
adopt the conditional generative adversarial nets (CGANs) and propose the
PRF-CGAN method in which both the generator and the discriminator are
conditioned on the pseudo-relevance feedback. We evaluate the performance of
our approach on information retrieval tasks using two benchmark datasets. The
experimental results show that our approach achieves comparable performance or
outperforms traditional query expansion methods on both the retrieval and
reranking tasks.
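The neural generator described in the abstract is conditioned on pseudo-relevance feedback; as background, the classical PRF pipeline it builds on can be sketched as follows. This is a minimal term-frequency variant (not the paper's generative model), and all documents and names here are illustrative.

```python
# Minimal sketch of classical PRF-based query expansion: take the
# top-ranked documents from an initial retrieval run as pseudo-relevant
# feedback, and append their most frequent non-query terms to the query.
from collections import Counter

def prf_expand(query, feedback_docs, num_terms=3):
    """Pick the most frequent terms in the feedback docs that are not
    already in the query, and append them as expansion terms."""
    query_terms = set(query.lower().split())
    counts = Counter(
        term
        for doc in feedback_docs
        for term in doc.lower().split()
        if term not in query_terms
    )
    expansion = [term for term, _ in counts.most_common(num_terms)]
    return query + " " + " ".join(expansion)

# Hypothetical top-2 documents returned by the initial query.
docs = [
    "neural text generation for query expansion in retrieval",
    "pseudo relevance feedback improves retrieval effectiveness",
]
expanded = prf_expand("query expansion", docs, num_terms=2)
print(expanded)
```

The paper replaces the frequency-based term selection above with a text generation model conditioned on both the query and the feedback documents.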
Related papers
- Token-level Proximal Policy Optimization for Query Generation [45.81132350185301]
State-of-the-art query generation methods leverage Large Language Models (LLMs) for their strong capabilities in context understanding and text generation.
We propose Token-level Proximal Policy Optimization (TPPO), a novel approach designed to empower LLMs to perform better in query generation through fine-tuning.
TPPO is based on the Reinforcement Learning from AI Feedback (RLAIF) paradigm, consisting of a token-level reward model and a token-level proximal policy optimization module.
arXiv Detail & Related papers (2024-11-01T16:36:14Z)
- Retriever-and-Memory: Towards Adaptive Note-Enhanced Retrieval-Augmented Generation [72.70046559930555]
We propose a generic RAG approach called Adaptive Note-Enhanced RAG (Adaptive-Note) for complex QA tasks.
Specifically, Adaptive-Note introduces an overarching view of knowledge growth, iteratively gathering new information in the form of notes.
In addition, we employ an adaptive, note-based stop-exploration strategy to decide "what to retrieve and when to stop" to encourage sufficient knowledge exploration.
arXiv Detail & Related papers (2024-10-11T14:03:29Z)
- GenCRF: Generative Clustering and Reformulation Framework for Enhanced Intent-Driven Information Retrieval [20.807374287510623]
We propose GenCRF: a Generative Clustering and Reformulation Framework to capture diverse intentions adaptively.
We show that GenCRF achieves state-of-the-art performance, surpassing previous query reformulation SOTAs by up to 12% on nDCG@10.
arXiv Detail & Related papers (2024-09-17T05:59:32Z)
- GQE: Generalized Query Expansion for Enhanced Text-Video Retrieval [56.610806615527885]
This paper introduces a novel data-centric approach, Generalized Query Expansion (GQE), to address the inherent information imbalance between text and video.
By adaptively segmenting videos into short clips and employing zero-shot captioning, GQE enriches the training dataset with comprehensive scene descriptions.
GQE achieves state-of-the-art performance on several benchmarks, including MSR-VTT, MSVD, LSMDC, and VATEX.
arXiv Detail & Related papers (2024-08-14T01:24:09Z)
- Enhancing Retrieval Processes for Language Generation with Augmented Queries [0.0]
This research focuses on addressing this issue through Retrieval-Augmented Generation (RAG), a technique that guides models to give accurate responses based on real facts.
To overcome scalability issues, the study explores connecting user queries with sophisticated language models such as BERT and Orca2.
The empirical results indicate a significant improvement in the initial language model's performance under RAG.
arXiv Detail & Related papers (2024-02-06T13:19:53Z)
- Evaluating Generative Ad Hoc Information Retrieval [58.800799175084286]
Generative retrieval systems often directly return grounded generated text as a response to a query.
Quantifying the utility of the textual responses is essential for appropriately evaluating such generative ad hoc retrieval.
arXiv Detail & Related papers (2023-11-08T14:05:00Z)
- Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy [164.83371924650294]
We show that strong performance can be achieved by a method we call Iter-RetGen, which synergizes retrieval and generation in an iterative manner.
A model output shows what might be needed to finish a task, and thus provides an informative context for retrieving more relevant knowledge.
Iter-RetGen processes all retrieved knowledge as a whole and largely preserves the flexibility in generation without structural constraints.
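The iterate-between-retrieval-and-generation pattern described above can be sketched as a simple loop. The retriever and generator below are toy stand-ins (a word-overlap ranker and an echo function, both hypothetical); only the control flow reflects the Iter-RetGen idea.

```python
# Sketch of an iterative retrieval-generation loop: each model output is
# used as the query for the next retrieval round, so generation guides
# retrieval and retrieval informs generation.

def retrieve(query, corpus, k=2):
    """Toy lexical retriever: rank docs by word overlap with the query."""
    query_words = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda doc: len(set(doc.lower().split()) & query_words),
        reverse=True,
    )
    return scored[:k]

def generate(question, evidence):
    """Stand-in for an LLM call: concatenates question and evidence."""
    return question + " | " + " ".join(evidence)

def iter_retgen(question, corpus, iterations=2):
    output = question  # first retrieval is conditioned on the question alone
    for _ in range(iterations):
        evidence = retrieve(output, corpus)    # retrieve with prior output
        output = generate(question, evidence)  # regenerate with new evidence
    return output

corpus = [
    "paris is the capital of france",
    "the eiffel tower is in paris",
    "bread is a staple food",
]
answer = iter_retgen("what is the capital of france", corpus)
print(answer)
```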
arXiv Detail & Related papers (2023-05-24T16:17:36Z)
- Modified Query Expansion Through Generative Adversarial Networks for Information Extraction in E-Commerce [1.713291434132985]
This work addresses an alternative approach for query expansion using a generative adversarial network (GAN) to enhance the effectiveness of information search in e-commerce.
We propose a modified QE conditional GAN (mQE-CGAN) framework, which resolves keywords by expanding the query with a synthetically generated query.
Our experiments demonstrate that the utilization of condition structures within the mQE-CGAN framework can increase the semantic similarity between generated sequences and reference documents up to nearly 10%.
arXiv Detail & Related papers (2022-12-30T19:21:44Z)
- AugTriever: Unsupervised Dense Retrieval and Domain Adaptation by Scalable Data Augmentation [44.93777271276723]
We propose two approaches that enable annotation-free and scalable training by creating pseudo query-document pairs.
The query extraction method involves selecting salient spans from the original document to generate pseudo queries.
The transferred query generation method utilizes generation models trained for other NLP tasks, such as summarization, to produce pseudo queries.
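The query-extraction idea (pairing a document with one of its own salient spans) can be sketched as below. The salience heuristic here, picking the sentence sharing the most words with the rest of the document, is an illustrative assumption; AugTriever's actual selection strategies differ.

```python
# Sketch of pseudo-query extraction: select a salient span from a document
# and treat it as the query for that document, yielding an annotation-free
# training pair for a dense retriever.

def extract_pseudo_query(document):
    sentences = [s.strip() for s in document.split(".") if s.strip()]

    def salience(sentence):
        # Toy heuristic: overlap with the rest of the document.
        others = " ".join(s for s in sentences if s != sentence)
        return len(set(sentence.lower().split()) & set(others.lower().split()))

    return max(sentences, key=salience)

doc = (
    "Dense retrieval maps queries and documents into one vector space. "
    "Training dense retrieval models usually needs labeled query-document pairs. "
    "The weather was pleasant that day."
)
pseudo_query = extract_pseudo_query(doc)
pair = (pseudo_query, doc)  # pseudo query-document training pair
print(pseudo_query)
```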
arXiv Detail & Related papers (2022-12-17T10:43:25Z)
- Generate rather than Retrieve: Large Language Models are Strong Context Generators [74.87021992611672]
We present a novel perspective for solving knowledge-intensive tasks by replacing document retrievers with large language model generators.
We call our method generate-then-read (GenRead), which first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer.
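The two-step pattern reduces to two model calls with no retriever in between. The `llm` function below is a placeholder stub returning canned text, not a real API; only the generate-then-read control flow is the point.

```python
# Sketch of generate-then-read: (1) generate a contextual document for the
# question, (2) read that generated document to produce the answer.

def llm(prompt):
    """Stand-in for a large-language-model call (canned demo outputs)."""
    if prompt.startswith("Generate a background document"):
        return "Paris is the capital and largest city of France."
    return "Paris"

def generate_then_read(question):
    # Step 1: generate a contextual document instead of retrieving one.
    context = llm("Generate a background document for: " + question)
    # Step 2: read the generated document to answer the question.
    answer = llm(f"Context: {context}\nQuestion: {question}\nAnswer:")
    return context, answer

context, answer = generate_then_read("What is the capital of France?")
print(answer)
```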
arXiv Detail & Related papers (2022-09-21T01:30:59Z)
- Generation-Augmented Retrieval for Open-domain Question Answering [134.27768711201202]
We propose Generation-Augmented Retrieval (GAR) for answering open-domain questions.
We show that generating diverse contexts for a query is beneficial as fusing their results consistently yields better retrieval accuracy.
GAR achieves state-of-the-art performance on Natural Questions and TriviaQA datasets under the extractive QA setup when equipped with an extractive reader.
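Fusing the results retrieved with diverse generated contexts can be sketched with reciprocal rank fusion (RRF); RRF is used here as a simple, standard stand-in, and GAR's own fusion details may differ. The rankings are hypothetical.

```python
# Sketch of rank fusion: combine several ranked lists of document ids
# (one per generated query context) into a single fused ranking, scoring
# each doc by the sum of reciprocal ranks across lists (RRF, k=60).

def rrf_fuse(ranked_lists, k=60):
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical rankings from three diverse query contexts.
runs = [
    ["d2", "d1", "d3"],
    ["d1", "d2", "d4"],
    ["d1", "d3", "d2"],
]
fused = rrf_fuse(runs)
print(fused)
```

A document ranked highly by several query variants (here `d1`) rises above one that only a single variant favors, which is why fusing diverse contexts improves retrieval accuracy.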
arXiv Detail & Related papers (2020-09-17T23:08:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.