GQE-PRF: Generative Query Expansion with Pseudo-Relevance Feedback
- URL: http://arxiv.org/abs/2108.06010v1
- Date: Fri, 13 Aug 2021 01:09:02 GMT
- Title: GQE-PRF: Generative Query Expansion with Pseudo-Relevance Feedback
- Authors: Minghui Huang, Dong Wang, Shuang Liu, Meizhen Ding
- Abstract summary: We propose a novel approach which effectively integrates text generation models into PRF-based query expansion.
Our approach generates augmented query terms via neural text generation models conditioned on both the initial query and pseudo-relevance feedback.
We evaluate the performance of our approach on information retrieval tasks using two benchmark datasets.
- Score: 8.142861977776256
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Query expansion with pseudo-relevance feedback (PRF) is a powerful approach
to enhancing effectiveness in information retrieval. Recently, with the rapid
advance of deep learning techniques, neural text generation has achieved
promising success in many natural language tasks. To leverage the strength of
text generation for information retrieval, in this article, we propose a novel
approach which effectively integrates text generation models into PRF-based
query expansion. In particular, our approach generates augmented query terms
via neural text generation models conditioned on both the initial query and
pseudo-relevance feedback. Moreover, in order to train the generative model, we
adopt the conditional generative adversarial nets (CGANs) and propose the
PRF-CGAN method in which both the generator and the discriminator are
conditioned on the pseudo-relevance feedback. We evaluate the performance of
our approach on information retrieval tasks using two benchmark datasets. The
experimental results show that our approach achieves comparable performance or
outperforms traditional query expansion methods on both the retrieval and
reranking tasks.
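For context, classical PRF-based expansion retrieves an initial result set, treats the top-ranked documents as pseudo-relevant, and selects expansion terms from them; the paper's contribution is to replace the term-selection step with a neural generator (trained as a conditional GAN) conditioned on both the query and the feedback. A minimal sketch of the classical baseline, with a toy overlap-based first-pass retriever standing in for a real ranking function:

```python
from collections import Counter

def expand_query(query, corpus, k_docs=2, k_terms=3):
    """Classical pseudo-relevance feedback expansion: retrieve, take the
    top-k documents as pseudo-relevant, and append their most frequent
    novel terms to the query. (GQE-PRF replaces this term-selection step
    with a neural text generator conditioned on query + feedback.)"""
    q_terms = set(query.lower().split())
    # First-pass retrieval: here scored by simple term overlap with the query.
    scored = sorted(corpus,
                    key=lambda d: len(q_terms & set(d.lower().split())),
                    reverse=True)
    feedback = scored[:k_docs]
    # Expansion terms: frequent in the feedback docs, absent from the query.
    counts = Counter(t for d in feedback for t in d.lower().split()
                     if t not in q_terms)
    expansion = [t for t, _ in counts.most_common(k_terms)]
    return query + " " + " ".join(expansion)

corpus = [
    "neural text generation for retrieval",
    "query expansion with feedback documents",
    "cooking recipes",
]
expanded = expand_query("query expansion", corpus)
```

The function names and the toy scoring are illustrative assumptions, not the paper's implementation; in GQE-PRF the `expansion` terms would instead be decoded by the generator.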
Related papers
- Think-then-Act: A Dual-Angle Evaluated Retrieval-Augmented Generation [3.2134014920850364]
Large language models (LLMs) often face challenges such as temporal misalignment and generating hallucinatory content.
We propose a dual-angle evaluated retrieval-augmented generation framework, 'Think-then-Act'.
arXiv Detail & Related papers (2024-06-18T20:51:34Z) - Unified Text-to-Image Generation and Retrieval [96.72318842152148]
We propose a unified framework in the context of Multimodal Large Language Models (MLLMs)
We first explore the intrinsic discriminative abilities of MLLMs and introduce a generative retrieval method to perform retrieval in a training-free manner.
We then unify generation and retrieval in an autoregressive generation way and propose an autonomous decision module to choose the best-matched one between generated and retrieved images.
arXiv Detail & Related papers (2024-06-09T15:00:28Z) - Enhancing Retrieval Processes for Language Generation with Augmented
Queries [0.0]
This research focuses on addressing this issue through Retrieval-Augmented Generation (RAG), a technique that guides models to give accurate responses based on real facts.
To overcome scalability issues, the study explores connecting user queries with sophisticated language models such as BERT and Orca2.
The empirical results indicate a significant improvement in the initial language model's performance under RAG.
arXiv Detail & Related papers (2024-02-06T13:19:53Z) - CorpusLM: Towards a Unified Language Model on Corpus for Knowledge-Intensive Tasks [20.390672895839757]
Retrieval-augmented generation (RAG) has emerged as a popular solution to enhance factual accuracy.
Traditional retrieval modules often rely on a large document index and are disconnected from generative tasks.
We propose CorpusLM, a unified language model that integrates generative retrieval, closed-book generation, and RAG.
arXiv Detail & Related papers (2024-02-02T06:44:22Z) - Evaluating Generative Ad Hoc Information Retrieval [58.800799175084286]
Generative retrieval systems often directly return a grounded generated text as a response to a query.
Quantifying the utility of the textual responses is essential for appropriately evaluating such generative ad hoc retrieval.
arXiv Detail & Related papers (2023-11-08T14:05:00Z) - RegaVAE: A Retrieval-Augmented Gaussian Mixture Variational Auto-Encoder
for Language Modeling [79.56442336234221]
We introduce RegaVAE, a retrieval-augmented language model built upon the variational auto-encoder (VAE)
It encodes the text corpus into a latent space, capturing current and future information from both source and target text.
Experimental results on various datasets demonstrate significant improvements in text generation quality and hallucination removal.
arXiv Detail & Related papers (2023-10-16T16:42:01Z) - QontSum: On Contrasting Salient Content for Query-focused Summarization [22.738731393540633]
Query-focused summarization (QFS) is a challenging task in natural language processing that generates summaries to address specific queries.
This paper highlights the role of QFS in Grounded Answer Generation (GAR)
We propose QontSum, a novel approach for QFS that leverages contrastive learning to help the model attend to the most relevant regions of the input document.
arXiv Detail & Related papers (2023-07-14T19:25:35Z) - Enhancing Retrieval-Augmented Large Language Models with Iterative
Retrieval-Generation Synergy [164.83371924650294]
We show that strong performance can be achieved by a method we call Iter-RetGen, which synergizes retrieval and generation in an iterative manner.
A model output shows what might be needed to finish a task, and thus provides an informative context for retrieving more relevant knowledge.
Iter-RetGen processes all retrieved knowledge as a whole and largely preserves the flexibility in generation without structural constraints.
arXiv Detail & Related papers (2023-05-24T16:17:36Z) - Modified Query Expansion Through Generative Adversarial Networks for
Information Extraction in E-Commerce [1.713291434132985]
This work addresses an alternative approach for query expansion using a generative adversarial network (GAN) to enhance the effectiveness of information search in e-commerce.
We propose a modified QE conditional GAN (mQE-CGAN) framework, which resolves keywords by expanding the query with a synthetically generated query.
Our experiments demonstrate that the utilization of condition structures within the mQE-CGAN framework can increase the semantic similarity between generated sequences and reference documents up to nearly 10%.
arXiv Detail & Related papers (2022-12-30T19:21:44Z) - Generate rather than Retrieve: Large Language Models are Strong Context
Generators [74.87021992611672]
We present a novel perspective for solving knowledge-intensive tasks by replacing document retrievers with large language model generators.
We call our method generate-then-read (GenRead), which first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer.
arXiv Detail & Related papers (2022-09-21T01:30:59Z) - Generation-Augmented Retrieval for Open-domain Question Answering [134.27768711201202]
We propose Generation-Augmented Retrieval (GAR) for answering open-domain questions.
We show that generating diverse contexts for a query is beneficial as fusing their results consistently yields better retrieval accuracy.
GAR achieves state-of-the-art performance on Natural Questions and TriviaQA datasets under the extractive QA setup when equipped with an extractive reader.
arXiv Detail & Related papers (2020-09-17T23:08:01Z)
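The GAR summary above notes that fusing the results of retrieving with diverse generated contexts yields better accuracy. One common way to combine such ranked lists is reciprocal rank fusion (RRF); this is a standard fusion sketch chosen for illustration, not necessarily the exact fusion used in GAR:

```python
def fuse_rankings(rankings, k=60):
    """Reciprocal rank fusion: combine several ranked document lists
    (e.g. one per generated query context) into a single ranking.
    Each document scores sum(1 / (k + rank + 1)) over the lists
    in which it appears; k=60 is the conventional default."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

fused = fuse_rankings([["a", "b", "c"], ["b", "d"]])
```

A document ranked well in several lists (here "b") outranks one ranked well in only a single list, which is why fusing results from diverse generated contexts can beat any single retrieval run.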
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.