Expand, Rerank, and Retrieve: Query Reranking for Open-Domain Question
Answering
- URL: http://arxiv.org/abs/2305.17080v1
- Date: Fri, 26 May 2023 16:41:03 GMT
- Title: Expand, Rerank, and Retrieve: Query Reranking for Open-Domain Question
Answering
- Authors: Yung-Sung Chuang, Wei Fang, Shang-Wen Li, Wen-tau Yih, James Glass
- Abstract summary: EAR first applies a query expansion model to generate a diverse set of queries, and then uses a query reranker to select the ones that could lead to better retrieval results.
By better connecting the query expansion model and the retriever, EAR significantly enhances a traditional sparse retrieval method, BM25.
- Score: 28.05138829730091
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose EAR, a query Expansion And Reranking approach for improving
passage retrieval, with the application to open-domain question answering. EAR
first applies a query expansion model to generate a diverse set of queries, and
then uses a query reranker to select the ones that could lead to better
retrieval results. Motivated by the observation that the best query expansion
often is not picked by greedy decoding, EAR trains its reranker to predict the
rank orders of the gold passages when issuing the expanded queries to a given
retriever. By better connecting the query expansion model and the retriever, EAR
significantly enhances a traditional sparse retrieval method, BM25.
Empirically, EAR improves top-5/20 accuracy by 3-8 and 5-10 points in in-domain
and out-of-domain settings, respectively, when compared to a vanilla query
expansion model, GAR, and a dense retrieval model, DPR.
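To make the pipeline concrete, below is a minimal sketch of the expand-rerank-retrieve control flow described in the abstract. The expansion model and the trained query reranker are replaced by hypothetical stubs (expand_query, score_query), and a toy lexical scorer stands in for BM25; this illustrates the idea only, not the paper's actual models or training procedure.

```python
# Minimal sketch of an EAR-style inference pipeline: expand the question into
# several candidate queries, rerank the candidates, retrieve with the best one.
# expand_query and score_query are hypothetical stubs; a real system would use a
# seq2seq expansion model, a trained query reranker, and BM25 instead of the toy
# lexical scorer below.
from collections import Counter
from typing import List

CORPUS = [
    "BM25 is a sparse lexical retrieval method based on term matching.",
    "Dense passage retrieval encodes questions and passages into vectors.",
    "Query expansion adds related terms to the original question.",
]

def tokenize(text: str) -> List[str]:
    return text.lower().split()

def lexical_score(query: str, passage: str) -> float:
    # Toy stand-in for BM25: count overlapping terms.
    q, p = Counter(tokenize(query)), Counter(tokenize(passage))
    return float(sum(min(q[t], p[t]) for t in q))

def expand_query(question: str, n: int = 4) -> List[str]:
    # Placeholder for a diverse query expansion model (e.g. GAR-style generation).
    suffixes = ["retrieval", "BM25 term matching", "dense vectors", "expansion terms"]
    return [f"{question} {s}" for s in suffixes[:n]]

def score_query(query: str) -> float:
    # Placeholder for EAR's trained query reranker, which predicts how well a
    # candidate query will rank the gold passage. Here: best passage score.
    return max(lexical_score(query, p) for p in CORPUS)

def retrieve(question: str, k: int = 2) -> List[str]:
    candidates = expand_query(question)
    best_query = max(candidates, key=score_query)  # rerank the expanded queries
    ranked = sorted(CORPUS, key=lambda p: lexical_score(best_query, p), reverse=True)
    return ranked[:k]

print(retrieve("what is query expansion"))
```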
Related papers
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval [54.54576644403115]
Many complex real-world queries require in-depth reasoning to identify relevant documents.
We introduce BRIGHT, the first text retrieval benchmark that requires intensive reasoning to retrieve relevant documents.
Our dataset consists of 1,384 real-world queries spanning diverse domains, such as economics, psychology, mathematics, and coding.
arXiv Detail & Related papers (2024-07-16T17:58:27Z)
- Database-Augmented Query Representation for Information Retrieval [59.57065228857247]
We present a novel retrieval framework called Database-Augmented Query representation (DAQu).
DAQu augments the original query with various (query-related) metadata across multiple tables.
We validate DAQu in diverse retrieval scenarios that can incorporate metadata from the relational database.
arXiv Detail & Related papers (2024-06-23T05:02:21Z)
- Adaptive Query Rewriting: Aligning Rewriters through Marginal Probability of Conversational Answers [66.55612528039894]
AdaQR is a framework for training query rewriting models with limited rewrite annotations from seed datasets and no passage labels at all.
A novel approach assesses the retriever's preference for these candidates via the probability of the answers conditioned on the conversational query (a rough sketch of this preference signal appears after this list).
arXiv Detail & Related papers (2024-06-16T16:09:05Z)
- Ask Optimal Questions: Aligning Large Language Models with Retriever's Preference in Conversational Search [25.16282868262589]
RetPO is designed to optimize a language model (LM) for reformulating search queries in line with the preferences of the target retrieval systems.
We construct a large-scale dataset called Retrievers' Feedback on over 410K query rewrites across 12K conversations.
The resulting model achieves state-of-the-art performance on two recent conversational search benchmarks.
arXiv Detail & Related papers (2024-02-19T04:41:31Z)
- Can Query Expansion Improve Generalization of Strong Cross-Encoder Rankers? [72.42500059688396]
We show that it is possible to improve the generalization of a strong neural ranker by prompt engineering and by aggregating the ranking results of each expanded query via fusion (a fusion sketch appears after this list).
Experiments on BEIR and TREC Deep Learning show that the nDCG@10 scores of both MonoT5 and RankT5 following these steps are improved.
arXiv Detail & Related papers (2023-11-15T18:11:41Z)
- ReFIT: Relevance Feedback from a Reranker during Inference [109.33278799999582]
Retrieve-and-rerank is a prevalent framework in neural information retrieval.
We propose to leverage the reranker to improve recall by making it provide relevance feedback to the retriever at inference time (a simplified sketch of this feedback loop follows the list below).
arXiv Detail & Related papers (2023-05-19T15:30:33Z)
- Decoding a Neural Retriever's Latent Space for Query Suggestion [28.410064376447718]
We show that it is possible to decode a meaningful query from its latent representation and, when moving in the right direction in latent space, to decode a query that retrieves the relevant paragraph.
We employ the query decoder to generate a large synthetic dataset of query reformulations for MSMarco.
On this data, we train a pseudo-relevance feedback (PRF) T5 model for the application of query suggestion.
arXiv Detail & Related papers (2022-10-21T16:19:31Z)
- Query Expansion and Entity Weighting for Query Reformulation Retrieval in Voice Assistant Systems [6.590172620606211]
Voice assistants such as Alexa, Siri, and Google Assistant have become increasingly popular worldwide.
Linguistic variations, variability of speech patterns, ambient acoustic conditions, and other such factors are often correlated with the assistant misinterpreting the user's query.
Retrieval-based query reformulation (QR) systems are widely used to reformulate those misinterpreted user queries.
arXiv Detail & Related papers (2022-02-22T23:03:29Z)
- Learning Query Expansion over the Nearest Neighbor Graph [94.80212602202518]
The paper presents Graph Query Expansion (GQE), which is learned in a supervised manner and aggregates over an extended neighborhood of the query on the nearest neighbor graph (a simplified sketch appears after this list).
The technique achieves state-of-the-art results over known benchmarks.
arXiv Detail & Related papers (2021-12-05T19:48:42Z)
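As referenced in the Adaptive Query Rewriting (AdaQR) entry above, one reading of its preference signal is: score each candidate rewrite by how probable the answer is when conditioned on what that rewrite retrieves. The sketch below is a rough, assumption-laden rendering of that idea; retrieve_passages and answer_logprob are hypothetical stubs, and the simple average is only an approximation of the paper's marginalization over passages.

```python
# Rough sketch of scoring rewrite candidates by answer probability given retrieval.
# retrieve_passages and answer_logprob are hypothetical stubs, not the paper's models.
from typing import List

def retrieve_passages(rewrite: str, k: int = 3) -> List[str]:
    # Stub retriever: a real system would call BM25 or a dense retriever here.
    return [f"passage about {rewrite}"] * k

def answer_logprob(answer: str, rewrite: str, passage: str) -> float:
    # Stub for log p(answer | rewrite, passage); a real system would use a reader LM.
    return -float(abs(len(answer) - len(set(rewrite.split()) & set(passage.split()))))

def rewrite_preference(answer: str, rewrite: str) -> float:
    # Approximate the marginal over retrieved passages with a simple average.
    passages = retrieve_passages(rewrite)
    return sum(answer_logprob(answer, rewrite, p) for p in passages) / len(passages)

candidates = ["what is query rewriting", "query rewriting for conversational search"]
best = max(candidates, key=lambda r: rewrite_preference("a method to reformulate queries", r))
print(best)
```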
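For the cross-encoder query-expansion entry above, the fusion step can be illustrated with reciprocal rank fusion (RRF), one common choice for aggregating per-query rankings; the paper may use a different aggregation, and rank_with_cross_encoder below is a hypothetical stand-in for a MonoT5/RankT5-style ranker.

```python
# Sketch: rank documents separately for each expanded query, then fuse the rankings
# with reciprocal rank fusion (RRF). The cross-encoder is replaced by a toy stub.
from collections import defaultdict
from typing import Dict, List

def rank_with_cross_encoder(query: str, doc_ids: List[str]) -> List[str]:
    # Placeholder for a cross-encoder ranker; here a toy term-overlap heuristic.
    return sorted(doc_ids,
                  key=lambda d: len(set(query.split()) & set(d.split("_"))),
                  reverse=True)

def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[str]:
    # Each document gets 1 / (k + rank) from every ranking it appears in.
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

doc_ids = ["bm25_sparse_retrieval", "dense_passage_retrieval", "query_expansion_terms"]
expanded_queries = ["query expansion", "query expansion sparse retrieval", "expansion terms"]
per_query_rankings = [rank_with_cross_encoder(q, doc_ids) for q in expanded_queries]
print(rrf_fuse(per_query_rankings))
```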
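For the ReFIT entry above, the sketch below shows a simplified version of reranker-to-retriever relevance feedback at inference time: retrieve once, score the top passages with a reranker, pull the query embedding toward the passages the reranker prefers, and retrieve again. The Rocchio-style update and the interpolation weight alpha are assumptions; the paper distills the reranker's score distribution into the query representation, and all vectors here are random stand-ins.

```python
# Simplified reranker-feedback loop: first-pass retrieval, reranker-weighted query
# update, second-pass retrieval. Embeddings and reranker scores are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)
doc_vecs = rng.normal(size=(100, 64))      # stand-in passage embeddings
query_vec = rng.normal(size=64)            # stand-in query embedding

def retrieve(q: np.ndarray, k: int = 10) -> np.ndarray:
    return np.argsort(doc_vecs @ q)[::-1][:k]

top = retrieve(query_vec)
rerank_scores = rng.normal(size=top.size)  # stand-in cross-encoder scores
weights = np.exp(rerank_scores) / np.exp(rerank_scores).sum()  # softmax over top-k

# Pull the query toward passages the reranker prefers, then retrieve a second time.
alpha = 0.5  # assumed interpolation weight
refined_query = (1 - alpha) * query_vec + alpha * (weights @ doc_vecs[top])
improved_top = retrieve(refined_query)
print(improved_top[:5])
```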
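For the Learning Query Expansion over the Nearest Neighbor Graph entry above, the sketch below replaces the paper's learned, supervised aggregation with a simple similarity-weighted mean over an extended (two-hop) neighborhood of the query; it illustrates the neighborhood aggregation only, with random embeddings as stand-ins.

```python
# Sketch of graph-based query expansion: gather the query's neighbors and their
# neighbors on the kNN graph, then aggregate them into an expanded query vector.
import numpy as np

rng = np.random.default_rng(1)
db = rng.normal(size=(200, 32))
db /= np.linalg.norm(db, axis=1, keepdims=True)
query = rng.normal(size=32)
query /= np.linalg.norm(query)

def neighbors(vec: np.ndarray, k: int = 5) -> np.ndarray:
    return np.argsort(db @ vec)[::-1][:k]

# Extended neighborhood: direct neighbors plus their neighbors (two hops).
first_hop = neighbors(query)
extended = np.unique(np.concatenate([first_hop] + [neighbors(db[i]) for i in first_hop]))

# Similarity-weighted mean as a stand-in for the learned aggregation.
sims = db[extended] @ query
weights = np.clip(sims, 0, None)
expanded_query = query + (weights @ db[extended]) / (weights.sum() + 1e-9)
expanded_query /= np.linalg.norm(expanded_query)
print(neighbors(expanded_query, k=10))
```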