Enhancing Retrieval-Augmented Generation with Two-Stage Retrieval: FlashRank Reranking and Query Expansion
- URL: http://arxiv.org/abs/2601.03258v1
- Date: Fri, 17 Oct 2025 15:08:17 GMT
- Title: Enhancing Retrieval-Augmented Generation with Two-Stage Retrieval: FlashRank Reranking and Query Expansion
- Authors: Sherine George
- Abstract summary: RAG couples a retriever with a large language model (LLM) to ground generated responses in external evidence. We propose a two-stage retrieval pipeline that integrates LLM-driven query expansion to improve candidate recall. FlashRank is a fast marginal-utility reranker that dynamically selects an optimal subset of evidence under a token budget.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Retrieval-Augmented Generation (RAG) couples a retriever with a large language model (LLM) to ground generated responses in external evidence. While this framework enhances factuality and domain adaptability, it faces a key bottleneck: balancing retrieval recall with limited LLM context. Retrieving too few passages risks missing critical context, while retrieving too many overwhelms the prompt window, diluting relevance and increasing cost. We propose a two-stage retrieval pipeline that integrates LLM-driven query expansion to improve candidate recall and FlashRank, a fast marginal-utility reranker that dynamically selects an optimal subset of evidence under a token budget. FlashRank models document utility as a weighted combination of relevance, novelty, brevity, and cross-encoder evidence. Together, these modules form a generalizable solution that increases answer accuracy, faithfulness, and computational efficiency.
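The abstract describes the reranker only at a high level, so the following is a minimal sketch of greedy marginal-utility selection under a token budget, not the author's implementation. The weights, the Jaccard-based novelty measure, the whitespace tokenizer, and every name in it (`Passage`, `select_under_budget`, and so on) are assumptions made for illustration. Relevance and cross-encoder scores are assumed to be precomputed and scaled to [0, 1].

```python
from dataclasses import dataclass


@dataclass
class Passage:
    text: str
    relevance: float      # first-stage retriever score in [0, 1] (assumed)
    cross_encoder: float  # cross-encoder score in [0, 1] (assumed)


def tokens(text: str) -> list[str]:
    # Whitespace splitting stands in for a real LLM tokenizer.
    return text.lower().split()


def novelty(candidate: Passage, selected: list[Passage]) -> float:
    # 1 minus the highest Jaccard overlap with any already selected passage.
    cand = set(tokens(candidate.text))
    if not selected or not cand:
        return 1.0
    overlaps = []
    for chosen in selected:
        other = set(tokens(chosen.text))
        overlaps.append(len(cand & other) / len(cand | other))
    return 1.0 - max(overlaps)


def utility(candidate, selected, budget, weights):
    # Weighted combination of relevance, novelty, brevity, and
    # cross-encoder evidence; the weights are hypothetical.
    w_rel, w_nov, w_brev, w_ce = weights
    brevity = 1.0 - min(len(tokens(candidate.text)) / budget, 1.0)
    return (w_rel * candidate.relevance
            + w_nov * novelty(candidate, selected)
            + w_brev * brevity
            + w_ce * candidate.cross_encoder)


def select_under_budget(passages, budget, weights=(0.4, 0.2, 0.1, 0.3)):
    # Greedily add the passage with the highest utility given what is
    # already selected, skipping passages that no longer fit the budget.
    if budget <= 0:
        return []
    selected, remaining, used = [], list(passages), 0
    while remaining:
        fitting = [p for p in remaining
                   if used + len(tokens(p.text)) <= budget]
        if not fitting:
            break
        best = max(fitting,
                   key=lambda p: utility(p, selected, budget, weights))
        selected.append(best)
        remaining.remove(best)
        used += len(tokens(best.text))
    return selected


candidates = [
    Passage("flashrank reranks passages under a token budget", 0.90, 0.90),
    Passage("flashrank reranks passages under a token budget", 0.85, 0.85),
    Passage("query expansion improves candidate recall", 0.60, 0.60),
]
picked = select_under_budget(candidates, budget=14)
```

In this toy run the near-duplicate second passage loses to the lower-scored but novel third passage, which is the behavior the novelty term is meant to encourage. A production system would presumably use the generator's own tokenizer and embedding-based similarity instead of word overlap.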
Related papers
- RouteRAG: Efficient Retrieval-Augmented Generation from Text and Graph via Reinforcement Learning [69.87510139069218]
Retrieval-Augmented Generation (RAG) integrates non-parametric knowledge into Large Language Models (LLMs). Recent progress has advanced text-based RAG to multi-turn reasoning through Reinforcement Learning (RL). We introduce RouteRAG, an RL-based framework that enables LLMs to perform multi-turn and adaptive graph-text hybrid RAG.
arXiv Detail & Related papers (2025-12-10T10:05:31Z)
- MARAG-R1: Beyond Single Retriever via Reinforcement-Learned Multi-Tool Agentic Retrieval [50.30107119622642]
Large Language Models (LLMs) excel at reasoning and generation but are inherently limited by static pretraining data. Retrieval-Augmented Generation (RAG) addresses this issue by grounding LLMs in external knowledge. MARAG-R1 is a reinforcement-learned multi-tool RAG framework that enables LLMs to dynamically coordinate multiple retrieval mechanisms.
arXiv Detail & Related papers (2025-10-31T15:51:39Z)
- Rethinking On-policy Optimization for Query Augmentation [49.87723664806526]
We present the first systematic comparison of prompting-based and RL-based query augmentation across diverse benchmarks. We introduce a novel hybrid method, On-policy Pseudo-document Query Expansion (OPQE), which learns to generate a pseudo-document that maximizes retrieval performance.
arXiv Detail & Related papers (2025-10-20T04:16:28Z)
- Reasoning-enhanced Query Understanding through Decomposition and Interpretation [87.56450566014625]
ReDI is a Reasoning-enhanced approach for query understanding through Decomposition and Interpretation. We compiled a large-scale dataset of real-world complex queries from a major search engine. Experiments on BRIGHT and BEIR demonstrate that ReDI consistently surpasses strong baselines in both sparse and dense retrieval paradigms.
arXiv Detail & Related papers (2025-09-08T10:58:42Z)
- MSRS: Evaluating Multi-Source Retrieval-Augmented Generation [51.717139132190574]
Many real-world applications demand the ability to integrate and summarize information scattered across multiple sources. We present a scalable framework for constructing evaluation benchmarks that challenge RAG systems to integrate information across distinct sources.
arXiv Detail & Related papers (2025-08-28T14:59:55Z)
- DIVER: A Multi-Stage Approach for Reasoning-intensive Information Retrieval [36.38599923075882]
DIVER is a retrieval pipeline designed for reasoning-intensive information retrieval. It consists of four components: the document preprocessing stage, the query expansion stage, the retrieval stage and the reranking stage. On the BRIGHT benchmark, DIVER achieves state-of-the-art nDCG@10 scores of 45.8 overall and 28.9 on original queries, consistently outperforming competitive reasoning-aware models.
arXiv Detail & Related papers (2025-08-11T13:57:49Z)
- Generalized Reinforcement Learning for Retriever-Specific Query Rewriter with Unstructured Real-World Documents [4.200973008100858]
RL-QR is a reinforcement learning framework for retriever-specific query rewriting. RL-QR trains query rewriters tailored to specific retrievers, enhancing retrieval performance across varied domains. Our findings highlight RL-QR's potential to revolutionize query optimization for RAG systems.
arXiv Detail & Related papers (2025-07-31T04:55:21Z)
- Pseudo Relevance Feedback is Enough to Close the Gap Between Small and Large Dense Retrieval Models [29.934928091542375]
Scaling dense retrievers to larger large language model (LLM) backbones has been a dominant strategy for improving their retrieval effectiveness. We introduce PromptPRF, a feature-based pseudo-relevance feedback (PRF) framework that enables small LLM-based dense retrievers to achieve effectiveness comparable to much larger models.
arXiv Detail & Related papers (2025-03-19T04:30:20Z)
- LevelRAG: Enhancing Retrieval-Augmented Generation with Multi-hop Logic Planning over Rewriting Augmented Searchers [24.01783076521377]
Retrieval-Augmented Generation (RAG) is a crucial method for mitigating hallucinations in Large Language Models (LLMs). Existing RAG methods typically employ query rewriting to clarify the user intent and manage multi-hop logic, while using hybrid retrieval to expand search scope. We introduce a high-level searcher that decomposes complex queries into atomic queries, independent of any retriever-specific optimizations. To harness the strengths of sparse retrievers for precise keyword retrieval, we have developed a new sparse searcher that employs Lucene syntax to enhance retrieval accuracy.
arXiv Detail & Related papers (2025-02-25T12:09:16Z)
- GenCRF: Generative Clustering and Reformulation Framework for Enhanced Intent-Driven Information Retrieval [20.807374287510623]
We propose GenCRF: a Generative Clustering and Reformulation Framework to capture diverse intentions adaptively.
We show that GenCRF achieves state-of-the-art performance, surpassing previous query reformulation SOTAs by up to 12% on nDCG@10.
arXiv Detail & Related papers (2024-09-17T05:59:32Z)
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection [74.51523859064802]
We introduce a new framework called Self-Reflective Retrieval-Augmented Generation (Self-RAG).
Self-RAG enhances an LM's quality and factuality through retrieval and self-reflection.
It significantly outperforms state-of-the-art LLMs and retrieval-augmented models on a diverse set of tasks.
arXiv Detail & Related papers (2023-10-17T18:18:32Z)