FunnelRAG: A Coarse-to-Fine Progressive Retrieval Paradigm for RAG
- URL: http://arxiv.org/abs/2410.10293v1
- Date: Mon, 14 Oct 2024 08:47:21 GMT
- Title: FunnelRAG: A Coarse-to-Fine Progressive Retrieval Paradigm for RAG
- Authors: Xinping Zhao, Yan Zhong, Zetian Sun, Xinshuo Hu, Zhenyu Liu, Dongfang Li, Baotian Hu, Min Zhang
- Abstract summary: Retrieval-Augmented Generation (RAG) is widely used with Large Language Models.
We propose FunnelRAG, a progressive retrieval paradigm with coarse-to-fine granularity for RAG.
- Score: 22.4664221738095
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieval-Augmented Generation (RAG) is widely used with Large Language Models (LLMs). It mainly consists of retrieval and generation: retrieval modules (a.k.a. retrievers) find useful information to support generation modules (a.k.a. generators), so generator performance largely depends on the effectiveness and efficiency of the retrievers. However, the prevailing retrieval paradigm remains flat: it treats retrieval as a one-off procedure with constant granularity. Despite its effectiveness, we argue that it suffers from two limitations: (1) flat retrieval places a heavy burden on a single retriever; (2) constant granularity caps the ceiling of retrieval performance. In this work, we propose a progressive retrieval paradigm with coarse-to-fine granularity for RAG, termed FunnelRAG, to balance effectiveness and efficiency. Specifically, FunnelRAG builds a progressive retrieval pipeline that pairs coarse-to-fine granularity with large-to-small candidate quantity and low-to-high retriever capacity, which relieves the burden on any single retriever and raises the ceiling of retrieval performance. Extensive experiments show that FunnelRAG achieves comparable retrieval performance while reducing time overhead by nearly 40 percent.
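To make the funnel idea concrete, the following is a minimal, self-contained Python sketch of a coarse-to-fine retrieval pipeline as described above. The stage settings, scorers, and helper names are hypothetical illustrations of the paradigm, not the authors' implementation.

```python
# Minimal sketch of a coarse-to-fine "funnel" retrieval pipeline.
# All stage settings, scorers, and helper names are hypothetical illustrations
# of the paradigm, not the FunnelRAG authors' implementation.
from typing import Callable, List, Optional, Tuple

Stage = Tuple[Callable[[str, str], float], int, Optional[Callable[[str], List[str]]]]


def funnel_retrieve(query: str, units: List[str], stages: List[Stage]) -> List[str]:
    """Re-score a shrinking candidate set with increasingly capable scorers.

    Each stage is (scorer, keep_k, refine): score the current units, keep the
    top keep_k, then optionally split survivors into finer-grained units for
    the next, more expensive stage (coarse-to-fine granularity, large-to-small
    quantity, low-to-high capacity).
    """
    for scorer, keep_k, refine in stages:
        units = sorted(units, key=lambda u: scorer(query, u), reverse=True)[:keep_k]
        if refine is not None:
            units = [piece for u in units for piece in refine(u)]
    return units


# Toy scorers of increasing "capacity"; in practice these might be, e.g.,
# BM25 -> dense bi-encoder -> cross-encoder or LLM reranker.
def word_overlap(q: str, u: str) -> float:
    return float(len(set(q.lower().split()) & set(u.lower().split())))


def density(q: str, u: str) -> float:
    terms = q.lower().split()
    return sum(u.lower().count(t) for t in terms) / (len(u.split()) + 1)


def split_paragraphs(u: str) -> List[str]:
    return [p for p in u.split("\n\n") if p.strip()]


if __name__ == "__main__":
    docs = [
        "retrieval augmented generation pairs a retriever with a generator.\n\n"
        "the retriever narrows a large corpus to a few useful passages.",
        "dense retrievers embed queries and passages into a shared vector space.",
    ]
    print(funnel_retrieve(
        "how does a retriever help the generator",
        docs,
        stages=[(word_overlap, 16, split_paragraphs),  # coarse units, cheap scorer
                (density, 2, None)],                    # fine units, stronger scorer
    ))
```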
Related papers
- DeepRAG: Thinking to Retrieval Step by Step for Large Language Models [92.87532210660456]
We propose DeepRAG, a framework that models retrieval-augmented reasoning as a Markov Decision Process (MDP).
By iteratively decomposing queries, DeepRAG dynamically determines whether to retrieve external knowledge or rely on parametric reasoning at each step.
Experiments show that DeepRAG improves retrieval efficiency and raises answer accuracy by 21.99%, demonstrating its effectiveness in optimizing retrieval-augmented reasoning.
arXiv Detail & Related papers (2025-02-03T08:22:45Z)
- Chain-of-Retrieval Augmented Generation [72.06205327186069]
This paper introduces an approach for training o1-like RAG models that retrieve and reason over relevant information step by step before generating the final answer.
Our proposed method, CoRAG, allows the model to dynamically reformulate the query based on the evolving state.
arXiv Detail & Related papers (2025-01-24T09:12:52Z)
- MST-R: Multi-Stage Tuning for Retrieval Systems and Metric Evaluation [7.552430488883876]
We present a system that adapts the retriever performance to the target domain using a multi-stage tuning strategy.
We benchmark the system performance on the dataset released for the RIRAG challenge.
We achieve significant performance gains, obtaining a top rank on the RegNLP challenge leaderboard.
arXiv Detail & Related papers (2024-12-13T17:53:29Z)
- Toward Optimal Search and Retrieval for RAG [39.69494982983534]
Retrieval-augmented generation (RAG) is a promising method for addressing some of the memory-related challenges associated with Large Language Models (LLMs).
Here, we work towards the goal of understanding how retrievers can be optimized for RAG pipelines on common tasks such as Question Answering (QA).
arXiv Detail & Related papers (2024-11-11T22:06:51Z)
- Towards Competitive Search Relevance For Inference-Free Learned Sparse Retrievers [6.773411876899064]
Inference-free sparse models lag far behind both sparse and dense siamese models in terms of search relevance.
We propose two approaches for performance improvement. First, we introduce the IDF-aware FLOPS loss, which incorporates Inverted Document Frequency (IDF) into the sparsification of representations.
We find that it mitigates the negative impact of the FLOPS regularization on search relevance, allowing the model to achieve a better balance between accuracy and efficiency.
arXiv Detail & Related papers (2024-11-07T03:46:43Z)
- Exploring Demonstration Retrievers in RAG for Coding Tasks: Yeas and Nays! [6.34946724864899]
This paper systematically evaluates the efficiency-effectiveness trade-off of retrievers across three coding tasks.
We show that while BM25 excels in effectiveness, it suffers in efficiency as the knowledge base grows beyond 1000 entries.
In large-scale retrieval, efficiency differences become more pronounced, with approximate dense retrievers offering the greatest gains.
arXiv Detail & Related papers (2024-10-12T22:31:01Z)
- EfficientRAG: Efficient Retriever for Multi-Hop Question Answering [52.64500643247252]
We introduce EfficientRAG, an efficient retriever for multi-hop question answering.
Experimental results demonstrate that EfficientRAG surpasses existing RAG methods on three open-domain multi-hop question-answering datasets.
arXiv Detail & Related papers (2024-08-08T06:57:49Z)
- ReFIT: Relevance Feedback from a Reranker during Inference [109.33278799999582]
Retrieve-and-rerank is a prevalent framework in neural information retrieval.
We propose to leverage the reranker to improve recall by having it provide relevance feedback to the retriever at inference time (a generic sketch of this pattern appears after this list).
arXiv Detail & Related papers (2023-05-19T15:30:33Z)
- LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrieval [55.097573036580066]
Experimental results show that LaPraDoR achieves state-of-the-art performance compared with supervised dense retrieval models.
Compared to re-ranking, our lexicon-enhanced approach can be run in milliseconds (22.5x faster) while achieving superior performance.
arXiv Detail & Related papers (2022-03-11T18:53:12Z)
- Adversarial Retriever-Ranker for dense text retrieval [51.87158529880056]
We present Adversarial Retriever-Ranker (AR2), which consists of a dual-encoder retriever plus a cross-encoder ranker.
AR2 consistently and significantly outperforms existing dense retriever methods.
This includes improvements on Natural Questions R@5 to 77.9% (+2.1%), TriviaQA R@5 to 78.2% (+1.4%), and MS-MARCO MRR@10 to 39.5% (+1.3%).
arXiv Detail & Related papers (2021-10-07T16:41:15Z)
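Several of the entries above (e.g., ReFIT, AR2) build on the standard retrieve-then-rerank framework. Below is a minimal sketch of that generic pattern with an optional relevance-feedback step in the spirit of ReFIT; the function names and the feedback hook are hypothetical and do not reproduce any single paper's method.

```python
# Generic retrieve-then-rerank loop with an optional relevance-feedback step.
# Function names and the feedback hook are hypothetical; this is not any
# single paper's implementation.
from typing import Callable, List, Optional, Tuple

Retriever = Callable[[str, int], List[str]]                     # cheap first-stage retrieval
Reranker = Callable[[str, List[str]], List[Tuple[str, float]]]  # expensive candidate scoring
QueryRefiner = Callable[[str, List[Tuple[str, float]]], str]    # builds a refined query from scores


def retrieve_then_rerank(
    query: str,
    retrieve: Retriever,
    rerank: Reranker,
    refine_query: Optional[QueryRefiner] = None,
    k_retrieve: int = 100,
    k_final: int = 10,
) -> List[str]:
    candidates = retrieve(query, k_retrieve)
    scored = rerank(query, candidates)
    if refine_query is not None:
        # Relevance-feedback flavour: use reranker scores to reshape the query,
        # then retrieve and rerank once more to improve recall at inference time.
        query = refine_query(query, scored)
        scored = rerank(query, retrieve(query, k_retrieve))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in scored[:k_final]]
```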