CAFE: Retrieval Head-based Coarse-to-Fine Information Seeking to Enhance Multi-Document QA Capability
- URL: http://arxiv.org/abs/2505.10063v1
- Date: Thu, 15 May 2025 08:05:12 GMT
- Title: CAFE: Retrieval Head-based Coarse-to-Fine Information Seeking to Enhance Multi-Document QA Capability
- Authors: Han Peng, Jinhao Jiang, Zican Dong, Wayne Xin Zhao, Lei Fang,
- Abstract summary: We introduce $\textbf{CAFE}$, a two-stage coarse-to-fine method to enhance multi-document question-answering capacities. CAFE achieves up to 22.1% and 13.7% SubEM improvement over SFT and RAG methods on the Mistral model, respectively.
- Score: 55.46506909726119
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Advancements in Large Language Models (LLMs) have extended their input context length, yet they still struggle with retrieval and reasoning in long-context inputs. Existing methods propose to utilize the prompt strategy and retrieval head to alleviate this limitation. However, they still face challenges in balancing retrieval precision and recall, impacting their efficacy in answering questions. To address this, we introduce $\textbf{CAFE}$, a two-stage coarse-to-fine method to enhance multi-document question-answering capacities. By gradually eliminating the negative impacts of background and distracting documents, CAFE makes the responses more reliant on the evidence documents. Initially, a coarse-grained filtering method leverages retrieval heads to identify and rank relevant documents. Then, a fine-grained steering method guides attention to the most relevant content. Experiments across benchmarks show CAFE outperforms baselines, achieving up to 22.1% and 13.7% SubEM improvement over SFT and RAG methods on the Mistral model, respectively.
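The two stages described in the abstract can be sketched in a toy form. Everything below is illustrative: the function names, document spans, and attention weights are hypothetical, and the real method derives its scores from the model's retrieval heads during decoding rather than from hand-made matrices.

```python
# Toy sketch of a coarse-to-fine document filter in the spirit of CAFE.
# Stage 1 scores documents by retrieval-head attention mass; stage 2
# steers attention toward the surviving evidence documents.

def coarse_filter(head_attention, doc_spans, keep=2):
    """Stage 1: score each document by the attention mass the retrieval
    heads place on its token span, then keep the top-k documents."""
    scores = {}
    for doc_id, (start, end) in doc_spans.items():
        scores[doc_id] = sum(sum(head[start:end]) for head in head_attention)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep], scores

def fine_steer(attention_row, doc_spans, evidence_docs, boost=2.0):
    """Stage 2: re-weight one attention row so evidence documents
    receive proportionally more mass, then renormalise."""
    steered = list(attention_row)
    for doc_id in evidence_docs:
        start, end = doc_spans[doc_id]
        for i in range(start, end):
            steered[i] *= boost
    total = sum(steered)
    return [w / total for w in steered]

# Toy setup: 6 context tokens split across three documents.
doc_spans = {"doc_a": (0, 2), "doc_b": (2, 4), "doc_c": (4, 6)}
head_attention = [
    [0.05, 0.05, 0.30, 0.30, 0.15, 0.15],  # retrieval head 1
    [0.10, 0.10, 0.25, 0.25, 0.15, 0.15],  # retrieval head 2
]
kept, scores = coarse_filter(head_attention, doc_spans, keep=2)
steered = fine_steer(head_attention[0], doc_spans, kept)
print(kept)                    # ['doc_b', 'doc_c']
print(round(sum(steered), 6))  # 1.0 -- the row is still a distribution
```

Gradually discarding low-scoring documents before steering is what lets the method trade off precision and recall: the coarse stage controls recall, the fine stage sharpens precision within the kept set.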
Related papers
- FrugalRAG: Learning to retrieve and reason for multi-hop QA [10.193015391271535]
Large-scale fine-tuning is not needed to improve RAG metrics. Supervised and RL-based fine-tuning can help RAG from the perspective of frugality.
arXiv Detail & Related papers (2025-07-10T11:02:13Z) - ImpliRet: Benchmarking the Implicit Fact Retrieval Challenge [49.65993318863458]
ImpliRet is a benchmark that shifts the reasoning challenge to document-side processing. We evaluate a range of sparse and dense retrievers, all of which struggle in this setting.
arXiv Detail & Related papers (2025-06-17T11:08:29Z) - Document Attribution: Examining Citation Relationships using Large Language Models [62.46146670035751]
We propose a zero-shot approach that frames attribution as a straightforward textual entailment task. We also explore the role of the attention mechanism in enhancing the attribution process.
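Framing attribution as entailment means treating each candidate source passage as the premise and the generated statement as the hypothesis. A minimal sketch of that framing follows; the word-overlap scorer here is a deliberately crude stand-in (the paper would use an LLM or NLI model), and all names and passages are invented for illustration.

```python
# Sketch of zero-shot attribution framed as textual entailment:
# premise = candidate source passage, hypothesis = generated statement.

def overlap_entailment(premise: str, hypothesis: str) -> float:
    """Toy premise->hypothesis score: fraction of hypothesis words that
    appear in the premise. A real system would call an NLI/LLM judge."""
    premise_words = set(premise.lower().split())
    hypothesis_words = hypothesis.lower().split()
    return sum(w in premise_words for w in hypothesis_words) / len(hypothesis_words)

def attribute(statement: str, passages: dict, threshold: float = 0.6):
    """Return the passages whose entailment score clears the threshold,
    best-scoring first."""
    scored = {pid: overlap_entailment(text, statement)
              for pid, text in passages.items()}
    return sorted((pid for pid, s in scored.items() if s >= threshold),
                  key=lambda pid: -scored[pid])

passages = {
    "p1": "the eiffel tower was completed in 1889 in paris",
    "p2": "the louvre is the largest art museum in the world",
}
print(attribute("the eiffel tower was completed in 1889", passages))  # ['p1']
```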
arXiv Detail & Related papers (2025-05-09T04:40:11Z) - SUNAR: Semantic Uncertainty based Neighborhood Aware Retrieval for Complex QA [2.7703990035016868]
We introduce SUNAR, a novel approach that leverages large language models to guide a Neighborhood Aware Retrieval process. We validate our approach through extensive experiments on two complex QA datasets. Our results show that SUNAR significantly outperforms existing retrieve-and-reason baselines, achieving up to a 31.84% improvement in performance.
arXiv Detail & Related papers (2025-03-23T08:50:44Z) - Vietnamese Legal Information Retrieval in Question-Answering System [0.0]
Retrieval Augmented Generation (RAG) has gained significant recognition for enhancing the capabilities of large language models (LLMs)
However, RAG often falls short when applied to the Vietnamese language due to several challenges.
This report introduces our three main modifications taken to address these challenges.
arXiv Detail & Related papers (2024-09-05T02:34:05Z) - QPaug: Question and Passage Augmentation for Open-Domain Question Answering of LLMs [5.09189220106765]
We propose a simple yet efficient method called question and passage augmentation (QPaug) via large language models (LLMs) for open-domain question-answering tasks.
Experimental results show that QPaug outperforms the previous state-of-the-art and achieves significant performance gain over existing RAG methods.
arXiv Detail & Related papers (2024-06-20T12:59:27Z) - Read and Think: An Efficient Step-wise Multimodal Language Model for Document Understanding and Reasoning [0.0]
Existing document understanding models tend to generate answers with a single word or phrase directly.
We use Multi-modal Large Language Models (MLLMs) to generate step-wise question-and-answer pairs for document images.
We then use the generated high-quality data to train a humanized document understanding and reasoning model, dubbed DocAssistant.
arXiv Detail & Related papers (2024-02-26T01:17:50Z) - Strong and Efficient Baselines for Open Domain Conversational Question Answering [2.773656427800412]
We study the State-of-the-Art (SotA) Dense Passage Retrieval (DPR) retriever and Fusion-in-Decoder (FiD) reader pipeline.
We propose and evaluate strong yet simple and efficient baselines, by introducing a fast reranking component between the retriever and the reader.
Experiments on two ODConvQA tasks, namely TopiOCQA and OR-QuAC, show that our method improves the SotA results, while reducing the reader's latency by 60%.
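The latency gain comes from inserting a cheap reranking stage between the retriever and the reader, so the expensive reader processes fewer passages. A minimal sketch of that pipeline position follows; the lexical scorer is an illustrative stand-in, not the reranker used in the paper.

```python
# Sketch of a fast reranking stage between retriever and reader: rerank the
# retrieved passages cheaply, then pass only the top few to the reader.

def cheap_rerank(question: str, passages: list, keep: int = 2):
    """Rank passages by normalised query-term overlap and keep the top-k,
    shortening the reader's input."""
    q_terms = set(question.lower().split())

    def score(passage: str) -> float:
        terms = passage.lower().split()
        return sum(t in q_terms for t in terms) / len(terms)

    return sorted(passages, key=score, reverse=True)[:keep]

retrieved = [
    "the capital of france is paris",
    "france borders spain and italy",
    "paris hosts the louvre museum",
    "bread is a staple food",
]
top = cheap_rerank("what is the capital of france", retrieved, keep=2)
print(top[0])  # the best lexical match reaches the reader first
```

Because the reranker runs over raw text with no model calls, its cost is negligible next to the reader's, which is how this design can cut reader latency without retraining the retriever.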
arXiv Detail & Related papers (2023-10-23T08:48:14Z) - Noise-Robust Dense Retrieval via Contrastive Alignment Post Training [89.29256833403167]
Contrastive Alignment POst Training (CAPOT) is a highly efficient finetuning method that improves model robustness without requiring index regeneration.
CAPOT enables robust retrieval by freezing the document encoder while the query encoder learns to align noisy queries with their unaltered root.
We evaluate CAPOT on noisy variants of MSMARCO, Natural Questions, and Trivia QA passage retrieval, finding CAPOT has a similar impact as data augmentation with none of its overhead.
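The alignment objective described above can be illustrated with a toy contrastive loss: the unaltered root query is the positive for its noisy variant, while other queries act as negatives. The vectors and names below are invented for illustration; the real method trains a query encoder against a frozen document index.

```python
import math

# Toy sketch of contrastive alignment post-training: pull a noisy query's
# embedding toward its unaltered root query while pushing it from negatives.
# The document encoder (not shown) stays frozen throughout.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def contrastive_alignment_loss(noisy_q, root_q, negatives, temp=1.0):
    """InfoNCE-style loss with the root query as the positive."""
    pos = math.exp(dot(noisy_q, root_q) / temp)
    neg = sum(math.exp(dot(noisy_q, n) / temp) for n in negatives)
    return -math.log(pos / (pos + neg))

root    = [1.0, 0.0]
aligned = [0.9, 0.1]   # noisy query whose embedding stayed close to the root
drifted = [0.0, 1.0]   # noisy query whose embedding drifted away
negs    = [[-1.0, 0.0], [0.0, -1.0]]

loss_good = contrastive_alignment_loss(aligned, root, negs)
loss_bad  = contrastive_alignment_loss(drifted, root, negs)
print(loss_good < loss_bad)  # True: alignment with the root lowers the loss
```

Freezing the document encoder is what removes the index-regeneration cost: only query-side parameters move, so existing document embeddings remain valid.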
arXiv Detail & Related papers (2023-04-06T22:16:53Z) - Evidentiality-aware Retrieval for Overcoming Abstractiveness in Open-Domain Question Answering [29.00167886463793]
We propose Evidentiality-Aware Passage Retrieval (EADPR) to learn to discriminate evidence passages from distractors.
We conduct extensive experiments to validate the effectiveness of our proposed method on multiple abstractive ODQA tasks.
arXiv Detail & Related papers (2023-04-06T12:42:37Z) - Augmenting Document Representations for Dense Retrieval with Interpolation and Perturbation [49.940525611640346]
The Document Augmentation for dense Retrieval (DAR) framework augments the representations of documents with their interpolations and perturbations.
We validate the performance of DAR on retrieval tasks with two benchmark datasets, showing that the proposed DAR significantly outperforms relevant baselines on the dense retrieval of both the labeled and unlabeled documents.
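The two augmentations named in the title can be sketched directly on embedding vectors: interpolation blends a labeled document's representation with a neighbour's, and perturbation adds small noise to a single representation. The values and helper names below are toy illustrations, not the paper's implementation.

```python
import random

# Hand-rolled sketch of interpolation and perturbation applied to
# document embeddings, in the spirit of the DAR framework.

def interpolate(doc_a, doc_b, lam=0.5):
    """Mixup-style interpolation between two document embeddings."""
    return [lam * a + (1 - lam) * b for a, b in zip(doc_a, doc_b)]

def perturb(doc, scale=0.01, rng=None):
    """Add small Gaussian noise to one document embedding."""
    rng = rng or random.Random(0)  # seeded for reproducibility
    return [x + rng.gauss(0.0, scale) for x in doc]

labeled   = [0.2, 0.8, 0.4]
unlabeled = [0.6, 0.0, 0.4]
mixed = interpolate(labeled, unlabeled, lam=0.25)
noisy = perturb(labeled, scale=0.01)
print(mixed)  # each coordinate lies between the two source embeddings
```

Both operations produce extra positive representations for free, which is why DAR can help on unlabeled documents as well as labeled ones.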
arXiv Detail & Related papers (2022-03-15T09:07:38Z) - Pre-training Tasks for Embedding-based Large-scale Retrieval [68.01167604281578]
We consider the large-scale query-document retrieval problem.
Given a query (e.g., a question), return the set of relevant documents from a large document corpus.
We show that the key ingredient of learning a strong embedding-based Transformer model is the set of pre-training tasks.
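One representative pre-training task studied in that line of work is the Inverse Cloze Task, which manufactures query-document pairs from raw paragraphs with no labels. A minimal sketch follows; the passage text and function name are invented for illustration.

```python
import random

# Sketch of the Inverse Cloze Task: hold out one sentence of a passage as a
# pseudo-query; the remaining sentences form its positive pseudo-document.

def inverse_cloze(sentences, rng=None):
    """Sample one sentence as the query; join the rest into the document."""
    rng = rng or random.Random(0)  # seeded for reproducibility
    i = rng.randrange(len(sentences))
    query = sentences[i]
    document = " ".join(s for j, s in enumerate(sentences) if j != i)
    return query, document

passage = [
    "Zebras live in Africa.",
    "They eat mostly grass.",
    "Their stripes are unique to each individual.",
]
query, document = inverse_cloze(passage)
print(query)
print(document)
```

Because such pairs can be generated from any corpus at scale, the pre-training set is effectively unlimited, which is what makes the choice of task, rather than labeled data volume, the key ingredient.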
arXiv Detail & Related papers (2020-02-10T16:44:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.