PreQRAG -- Classify and Rewrite for Enhanced RAG
- URL: http://arxiv.org/abs/2506.17493v1
- Date: Fri, 20 Jun 2025 22:02:05 GMT
- Title: PreQRAG -- Classify and Rewrite for Enhanced RAG
- Authors: Damian Martinez, Catalina Riano, Hui Fang
- Abstract summary: We introduce PreQRAG, a Retrieval Augmented Generation architecture designed to improve retrieval and generation quality. PreQRAG incorporates a pipeline that first classifies each input question as either single-document or multi-document type. For single-document questions, we employ question rewriting techniques to improve retrieval precision and generation relevance. For multi-document questions, we decompose complex queries into focused sub-questions that can be processed more effectively.
- Score: 1.652907918484303
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper presents the submission of the UDInfo team to the SIGIR 2025 LiveRAG Challenge. We introduce PreQRAG, a Retrieval Augmented Generation (RAG) architecture designed to improve retrieval and generation quality through targeted question preprocessing. PreQRAG incorporates a pipeline that first classifies each input question as either single-document or multi-document type. For single-document questions, we employ question rewriting techniques to improve retrieval precision and generation relevance. For multi-document questions, we decompose complex queries into focused sub-questions that can be processed more effectively by downstream components. This classification and rewriting strategy improves RAG performance. Experimental evaluation on the LiveRAG Challenge dataset demonstrates the effectiveness of our question-type-aware architecture, with PreQRAG achieving preliminary second place in Session 2 of the LiveRAG Challenge.
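To make the preprocessing pipeline concrete, the following is a minimal sketch of a PreQRAG-style classify-then-rewrite flow. The prompts, the `call_llm` helper, and the two-way label set are illustrative assumptions, not the authors' exact implementation.

```python
# Illustrative sketch of a PreQRAG-style preprocessing pipeline.
# `call_llm` is a placeholder for any chat-completion client; the prompts and
# labels below are assumptions, not the authors' exact implementation.
from typing import Callable, List

def classify_question(question: str, call_llm: Callable[[str], str]) -> str:
    """Label a question as 'single' (answerable from one document) or 'multi'."""
    prompt = (
        "Classify the following question as 'single' if it can be answered from "
        "one document, or 'multi' if it requires combining several documents.\n"
        f"Question: {question}\nLabel:"
    )
    label = call_llm(prompt).strip().lower()
    return "multi" if "multi" in label else "single"

def rewrite_single(question: str, call_llm: Callable[[str], str]) -> List[str]:
    """Rewrite a single-document question to sharpen retrieval precision."""
    prompt = f"Rewrite this question to be specific and self-contained:\n{question}"
    return [call_llm(prompt).strip()]

def decompose_multi(question: str, call_llm: Callable[[str], str]) -> List[str]:
    """Split a multi-document question into focused sub-questions."""
    prompt = (
        "Break this question into short, independent sub-questions, one per line:\n"
        f"{question}"
    )
    return [q.strip() for q in call_llm(prompt).splitlines() if q.strip()]

def preprocess(question: str, call_llm: Callable[[str], str]) -> List[str]:
    """Return the query (or queries) that downstream retrieval should run."""
    if classify_question(question, call_llm) == "multi":
        return decompose_multi(question, call_llm)
    return rewrite_single(question, call_llm)
```

The returned list of rewritten queries or sub-questions would then be passed to the retrieval and generation stages of the RAG pipeline.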
Related papers
- UiS-IAI@LiveRAG: Retrieval-Augmented Information Nugget-Based Generation of Responses [11.798121559820792]
Retrieval-augmented generation (RAG) faces challenges related to factual correctness, source attribution, and response completeness. We propose a modular pipeline that operates on information nuggets: minimal, atomic units of relevant information extracted from retrieved documents.
arXiv Detail & Related papers (2025-06-27T13:29:25Z)
- RMIT-ADM+S at the SIGIR 2025 LiveRAG Challenge [4.364909807482374]
This paper presents the RMIT-ADM+S winning system in the SIGIR 2025 LiveRAG Challenge. Our Generation-Retrieval-Augmented Generation (G-RAG) approach generates a hypothetical answer that is used during the retrieval phase, alongside the original question (see the sketch after this list). G-RAG also incorporates a pointwise large language model (LLM)-based re-ranking step prior to final answer generation.
arXiv Detail & Related papers (2025-06-17T13:41:12Z)
- Accelerating Adaptive Retrieval Augmented Generation via Instruction-Driven Representation Reduction of Retrieval Overlaps [16.84310001807895]
This paper introduces a model-agnostic approach that can be applied to A-RAG methods. Specifically, we use cache access and parallel generation to speed up the prefilling and decoding stages, respectively.
arXiv Detail & Related papers (2025-05-19T05:39:38Z)
- mmRAG: A Modular Benchmark for Retrieval-Augmented Generation over Text, Tables, and Knowledge Graphs [11.861763118322136]
We introduce mmRAG, a modular benchmark for evaluating multi-modal RAG systems. Our benchmark integrates queries from six diverse question-answering datasets spanning text, tables, and knowledge graphs. We follow standard information retrieval procedures to annotate document relevance and derive dataset relevance.
arXiv Detail & Related papers (2025-05-16T12:31:29Z)
- Typed-RAG: Type-Aware Decomposition of Non-Factoid Questions for Retrieval-Augmented Generation [1.275764996205493]
We propose Typed-RAG, a framework for type-aware decomposition of non-factoid questions (NFQs). Specifically, Typed-RAG first classifies an NFQ into a predefined type. It then decomposes the question into focused sub-queries, each addressing a single aspect. By combining the results of these sub-queries, Typed-RAG produces more informative and contextually aligned responses.
arXiv Detail & Related papers (2025-03-20T06:04:12Z)
- Chain-of-Retrieval Augmented Generation [72.06205327186069]
This paper introduces an approach for training o1-like RAG models that retrieve and reason over relevant information step by step before generating the final answer. Our proposed method, CoRAG, allows the model to dynamically reformulate the query based on the evolving state.
arXiv Detail & Related papers (2025-01-24T09:12:52Z)
- Do RAG Systems Cover What Matters? Evaluating and Optimizing Responses with Sub-Question Coverage [74.70255719194819]
We introduce a novel framework based on sub-question coverage, which measures how well a RAG system addresses different facets of a question.
We use this framework to evaluate three commercial generative answer engines: You.com, Perplexity AI, and Bing Chat.
We find that while all answer engines cover core sub-questions more often than background or follow-up ones, they still miss around 50% of core sub-questions.
arXiv Detail & Related papers (2024-10-20T22:59:34Z)
- MemoRAG: Boosting Long Context Processing with Global Memory-Enhanced Retrieval Augmentation [60.04380907045708]
Retrieval-Augmented Generation (RAG) is considered a promising strategy to address this problem. We propose MemoRAG, a novel RAG framework empowered by global memory-augmented retrieval. MemoRAG achieves superior performance across a variety of long-context evaluation tasks.
arXiv Detail & Related papers (2024-09-09T13:20:31Z)
- RaFe: Ranking Feedback Improves Query Rewriting for RAG [83.24385658573198]
We propose a framework for training query rewriting models free of annotations.
By leveraging a publicly available reranker, our framework provides feedback that aligns well with the rewriting objectives.
arXiv Detail & Related papers (2024-05-23T11:00:19Z)
- DuetRAG: Collaborative Retrieval-Augmented Generation [57.440772556318926]
We propose DuetRAG, a Collaborative Retrieval-Augmented Generation framework.
Its bootstrapping philosophy is to simultaneously integrate domain fine-tuning and RAG models.
arXiv Detail & Related papers (2024-05-12T09:48:28Z)
- RQ-RAG: Learning to Refine Queries for Retrieval Augmented Generation [42.82192656794179]
Large Language Models (LLMs) exhibit remarkable capabilities but are prone to generating inaccurate or hallucinatory responses.
This limitation stems from their reliance on vast pretraining datasets, making them susceptible to errors in unseen scenarios.
Retrieval-Augmented Generation (RAG) addresses this by incorporating external, relevant documents into the response generation process.
arXiv Detail & Related papers (2024-03-31T08:58:54Z)
- CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models [49.16989035566899]
Retrieval-Augmented Generation (RAG) is a technique that enhances the capabilities of large language models (LLMs) by incorporating external knowledge sources.
This paper constructs a large-scale and more comprehensive benchmark, and evaluates all the components of RAG systems in various RAG application scenarios.
arXiv Detail & Related papers (2024-01-30T14:25:32Z)
- EQG-RACE: Examination-Type Question Generation [21.17100754955864]
We propose an innovative Examination-type Question Generation approach (EQG-RACE) to generate exam-like questions based on a dataset extracted from RACE.
Two main strategies are employed in EQG-RACE for dealing with discrete answer information and reasoning among long contexts.
Experimental results show state-of-the-art performance for EQG-RACE, which is clearly superior to the baselines.
arXiv Detail & Related papers (2020-12-11T03:52:17Z)
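To illustrate the hypothetical-answer retrieval step referenced in the RMIT-ADM+S entry above, here is a minimal sketch of a G-RAG/HyDE-style query construction. The `generate` and `embed` callables, the embedding-averaging strategy, and the cosine-similarity scoring are assumptions for illustration, not the RMIT-ADM+S implementation.

```python
# Minimal sketch of hypothetical-answer retrieval (G-RAG / HyDE-style).
# `generate` and `embed` stand in for any LLM and embedding model; the
# embedding-averaging and cosine-similarity scoring are assumptions,
# not the authors' exact setup.
from typing import Callable, List, Sequence, Tuple
import numpy as np

def retrieve_with_hypothetical_answer(
    question: str,
    corpus: Sequence[str],
    corpus_vecs: np.ndarray,                 # precomputed document embeddings, shape (N, d)
    generate: Callable[[str], str],          # LLM completion function
    embed: Callable[[str], np.ndarray],      # text -> embedding vector, shape (d,)
    k: int = 5,
) -> List[Tuple[str, float]]:
    """Retrieve documents using both the question and a hypothetical answer."""
    hypothetical = generate(f"Write a short plausible answer to: {question}")
    # Combine the question and hypothetical-answer embeddings into one query vector.
    query_vec = (embed(question) + embed(hypothetical)) / 2.0
    query_vec /= np.linalg.norm(query_vec)
    doc_norms = corpus_vecs / np.linalg.norm(corpus_vecs, axis=1, keepdims=True)
    scores = doc_norms @ query_vec           # cosine similarity to each document
    top = np.argsort(-scores)[:k]
    return [(corpus[i], float(scores[i])) for i in top]
```

The retrieved documents could then feed a pointwise LLM re-ranking step before final answer generation, as described in that entry.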