Towards Understanding Retrieval Accuracy and Prompt Quality in RAG Systems
- URL: http://arxiv.org/abs/2411.19463v1
- Date: Fri, 29 Nov 2024 04:25:31 GMT
- Title: Towards Understanding Retrieval Accuracy and Prompt Quality in RAG Systems
- Authors: Shengming Zhao, Yuheng Huang, Jiayang Song, Zhijie Wang, Chengcheng Wan, Lei Ma
- Abstract summary: We conduct an early exploratory study toward a better understanding of the mechanism of RAG systems.
We focus on four design factors: retrieval document type, retrieval recall, document selection, and prompt techniques.
Our study uncovers how each factor impacts system correctness and confidence, providing valuable insights for developing an accurate and reliable RAG system.
- Abstract: Retrieval-Augmented Generation (RAG) is a pivotal technique for enhancing the capability of large language models (LLMs) and has demonstrated promising efficacy across a diverse spectrum of tasks. While LLM-driven RAG systems show superior performance, they face unique challenges in stability and reliability. Their complexity hinders developers' efforts to design, maintain, and optimize effective RAG systems. Therefore, it is crucial to understand how RAG's performance is impacted by its design. In this work, we conduct an early exploratory study toward a better understanding of the mechanism of RAG systems, covering three code datasets, three QA datasets, and two LLMs. We focus on four design factors: retrieval document type, retrieval recall, document selection, and prompt techniques. Our study uncovers how each factor impacts system correctness and confidence, providing valuable insights for developing an accurate and reliable RAG system. Based on these findings, we present nine actionable guidelines for detecting defects and optimizing the performance of RAG systems. We hope our early exploration can inspire further advancements in engineering, improving and maintaining LLM-driven intelligent software systems for greater efficiency and reliability.
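To make the four design factors concrete, here is a minimal sketch of a RAG pipeline with each factor exposed as a knob, together with the standard retrieval-recall computation. The `search` and `generate` helpers, the `RAGConfig` field names, and the document attributes (`id`, `text`) are hypothetical illustrations, not the authors' experimental harness:

```python
from dataclasses import dataclass

@dataclass
class RAGConfig:
    doc_type: str      # factor 1: retrieval document type (e.g. "code", "qa")
    top_k: int         # factor 3: how many retrieved documents to select
    prompt_style: str  # factor 4: prompt technique (e.g. "zero-shot", "cot")

def retrieval_recall(retrieved_ids, relevant_ids):
    """Factor 2: fraction of the ground-truth relevant documents
    that appear among the retrieved ones."""
    if not relevant_ids:
        return 1.0
    hits = len(set(retrieved_ids) & set(relevant_ids))
    return hits / len(relevant_ids)

def run_rag(query, corpus, cfg, relevant_ids, search, generate):
    # `search` and `generate` are hypothetical stand-ins for a retriever
    # and an LLM call; any embedding-based retriever / chat API fits here.
    candidates = search(query, corpus[cfg.doc_type])   # retrieve
    selected = candidates[: cfg.top_k]                 # document selection
    recall = retrieval_recall([d.id for d in selected], relevant_ids)

    if cfg.prompt_style == "cot":
        instruction = "Think step by step, then answer."
    else:
        instruction = "Answer the question."
    context = "\n\n".join(d.text for d in selected)
    prompt = f"{instruction}\n\nContext:\n{context}\n\nQuestion: {query}"

    answer = generate(prompt)                          # LLM call
    return answer, recall
```

Each knob corresponds to one of the factors studied in the paper, so sweeping a factor amounts to varying a single field of `RAGConfig` (or the contents of `corpus`) while holding the rest fixed.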
Related papers
- Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey [92.36487127683053]
Retrieval-Augmented Generation (RAG) is an advanced technique designed to address the challenges of Artificial Intelligence-Generated Content (AIGC).
RAG provides reliable and up-to-date external knowledge, reduces hallucinations, and ensures relevant context across a wide range of tasks.
Despite RAG's success and potential, recent studies have shown that the RAG paradigm also introduces new risks, including privacy concerns, adversarial attacks, and accountability issues.
arXiv Detail & Related papers (2025-02-08T06:50:47Z)
- RbFT: Robust Fine-tuning for Retrieval-Augmented Generation against Retrieval Defects [12.5122702720856]
We propose Robust Fine-Tuning (RbFT) to enhance the resilience of large language models against retrieval defects.
Experimental results demonstrate that RbFT significantly improves the robustness of RAG systems across diverse retrieval conditions.
arXiv Detail & Related papers (2025-01-30T14:15:09Z)
- Enhancing Retrieval-Augmented Generation: A Study of Best Practices [16.246719783032436]
We develop advanced RAG system designs that incorporate query expansion, new retrieval strategies, and a novel Contrastive In-Context Learning RAG.
Our study systematically investigates key factors, including language model size, prompt design, document chunk size, knowledge base size, retrieval stride, query expansion techniques, and a Focus Mode that retrieves relevant context at the sentence level (see the sketch following this list).
Our findings offer actionable insights for developing RAG systems, striking a balance between contextual richness and retrieval-generation efficiency.
arXiv Detail & Related papers (2025-01-13T15:07:55Z)
- Unanswerability Evaluation for Retrieval Augmented Generation [74.3022365715597]
UAEval4RAG is a framework designed to evaluate whether RAG systems can handle unanswerable queries effectively.
We define a taxonomy with six unanswerable categories, and UAEval4RAG automatically synthesizes diverse and challenging queries.
arXiv Detail & Related papers (2024-12-16T19:11:55Z)
- Towards Knowledge Checking in Retrieval-augmented Generation: A Representation Perspective [48.40768048080928]
Retrieval-Augmented Generation (RAG) systems have shown promise in enhancing the performance of Large Language Models (LLMs).
This work aims to provide a systematic study on knowledge checking in RAG systems.
arXiv Detail & Related papers (2024-11-21T20:39:13Z)
- Unveiling and Consulting Core Experts in Retrieval-Augmented MoE-based LLMs [64.9693406713216]
Internal mechanisms that contribute to the effectiveness of RAG systems remain underexplored.
Our experiments reveal that several core groups of experts are primarily responsible for RAG-related behaviors.
We propose several strategies to enhance RAG's efficiency and effectiveness through expert activation.
arXiv Detail & Related papers (2024-10-20T16:08:54Z)
- Trustworthiness in Retrieval-Augmented Generation Systems: A Survey [59.26328612791924]
Retrieval-Augmented Generation (RAG) has quickly grown into a pivotal paradigm in the development of Large Language Models (LLMs).
We propose a unified framework that assesses the trustworthiness of RAG systems across six key dimensions: factuality, robustness, fairness, transparency, accountability, and privacy.
arXiv Detail & Related papers (2024-09-16T09:06:44Z)
- VERA: Validation and Evaluation of Retrieval-Augmented Systems [5.709401805125129]
VERA is a framework designed to enhance the transparency and reliability of outputs from large language models (LLMs).
We show how VERA can strengthen decision-making processes and trust in AI applications.
arXiv Detail & Related papers (2024-08-16T21:59:59Z)
- RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation [8.377398103067508]
We introduce RAG Foundry, an open-source framework for augmenting large language models for RAG use cases.
RAG Foundry integrates data creation, training, inference and evaluation into a single workflow.
We demonstrate the framework's effectiveness by augmenting and fine-tuning Llama-3 and Phi-3 models with diverse RAG configurations.
arXiv Detail & Related papers (2024-08-05T15:16:24Z)
- RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework [69.4501863547618]
This paper introduces RAGEval, a framework designed to assess RAG systems across diverse scenarios.
With a focus on factual accuracy, we propose three novel metrics: Completeness, Hallucination, and Irrelevance (see the sketch following this list).
Experimental results show that RAGEval outperforms zero-shot and one-shot methods in terms of clarity, safety, conformity, and richness of generated samples.
arXiv Detail & Related papers (2024-08-02T13:35:11Z)
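Two techniques from the list above are concrete enough to sketch. First, the sentence-level "Focus Mode" retrieval from the best-practices study: rather than returning whole chunks, the retriever scores individual sentences against the query and keeps only the best ones as context. This is an illustrative reimplementation with a naive period-based sentence splitter and a hypothetical `embed` function, not the paper's code:

```python
import numpy as np

def focus_mode_context(query, documents, embed, top_n=5):
    """Illustrative sentence-level retrieval: split documents into
    sentences, rank each sentence against the query, and keep the
    top_n as context. `embed` is a hypothetical text -> vector model."""
    # Naive splitter for illustration; a real system would use a
    # proper sentence tokenizer.
    sentences = [s.strip() for doc in documents
                 for s in doc.split(".") if s.strip()]
    q = embed(query)

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    scored = sorted(sentences,
                    key=lambda s: cosine(embed(s), q),
                    reverse=True)
    return " ".join(scored[:top_n])
```

Any off-the-shelf sentence embedder can stand in for `embed`; the point is only that relevance is scored per sentence rather than per chunk, trading retrieval cost for tighter context.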
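Second, one plausible reading of RAGEval's three factual-accuracy metrics treats them as ratios over ground-truth key points: the fraction an answer covers (Completeness), contradicts (Hallucination), or leaves untouched (Irrelevance). The sketch below assumes each key point has already been labeled for a given answer (in practice by an LLM judge); the labeling step itself is omitted:

```python
def rageval_style_metrics(labels):
    """labels: one of "covered", "contradicted", "missing" per
    ground-truth key point, as judged for a generated answer.
    A plausible reading of RAGEval's metrics, not its exact code."""
    total = len(labels)
    if total == 0:
        return {"completeness": 0.0, "hallucination": 0.0, "irrelevance": 0.0}
    return {
        "completeness": labels.count("covered") / total,
        "hallucination": labels.count("contradicted") / total,
        "irrelevance": labels.count("missing") / total,
    }

# Example: 3 of 5 key points covered, 1 contradicted, 1 missing.
print(rageval_style_metrics(
    ["covered", "covered", "covered", "contradicted", "missing"]))
# {'completeness': 0.6, 'hallucination': 0.2, 'irrelevance': 0.2}
```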
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.