A Systematic Review of Key Retrieval-Augmented Generation (RAG) Systems: Progress, Gaps, and Future Directions
- URL: http://arxiv.org/abs/2507.18910v1
- Date: Fri, 25 Jul 2025 03:05:46 GMT
- Title: A Systematic Review of Key Retrieval-Augmented Generation (RAG) Systems: Progress, Gaps, and Future Directions
- Authors: Agada Joseph Oche, Ademola Glory Folashade, Tirthankar Ghosal, Arpan Biswas,
- Abstract summary: Retrieval-Augmented Generation (RAG) is a major advancement in natural language processing (NLP)<n>RAG combines large language models (LLMs) with information retrieval systems to enhance factual grounding, accuracy, and contextual relevance.<n>This paper presents a systematic review of RAG, tracing its evolution from early developments in open domain question answering to recent state-of-the-art implementations.
- Score: 1.4931265249949528
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Retrieval-Augmented Generation (RAG) represents a major advancement in natural language processing (NLP), combining large language models (LLMs) with information retrieval systems to enhance factual grounding, accuracy, and contextual relevance. This paper presents a comprehensive systematic review of RAG, tracing its evolution from early developments in open domain question answering to recent state-of-the-art implementations across diverse applications. The review begins by outlining the motivations behind RAG, particularly its ability to mitigate hallucinations and outdated knowledge in parametric models. Core technical components-retrieval mechanisms, sequence-to-sequence generation models, and fusion strategies are examined in detail. A year-by-year analysis highlights key milestones and research trends, providing insight into RAG's rapid growth. The paper further explores the deployment of RAG in enterprise systems, addressing practical challenges related to retrieval of proprietary data, security, and scalability. A comparative evaluation of RAG implementations is conducted, benchmarking performance on retrieval accuracy, generation fluency, latency, and computational efficiency. Persistent challenges such as retrieval quality, privacy concerns, and integration overhead are critically assessed. Finally, the review highlights emerging solutions, including hybrid retrieval approaches, privacy-preserving techniques, optimized fusion strategies, and agentic RAG architectures. These innovations point toward a future of more reliable, efficient, and context-aware knowledge-intensive NLP systems.
Related papers
- Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs [69.10441885629787]
Retrieval-Augmented Generation (RAG) lifts the factuality of Large Language Models (LLMs) by injecting external knowledge.<n>It falls short on problems that demand multi-step inference; conversely, purely reasoning-oriented approaches often hallucinate or mis-ground facts.<n>This survey synthesizes both strands under a unified reasoning-retrieval perspective.
arXiv Detail & Related papers (2025-07-13T03:29:41Z) - Retrieval-Augmented Generation: A Comprehensive Survey of Architectures, Enhancements, and Robustness Frontiers [0.0]
Retrieval-Augmented Generation (RAG) has emerged as a powerful paradigm to enhance large language models.<n>RAG introduces new challenges in retrieval quality, grounding fidelity, pipeline efficiency, and robustness against noisy or adversarial inputs.<n>This survey aims to consolidate current knowledge in RAG research and serve as a foundation for the next generation of retrieval-augmented language modeling systems.
arXiv Detail & Related papers (2025-05-28T22:57:04Z) - Retrieval Augmented Generation and Understanding in Vision: A Survey and New Outlook [85.43403500874889]
Retrieval-augmented generation (RAG) has emerged as a pivotal technique in artificial intelligence (AI)<n>Recent advancements in RAG for embodied AI, with a particular focus on applications in planning, task execution, multimodal perception, interaction, and specialized domains.
arXiv Detail & Related papers (2025-03-23T10:33:28Z) - A Survey on Knowledge-Oriented Retrieval-Augmented Generation [45.65542434522205]
Retrieval-Augmented Generation (RAG) has gained significant attention in recent years.<n>RAG combines large-scale retrieval systems with generative models.<n>We discuss the key characteristics of RAG, such as its ability to augment generative models with dynamic external knowledge.
arXiv Detail & Related papers (2025-03-11T01:59:35Z) - Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey [92.36487127683053]
Retrieval-Augmented Generation (RAG) is an advanced technique designed to address the challenges of Artificial Intelligence-Generated Content (AIGC)<n>RAG provides reliable and up-to-date external knowledge, reduces hallucinations, and ensures relevant context across a wide range of tasks.<n>Despite RAG's success and potential, recent studies have shown that the RAG paradigm also introduces new risks, including privacy concerns, adversarial attacks, and accountability issues.
arXiv Detail & Related papers (2025-02-08T06:50:47Z) - A Comprehensive Survey of Retrieval-Augmented Generation (RAG): Evolution, Current Landscape and Future Directions [0.0]
RAG combines retrieval mechanisms with generative language models to enhance the accuracy of outputs.
Recent research breakthroughs are discussed, highlighting novel methods for improving retrieval efficiency.
Future research directions are proposed, focusing on improving the robustness of RAG models.
arXiv Detail & Related papers (2024-10-03T22:29:47Z) - Trustworthiness in Retrieval-Augmented Generation Systems: A Survey [59.26328612791924]
Retrieval-Augmented Generation (RAG) has quickly grown into a pivotal paradigm in the development of Large Language Models (LLMs)
We propose a unified framework that assesses the trustworthiness of RAG systems across six key dimensions: factuality, robustness, fairness, transparency, accountability, and privacy.
arXiv Detail & Related papers (2024-09-16T09:06:44Z) - RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework [66.93260816493553]
This paper introduces RAGEval, a framework designed to assess RAG systems across diverse scenarios.<n>With a focus on factual accuracy, we propose three novel metrics: Completeness, Hallucination, and Irrelevance.<n> Experimental results show that RAGEval outperforms zero-shot and one-shot methods in terms of clarity, safety, conformity, and richness of generated samples.
arXiv Detail & Related papers (2024-08-02T13:35:11Z) - RAGGED: Towards Informed Design of Scalable and Stable RAG Systems [51.171355532527365]
Retrieval-augmented generation (RAG) enhances language models by integrating external knowledge.<n>RAGGED is a framework for systematically evaluating RAG systems.
arXiv Detail & Related papers (2024-03-14T02:26:31Z) - Retrieval-Augmented Generation for AI-Generated Content: A Survey [38.50754568320154]
Retrieval-Augmented Generation (RAG) has emerged as a paradigm to address such challenges.
RAG introduces the information retrieval process, which enhances the generation process by retrieving relevant objects from available data stores.
In this paper, we comprehensively review existing efforts that integrate RAG technique into AIGC scenarios.
arXiv Detail & Related papers (2024-02-29T18:59:01Z) - Retrieval-Augmented Generation for Large Language Models: A Survey [17.82361213043507]
Large Language Models (LLMs) showcase impressive capabilities but encounter challenges like hallucination.
Retrieval-Augmented Generation (RAG) has emerged as a promising solution by incorporating knowledge from external databases.
arXiv Detail & Related papers (2023-12-18T07:47:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.