RAGOps: Operating and Managing Retrieval-Augmented Generation Pipelines
- URL: http://arxiv.org/abs/2506.03401v1
- Date: Tue, 03 Jun 2025 21:22:23 GMT
- Title: RAGOps: Operating and Managing Retrieval-Augmented Generation Pipelines
- Authors: Xiwei Xu, Hans Weytjens, Dawen Zhang, Qinghua Lu, Ingo Weber, Liming Zhu
- Abstract summary: Recent studies show that 60% of LLM-based compound systems in enterprise environments leverage some form of retrieval-augmented generation (RAG). RAGOps extends LLMOps by incorporating a strong focus on data management to address the continuous changes in external data sources.
- Score: 14.352109014262991
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies show that 60% of LLM-based compound systems in enterprise environments leverage some form of retrieval-augmented generation (RAG), which enhances the relevance and accuracy of LLM (or other genAI) outputs by retrieving relevant information from external data sources. LLMOps involves the practices and techniques for managing the lifecycle and operations of LLM compound systems in production environments. It supports enhancing LLM systems through continuous operations and feedback evaluation. RAGOps extends LLMOps by incorporating a strong focus on data management to address the continuous changes in external data sources. This necessitates automated methods for evaluating and testing data operations, enhancing retrieval relevance and generation quality. In this paper, we (1) characterize the generic architecture of RAG applications based on the 4+1 model view for describing software architectures, (2) outline the lifecycle of RAG systems, which integrates the management lifecycles of both the LLM and the data, (3) define the key design considerations of RAGOps across different stages of the RAG lifecycle and quality trade-off analyses, (4) highlight the overarching research challenges around RAGOps, and (5) present two use cases of RAG applications and the corresponding RAGOps considerations.
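The retrieve-then-generate loop the abstract describes can be illustrated with a minimal sketch. This is not from the paper: the keyword-overlap scorer stands in for a real embedding-based retriever, and the prompt format is an arbitrary assumption; a production RAGOps pipeline would add the data-management and evaluation stages the paper focuses on.

```python
def score(query: str, doc: str) -> float:
    """Naive relevance score: fraction of query terms appearing in the document.
    (Placeholder for an embedding-similarity retriever.)"""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / max(len(q_terms), 1)

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the top-k documents by descending relevance score."""
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, contexts: list[str]) -> str:
    """Augment the user query with retrieved context before the LLM call."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return f"Context:\n{context_block}\n\nQuestion: {query}\nAnswer:"

# Toy external data source (in RAGOps, this store changes continuously
# and must be re-indexed and re-evaluated as part of operations).
corpus = [
    "RAGOps extends LLMOps with continuous data management.",
    "Vector databases store document embeddings for retrieval.",
    "Fine-tuning updates model weights on task-specific data.",
]

query = "How does RAGOps extend LLMOps?"
contexts = retrieve(query, corpus)
prompt = build_prompt(query, contexts)  # this prompt would be sent to the LLM
```

The sketch shows why data operations matter: retrieval quality (here, `score`) directly bounds generation quality, so changes in the corpus must trigger re-evaluation of the whole pipeline.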
Related papers
- A Survey of AIOps in the Era of Large Language Models [60.59720351854515]
We analyzed 183 research papers published between January 2020 and December 2024 to answer four key research questions (RQs). We discuss the state-of-the-art advancements and trends, identify gaps in existing research, and propose promising directions for future exploration.
arXiv Detail & Related papers (2025-06-23T02:40:16Z) - mRAG: Elucidating the Design Space of Multi-modal Retrieval-Augmented Generation [5.647319807077936]
Large Vision-Language Models (LVLMs) have made remarkable strides in multimodal tasks such as visual question answering, visual grounding, and complex reasoning. Retrieval-Augmented Generation (RAG) offers a practical solution to mitigate these challenges by allowing the LVLMs to access large-scale knowledge databases via retrieval mechanisms.
arXiv Detail & Related papers (2025-05-29T23:32:03Z) - A Survey of LLM $\times$ DATA [71.96808497574658]
The integration of large language model (LLM) and data management (DATA4LLM) is rapidly redefining both domains. On the one hand, DATA4LLM feeds LLMs with high quality, diversity, and timeliness of data required for stages like pre-training, post-training, retrieval-augmented generation, and agentic workflows. On the other hand, LLMs are emerging as general-purpose engines for data management.
arXiv Detail & Related papers (2025-05-24T01:57:12Z) - Direct Retrieval-augmented Optimization: Synergizing Knowledge Selection and Language Models [83.8639566087953]
We propose a direct retrieval-augmented optimization framework, named DRO, that enables end-to-end training of two key components. DRO alternates between two phases: (i) document permutation estimation and (ii) re-weighted maximization, progressively improving RAG components. Our theoretical analysis reveals that DRO is analogous to policy-gradient methods in reinforcement learning.
arXiv Detail & Related papers (2025-05-05T23:54:53Z) - Cross-Format Retrieval-Augmented Generation in XR with LLMs for Context-Aware Maintenance Assistance [6.16808916207942]
This paper presents a detailed evaluation of a Retrieval-Augmented Generation system that integrates large language models (LLMs). We assess the performance of eight LLMs, emphasizing key metrics such as response speed and accuracy, which were quantified using BLEU and METEOR scores. The results validate the system's ability to deliver timely and accurate responses, highlighting the potential of RAG frameworks to optimize maintenance operations.
arXiv Detail & Related papers (2025-02-21T17:19:39Z) - LaRA: Benchmarking Retrieval-Augmented Generation and Long-Context LLMs -- No Silver Bullet for LC or RAG Routing [70.35888047551643]
We present LaRA, a novel benchmark specifically designed to rigorously compare RAG and LC LLMs. LaRA encompasses 2326 test cases across four practical QA task categories and three types of naturally occurring long texts. We find that the optimal choice between RAG and LC depends on a complex interplay of factors, including the model's parameter size, long-text capabilities, context length, task type, and the characteristics of the retrieved chunks.
arXiv Detail & Related papers (2025-02-14T08:04:22Z) - OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain [62.89809156574998]
We introduce an omnidirectional and automatic RAG benchmark, OmniEval, in the financial domain. Our benchmark is characterized by its multi-dimensional evaluation framework. Our experiments demonstrate the comprehensiveness of OmniEval, which includes extensive test datasets.
arXiv Detail & Related papers (2024-12-17T15:38:42Z) - RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework [66.93260816493553]
This paper introduces RAGEval, a framework designed to assess RAG systems across diverse scenarios. With a focus on factual accuracy, we propose three novel metrics: Completeness, Hallucination, and Irrelevance. Experimental results show that RAGEval outperforms zero-shot and one-shot methods in terms of clarity, safety, conformity, and richness of generated samples.
arXiv Detail & Related papers (2024-08-02T13:35:11Z) - Evaluating the Efficacy of Open-Source LLMs in Enterprise-Specific RAG Systems: A Comparative Study of Performance and Scalability [0.0]
This paper presents an analysis of open-source large language models (LLMs) and their application in Retrieval-Augmented Generation (RAG) tasks.
Our findings indicate that open-source LLMs, combined with effective embedding techniques, can significantly improve the accuracy and efficiency of RAG systems.
arXiv Detail & Related papers (2024-06-17T11:22:25Z) - Retrieval-Augmented Generation-based Relation Extraction [0.0]
Retrieval-Augmented Generation-based Relation Extraction (RAG4RE) is proposed to enhance the performance of relation extraction tasks.
This work evaluated the effectiveness of our RAG4RE approach utilizing different Large Language Models (LLMs). The results of our study demonstrate that our RAG4RE approach surpasses the performance of traditional RE approaches.
arXiv Detail & Related papers (2024-04-20T14:42:43Z) - Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases [9.478012553728538]
We propose an end-to-end system design towards utilizing Retrieval Augmented Generation (RAG) to improve the factual accuracy of Large Language Models (LLMs).
Our system integrates the RAG pipeline with upstream dataset processing and downstream performance evaluation.
Our experiments demonstrate the system's effectiveness in generating more accurate answers to domain-specific and time-sensitive inquiries.
arXiv Detail & Related papers (2024-03-15T16:30:14Z)