Related papers: RAGExplorer: A Visual Analytics System for the Comparative Diagnosis of RAG Systems

RAGExplorer: A Visual Analytics System for the Comparative Diagnosis of RAG Systems

URL: http://arxiv.org/abs/2601.12991v1
Date: Mon, 19 Jan 2026 12:09:56 GMT
Title: RAGExplorer: A Visual Analytics System for the Comparative Diagnosis of RAG Systems
Authors: Haoyu Tian, Yingchaojie Feng, Zhen Wen, Haoxuan Li, Minfeng Zhu, Wei Chen,
Abstract summary: RAGExplorer is a visual analytics system for the systematic comparison and diagnosis of RAG configurations.<n>We demonstrate the effectiveness of RAGExplorer through detailed case studies and user studies.
Score: 12.726326169727733
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The advent of Retrieval-Augmented Generation (RAG) has significantly enhanced the ability of Large Language Models (LLMs) to produce factually accurate and up-to-date responses. However, the performance of a RAG system is not determined by a single component but emerges from a complex interplay of modular choices, such as embedding models and retrieval algorithms. This creates a vast and often opaque configuration space, making it challenging for developers to understand performance trade-offs and identify optimal designs. To address this challenge, we present RAGExplorer, a visual analytics system for the systematic comparison and diagnosis of RAG configurations. RAGExplorer guides users through a seamless macro-to-micro analytical workflow. Initially, it empowers developers to survey the performance landscape across numerous configurations, allowing for a high-level understanding of which design choices are most effective. For a deeper analysis, the system enables users to drill down into individual failure cases, investigate how differences in retrieved information contribute to errors, and interactively test hypotheses by manipulating the provided context to observe the resulting impact on the generated answer. We demonstrate the effectiveness of RAGExplorer through detailed case studies and user studies, validating its ability to empower developers in navigating the complex RAG design space. Our code and user guide are publicly available at https://github.com/Thymezzz/RAGExplorer.

Related papers

Multi-hop Reasoning via Early Knowledge Alignment [68.28168992785896]
Early Knowledge Alignment (EKA) aims to align Large Language Models with contextually relevant retrieved knowledge.<n>EKA significantly improves retrieval precision, reduces cascading errors, and enhances both performance and efficiency.<n>EKA proves effective as a versatile, training-free inference strategy that scales seamlessly to large models.
arXiv Detail & Related papers (2025-12-23T08:14:44Z)
MRAG-Suite: A Diagnostic Evaluation Platform for Visual Retrieval-Augmented Generation [1.9036571490366498]
We propose MRAG-Suite, a diagnostic evaluation platform integrating diverse multimodal benchmarks.<n>We introduce difficulty-based and ambiguity-aware filtering strategies, alongside MM-RAGChecker, a claim-level diagnostic tool.
arXiv Detail & Related papers (2025-09-29T03:55:28Z)
Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs [69.10441885629787]
Retrieval-Augmented Generation (RAG) lifts the factuality of Large Language Models (LLMs) by injecting external knowledge.<n>It falls short on problems that demand multi-step inference; conversely, purely reasoning-oriented approaches often hallucinate or mis-ground facts.<n>This survey synthesizes both strands under a unified reasoning-retrieval perspective.
arXiv Detail & Related papers (2025-07-13T03:29:41Z)
Enhancing Retrieval-Augmented Generation: A Study of Best Practices [16.246719783032436]
We develop advanced RAG system designs that incorporate query expansion, various novel retrieval strategies, and a novel Contrastive In-Context Learning RAG.<n>Our study systematically investigates key factors, including language model size, prompt design, document chunk size, knowledge base size, retrieval stride, query expansion techniques, and Focus Mode retrieving relevant context at sentence-level.<n>Our findings offer actionable insights for developing RAG systems, striking a balance between contextual richness and retrieval-generation efficiency.
arXiv Detail & Related papers (2025-01-13T15:07:55Z)
XRAG: eXamining the Core -- Benchmarking Foundational Components in Advanced Retrieval-Augmented Generation [36.84847781022757]
Retrieval-augmented generation (RAG) synergizes the retrieval of pertinent data with the generative capabilities of Large Language Models (LLMs)<n>We introduce XRAG, an open-source, modular that facilitates exhaustive evaluation of the performance of foundational components of advanced RAG modules.
arXiv Detail & Related papers (2024-12-20T03:37:07Z)
Unanswerability Evaluation for Retrieval Augmented Generation [74.3022365715597]
UAEval4RAG is a framework designed to evaluate whether RAG systems can handle unanswerable queries effectively.<n>We define a taxonomy with six unanswerable categories, and UAEval4RAG automatically synthesizes diverse and challenging queries.
arXiv Detail & Related papers (2024-12-16T19:11:55Z)
Multi-Source Knowledge Pruning for Retrieval-Augmented Generation: A Benchmark and Empirical Study [45.69867169347836]
Retrieval-augmented generation (RAG) is an effective approach to mitigating the hallucination of large language models (LLMs) through the integration of external knowledge.<n>In this paper, we standardize a benchmark dataset that combines structured and unstructured knowledge across diverse domains.<n>We also develop a plug-and-play RAG framework, textbfPruningRAG, whose main characteristic is the use of multi-granularity pruning strategies.
arXiv Detail & Related papers (2024-09-03T03:31:37Z)
RAGChecker: A Fine-grained Framework for Diagnosing Retrieval-Augmented Generation [61.14660526363607]
We propose a fine-grained evaluation framework, RAGChecker, that incorporates a suite of diagnostic metrics for both the retrieval and generation modules. RAGChecker has significantly better correlations with human judgments than other evaluation metrics. The metrics of RAGChecker can guide researchers and practitioners in developing more effective RAG systems.
arXiv Detail & Related papers (2024-08-15T10:20:54Z)
FlashRAG: A Modular Toolkit for Efficient Retrieval-Augmented Generation Research [70.6584488911715]
retrieval-augmented generation (RAG) has attracted considerable research attention.<n>Existing RAG toolkits are often heavy and inflexibly, failing to meet the customization needs of researchers.<n>Our toolkit has implemented 16 advanced RAG methods and gathered and organized 38 benchmark datasets.
arXiv Detail & Related papers (2024-05-22T12:12:40Z)
RAGGED: Towards Informed Design of Scalable and Stable RAG Systems [51.171355532527365]
Retrieval-augmented generation (RAG) enhances language models by integrating external knowledge.<n>RAGGED is a framework for systematically evaluating RAG systems.
arXiv Detail & Related papers (2024-03-14T02:26:31Z)
CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models [49.16989035566899]
Retrieval-Augmented Generation (RAG) is a technique that enhances the capabilities of large language models (LLMs) by incorporating external knowledge sources. This paper constructs a large-scale and more comprehensive benchmark, and evaluates all the components of RAG systems in various RAG application scenarios.
arXiv Detail & Related papers (2024-01-30T14:25:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.