Related papers: TrustRAG: An Information Assistant with Retrieval Augmented Generation

TrustRAG: An Information Assistant with Retrieval Augmented Generation

URL: http://arxiv.org/abs/2502.13719v1
Date: Wed, 19 Feb 2025 13:45:27 GMT
Title: TrustRAG: An Information Assistant with Retrieval Augmented Generation
Authors: Yixing Fan, Qiang Yan, Wenshan Wang, Jiafeng Guo, Ruqing Zhang, Xueqi Cheng,
Abstract summary: TrustRAG is a novel framework that enhances acRAG from three perspectives: indexing, retrieval, and generation.<n>We open-source the TrustRAG framework and provide a demonstration studio designed for excerpt-based question answering tasks.
Score: 73.84864898280719
License: http://creativecommons.org/licenses/by/4.0/
Abstract: \Ac{RAG} has emerged as a crucial technique for enhancing large models with real-time and domain-specific knowledge. While numerous improvements and open-source tools have been proposed to refine the \ac{RAG} framework for accuracy, relatively little attention has been given to improving the trustworthiness of generated results. To address this gap, we introduce TrustRAG, a novel framework that enhances \ac{RAG} from three perspectives: indexing, retrieval, and generation. Specifically, in the indexing stage, we propose a semantic-enhanced chunking strategy that incorporates hierarchical indexing to supplement each chunk with contextual information, ensuring semantic completeness. In the retrieval stage, we introduce a utility-based filtering mechanism to identify high-quality information, supporting answer generation while reducing input length. In the generation stage, we propose fine-grained citation enhancement, which detects opinion-bearing sentences in responses and infers citation relationships at the sentence-level, thereby improving citation accuracy. We open-source the TrustRAG framework and provide a demonstration studio designed for excerpt-based question answering tasks \footnote{https://huggingface.co/spaces/golaxy/TrustRAG}. Based on these, we aim to help researchers: 1) systematically enhancing the trustworthiness of \ac{RAG} systems and (2) developing their own \ac{RAG} systems with more reliable outputs.

Related papers

Disco-RAG: Discourse-Aware Retrieval-Augmented Generation [81.53888908988756]
We propose Disco-RAG, a discourse-aware framework that injects discourse signals into the generation process.<n>Our method constructs intra-chunk discourse trees to capture local hierarchies and builds inter-chunk rhetorical graphs to model cross-passage coherence.<n>Experiments on question answering and long-document summarization benchmarks show the efficacy of our approach.
arXiv Detail & Related papers (2026-01-07T20:32:50Z)
HKRAG: Holistic Knowledge Retrieval-Augmented Generation Over Visually-Rich Documents [18.42875699937102]
We propose HKRAG, a new holistic RAG framework designed to explicitly capture and integrate both knowledge types.<n>Our framework features two key components: (1) a Hybrid Masking-based Holistic Retriever that employs explicit masking strategies to separately model salient and fine-print knowledge, ensuring a query-relevant holistic information retrieval; and (2) an Uncertainty-guided Agentic Generator that dynamically assesses the uncertainty of initial answers and actively decides how to integrate the two distinct knowledge streams for optimal response generation.
arXiv Detail & Related papers (2025-11-25T11:59:52Z)
TeaRAG: A Token-Efficient Agentic Retrieval-Augmented Generation Framework [62.66056331998838]
TeaRAG is a token-efficient agentic RAG framework capable of compressing both retrieval content and reasoning steps.<n>Our reward function evaluates the knowledge sufficiency by a knowledge matching mechanism, while penalizing excessive reasoning steps.
arXiv Detail & Related papers (2025-11-07T16:08:34Z)
Cross-Granularity Hypergraph Retrieval-Augmented Generation for Multi-hop Question Answering [49.43814054718318]
Multi-hop question answering (MHQA) requires integrating knowledge scattered across multiple passages to derive the correct answer.<n>Traditional retrieval-augmented generation (RAG) methods primarily focus on coarse-grained textual semantic similarity.<n>We propose a novel RAG approach called HGRAG for MHQA that achieves cross-granularity integration of structural and semantic information via hypergraphs.
arXiv Detail & Related papers (2025-08-15T06:36:13Z)
Enhancing Retrieval Augmented Generation with Hierarchical Text Segmentation Chunking [0.9968037829925942]
This paper proposes a novel framework that enhances RAG by integrating hierarchical text segmentation and clustering.<n>During inference, the framework retrieves information by leveraging both segment-level and cluster-level vector representations.<n> Evaluations on the NarrativeQA, QuALITY, and QASPER datasets indicate that the proposed method achieved improved results compared to traditional chunking techniques.
arXiv Detail & Related papers (2025-07-14T05:21:58Z)
Respecting Temporal-Causal Consistency: Entity-Event Knowledge Graphs for Retrieval-Augmented Generation [69.45495166424642]
We develop a robust and discriminative QA benchmark to measure temporal, causal, and character consistency understanding in narrative documents.<n>We then introduce Entity-Event RAG (E2RAG), a dual-graph framework that keeps separate entity and event subgraphs linked by a bipartite mapping.<n>Across ChronoQA, our approach outperforms state-of-the-art unstructured and KG-based RAG baselines, with notable gains on causal and character consistency queries.
arXiv Detail & Related papers (2025-06-06T10:07:21Z)
DO-RAG: A Domain-Specific QA Framework Using Knowledge Graph-Enhanced Retrieval-Augmented Generation [4.113142669523488]
Domain-specific QA systems require generative fluency but high factual accuracy grounded in structured expert knowledge.<n>We propose DO-RAG, a scalable and customizable hybrid QA framework that integrates multi-level knowledge graph construction with semantic vector retrieval.
arXiv Detail & Related papers (2025-05-17T06:40:17Z)
LATex: Leveraging Attribute-based Text Knowledge for Aerial-Ground Person Re-Identification [63.07563443280147]
We propose a novel framework named LATex for AG-ReID. It adopts prompt-tuning strategies to leverage attribute-based text knowledge. Our framework can fully leverage attribute-based text knowledge to improve the AG-ReID.
arXiv Detail & Related papers (2025-03-31T04:47:05Z)
MES-RAG: Bringing Multi-modal, Entity-Storage, and Secure Enhancements to RAG [65.0423152595537]
We propose MES-RAG, which enhances entity-specific query handling and provides accurate, secure, and consistent responses. MES-RAG introduces proactive security measures that ensure system integrity by applying protections prior to data access. Experimental results demonstrate that MES-RAG significantly improves both accuracy and recall, highlighting its effectiveness in advancing the security and utility of question-answering.
arXiv Detail & Related papers (2025-03-17T08:09:42Z)
CG-RAG: Research Question Answering by Citation Graph Retrieval-Augmented LLMs [9.718354494802002]
Contextualized Graph Retrieval-Augmented Generation (CG-RAG) is a novel framework that integrates sparse and dense retrieval signals within graph structures. First, we propose a contextual graph representation for citation graphs, effectively capturing both explicit and implicit connections within and across documents. Second, we introduce Lexical-Semantic Graph Retrieval (LeSeGR), which seamlessly integrates sparse and dense retrieval signals with graph encoding. Third, we present a context-aware generation strategy that utilizes the retrieved graph-structured information to generate precise and contextually enriched responses.
arXiv Detail & Related papers (2025-01-25T04:18:08Z)
TrustRAG: Enhancing Robustness and Trustworthiness in RAG [31.231916859341865]
TrustRAG is a framework that systematically filters compromised and irrelevant contents before they are retrieved for generation.<n>TrustRAG delivers substantial improvements in retrieval accuracy, efficiency, and attack resistance compared to existing approaches.
arXiv Detail & Related papers (2025-01-01T15:57:34Z)
Detecting Document-level Paraphrased Machine Generated Content: Mimicking Human Writing Style and Involving Discourse Features [57.34477506004105]
Machine-generated content poses challenges such as academic plagiarism and the spread of misinformation.<n>We introduce novel methodologies and datasets to overcome these challenges.<n>We propose MhBART, an encoder-decoder model designed to emulate human writing style.<n>We also propose DTransformer, a model that integrates discourse analysis through PDTB preprocessing to encode structural features.
arXiv Detail & Related papers (2024-12-17T08:47:41Z)
LLM-Ref: Enhancing Reference Handling in Technical Writing with Large Language Models [4.1180254968265055]
We present LLM-Ref, a writing assistant tool that aids researchers in writing articles from multiple source documents. Unlike traditional RAG systems that use chunking and indexing, our tool retrieves and generates content directly from text paragraphs. Our approach achieves a $3.25times$ to $6.26times$ increase in Ragas score, a comprehensive metric that provides a holistic view of a RAG system's ability to produce accurate, relevant, and contextually appropriate responses.
arXiv Detail & Related papers (2024-11-01T01:11:58Z)
GEM-RAG: Graphical Eigen Memories For Retrieval Augmented Generation [3.2027710059627545]
We introduce Graphical Eigen Memories For Retrieval Augmented Generation (GEM-RAG) GEM-RAG works by tagging each chunk of text in a given text corpus with LLM generated utility'' questions. We evaluate GEM-RAG, using both UnifiedQA and GPT-3.5 Turbo as the LLMs, with SBERT, and OpenAI's text encoders on two standard QA tasks.
arXiv Detail & Related papers (2024-09-23T21:42:47Z)
Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation [51.8188846284153]
RAG has been widely adopted to enhance Large Language Models (LLMs) Attributed Text Generation (ATG) has attracted growing attention, which provides citations to support the model's responses in RAG. This paper proposes a fine-grained ATG method called ReClaim(Refer & Claim), which alternates the generation of references and answers step by step.
arXiv Detail & Related papers (2024-07-01T20:47:47Z)
Pistis-RAG: Enhancing Retrieval-Augmented Generation with Human Feedback [41.88662700261036]
RAG systems face limitations when semantic relevance alone does not guarantee improved generation quality. We propose Pistis-RAG, a new RAG framework designed with a content-centric approach to better align LLMs with human preferences.
arXiv Detail & Related papers (2024-06-21T08:52:11Z)
Towards Verifiable Generation: A Benchmark for Knowledge-aware Language Model Attribution [48.86322922826514]
This paper defines a new task of Knowledge-aware Language Model Attribution (KaLMA) First, we extend attribution source from unstructured texts to Knowledge Graph (KG), whose rich structures benefit both the attribution performance and working scenarios. Second, we propose a new Conscious Incompetence" setting considering the incomplete knowledge repository. Third, we propose a comprehensive automatic evaluation metric encompassing text quality, citation quality, and text citation alignment.
arXiv Detail & Related papers (2023-10-09T11:45:59Z)
Robust Saliency-Aware Distillation for Few-shot Fine-grained Visual Recognition [57.08108545219043]
Recognizing novel sub-categories with scarce samples is an essential and challenging research topic in computer vision. Existing literature addresses this challenge by employing local-based representation approaches. This article proposes a novel model, Robust Saliency-aware Distillation (RSaD), for few-shot fine-grained visual recognition.
arXiv Detail & Related papers (2023-05-12T00:13:17Z)

This list is automatically generated from the titles and abstracts of the papers in this site.