DSRAG: A Domain-Specific Retrieval Framework Based on Document-derived Multimodal Knowledge Graph
- URL: http://arxiv.org/abs/2509.10467v1
- Date: Fri, 22 Aug 2025 14:24:48 GMT
- Title: DSRAG: A Domain-Specific Retrieval Framework Based on Document-derived Multimodal Knowledge Graph
- Authors: Mengzheng Yang, Yanfei Ren, David Osei Opoku, Ruochang Li, Peng Ren, Chunxiao Xing,
- Abstract summary: This work focuses on a graph-based RAG framework, emphasizing the critical role of knowledge graph quality during the generation process.<n>We propose DSRAG, a multimodal knowledge graph-driven retrieval-augmented generation framework designed for domain-specific applications.
- Score: 4.951890767337337
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Current general-purpose large language models (LLMs) commonly exhibit knowledge hallucination and insufficient domain-specific adaptability in domain-specific tasks, limiting their effectiveness in specialized question answering scenarios. Retrieval-augmented generation (RAG) effectively tackles these challenges by integrating external knowledge to enhance accuracy and relevance. However, traditional RAG still faces limitations in domain knowledge accuracy and context modeling.To enhance domain-specific question answering performance, this work focuses on a graph-based RAG framework, emphasizing the critical role of knowledge graph quality during the generation process. We propose DSRAG (Domain-Specific RAG), a multimodal knowledge graph-driven retrieval-augmented generation framework designed for domain-specific applications. Our approach leverages domain-specific documents as the primary knowledge source, integrating heterogeneous information such as text, images, and tables to construct a multimodal knowledge graph covering both conceptual and instance layers. Building on this foundation, we introduce semantic pruning and structured subgraph retrieval mechanisms, combining knowledge graph context and vector retrieval results to guide the language model towards producing more reliable responses. Evaluations using the Langfuse multidimensional scoring mechanism show that our method excels in domain-specific question answering, validating the efficacy of integrating multimodal knowledge graphs with retrieval-augmented generation.
Related papers
- MegaRAG: Multimodal Knowledge Graph-Based Retrieval Augmented Generation [17.382062394739588]
Large language models (LLMs) struggle with high-level conceptual understanding and holistic comprehension due to limited context windows.<n>We introduce a multimodal knowledge graph-based RAG that enables cross-modal reasoning for better content understanding.<n>Our method incorporates visual cues into the construction of knowledge graphs, the retrieval phase, and the answer generation process.
arXiv Detail & Related papers (2025-11-26T05:00:03Z) - Scaling Beyond Context: A Survey of Multimodal Retrieval-Augmented Generation for Document Understanding [61.36285696607487]
Document understanding is critical for applications from financial analysis to scientific discovery.<n>Current approaches, whether OCR-based pipelines feeding Large Language Models (LLMs) or native Multimodal LLMs (MLLMs) face key limitations.<n>Retrieval-Augmented Generation (RAG) helps ground models in external data, but documents' multimodal nature, combining text, tables, charts, and layout, demands a more advanced paradigm: Multimodal RAG.
arXiv Detail & Related papers (2025-10-17T02:33:16Z) - Domain-Specific Data Generation Framework for RAG Adaptation [58.20906914537952]
Retrieval-Augmented Generation (RAG) combines the language understanding and reasoning power of large language models with external retrieval to enable domain-grounded responses.<n>We propose RAGen, a framework for generating domain-grounded question-answer-context (QAC) triples tailored to diverse RAG adaptation approaches.
arXiv Detail & Related papers (2025-10-13T09:59:49Z) - GRIL: Knowledge Graph Retrieval-Integrated Learning with Large Language Models [59.72897499248909]
We propose a novel graph retriever trained end-to-end with Large Language Models (LLMs)<n>Within the extracted subgraph, structural knowledge and semantic features are encoded via soft tokens and the verbalized graph, respectively, which are infused into the LLM together.<n>Our approach consistently achieves state-of-the-art performance, validating the strength of joint graph-LLM optimization for complex reasoning tasks.
arXiv Detail & Related papers (2025-09-20T02:38:00Z) - Cross-Granularity Hypergraph Retrieval-Augmented Generation for Multi-hop Question Answering [49.43814054718318]
Multi-hop question answering (MHQA) requires integrating knowledge scattered across multiple passages to derive the correct answer.<n>Traditional retrieval-augmented generation (RAG) methods primarily focus on coarse-grained textual semantic similarity.<n>We propose a novel RAG approach called HGRAG for MHQA that achieves cross-granularity integration of structural and semantic information via hypergraphs.
arXiv Detail & Related papers (2025-08-15T06:36:13Z) - MMGraphRAG: Bridging Vision and Language with Interpretable Multimodal Knowledge Graphs [6.165053219836395]
We propose MMGraphRAG, which refines visual content through scene graphs and constructs a multimodal knowledge graph.<n>It employs spectral clustering to achieve cross-modal entity linking and retrieves context along reasoning paths to guide the generative process.<n> Experimental results show that MMGraphRAG achieves state-of-the-art performance on the DocBench and MMLongBench datasets.
arXiv Detail & Related papers (2025-07-28T13:16:23Z) - Divide by Question, Conquer by Agent: SPLIT-RAG with Question-Driven Graph Partitioning [62.640169289390535]
SPLIT-RAG is a multi-agent RAG framework that addresses the limitations with question-driven semantic graph partitioning and collaborative subgraph retrieval.<n>The innovative framework first create Semantic Partitioning of Linked Information, then use the Type-Specialized knowledge base to achieve Multi-Agent RAG.<n>The attribute-aware graph segmentation manages to divide knowledge graphs into semantically coherent subgraphs, ensuring subgraphs align with different query types.<n>A hierarchical merging module resolves inconsistencies across subgraph-derived answers through logical verifications.
arXiv Detail & Related papers (2025-05-20T06:44:34Z) - UniversalRAG: Retrieval-Augmented Generation over Corpora of Diverse Modalities and Granularities [53.76854299076118]
UniversalRAG is a novel RAG framework designed to retrieve and integrate knowledge from heterogeneous sources with diverse modalities and granularities.<n>We propose a modality-aware routing mechanism that dynamically identifies the most appropriate modality-specific corpus and performs targeted retrieval within it.<n>We validate UniversalRAG on 8 benchmarks spanning multiple modalities, showing its superiority over various modality-specific and unified baselines.
arXiv Detail & Related papers (2025-04-29T13:18:58Z) - Retrieval Augmented Generation and Understanding in Vision: A Survey and New Outlook [85.43403500874889]
Retrieval-augmented generation (RAG) has emerged as a pivotal technique in artificial intelligence (AI)<n>Recent advancements in RAG for embodied AI, with a particular focus on applications in planning, task execution, multimodal perception, interaction, and specialized domains.
arXiv Detail & Related papers (2025-03-23T10:33:28Z) - A Survey of Graph Retrieval-Augmented Generation for Customized Large Language Models [29.45576836135952]
Retrieval-Augmented generation (RAG) has emerged as a promising solution to customize large language models (LLMs)<n>This survey presents a systematic analysis of Graph-based Retrieval-Augmented Generation (GraphRAG)
arXiv Detail & Related papers (2025-01-21T06:25:21Z) - MoColl: Agent-Based Specific and General Model Collaboration for Image Captioning [4.955697042432618]
This paper proposes a novel agent-enhanced model collaboration framework called MoColl.<n>MoColl decomposes complex image captioning tasks into a series of interconnected question-answer subtasks.<n> Experimental results on radiology report generation validate the effectiveness of the proposed framework.
arXiv Detail & Related papers (2025-01-03T14:38:01Z) - Retrieval-Augmented Generation with Graphs (GraphRAG) [84.29507404866257]
Retrieval-augmented generation (RAG) is a powerful technique that enhances downstream task execution by retrieving additional information.<n>Graph, by its intrinsic "nodes connected by edges" nature, encodes massive heterogeneous and relational information.<n>Unlike conventional RAG, the uniqueness of graph-structured data, such as diverse-formatted and domain-specific relational knowledge, poses unique and significant challenges when designing GraphRAG for different domains.
arXiv Detail & Related papers (2024-12-31T06:59:35Z) - Domain-Specific Retrieval-Augmented Generation Using Vector Stores, Knowledge Graphs, and Tensor Factorization [7.522493227357079]
Large Language Models (LLMs) are pre-trained on large-scale corpora.
LLMs suffer from hallucinations, knowledge cut-offs, and lack of knowledge attributions.
We introduce SMART-SLIC, a highly domain-specific LLM framework.
arXiv Detail & Related papers (2024-10-03T17:40:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.