Scalable and Explainable Enterprise Knowledge Discovery Using Graph-Centric Hybrid Retrieval
- URL: http://arxiv.org/abs/2510.10942v1
- Date: Mon, 13 Oct 2025 02:56:36 GMT
- Title: Scalable and Explainable Enterprise Knowledge Discovery Using Graph-Centric Hybrid Retrieval
- Authors: Nilima Rao, Jagriti Srivastava, Pradeep Kumar Sharma, Hritvik Shrivastava,
- Abstract summary: Modern enterprises manage vast knowledge distributed across heterogeneous systems such as Jira, Git repositories, Confluence, and wikis. We present a modular hybrid retrieval framework that integrates Knowledge Base Language-Augmented Models (KBLam), DeepGraph representations, and embedding-driven semantic search. The framework builds a unified knowledge graph from parsed repositories including code, pull requests, and commit histories. Experiments on large-scale Git repositories show that the unified reasoning layer improves answer relevance by up to 80 percent compared with standalone GPT-based retrieval pipelines.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Modern enterprises manage vast knowledge distributed across heterogeneous systems such as Jira, Git repositories, Confluence, and wikis. Conventional retrieval methods based on keyword search or static embeddings often fail to answer complex queries that require contextual reasoning and multi-hop inference across artifacts. We present a modular hybrid retrieval framework for adaptive enterprise information access that integrates Knowledge Base Language-Augmented Models (KBLam), DeepGraph representations, and embedding-driven semantic search. The framework builds a unified knowledge graph from parsed repositories including code, pull requests, and commit histories, enabling semantic similarity search, structural inference, and multi-hop reasoning. Query analysis dynamically determines the optimal retrieval strategy, supporting both structured and unstructured data sources through independent or fused processing. An interactive interface provides graph visualizations, subgraph exploration, and context-aware query routing to generate concise and explainable answers. Experiments on large-scale Git repositories show that the unified reasoning layer improves answer relevance by up to 80 percent compared with standalone GPT-based retrieval pipelines. By combining graph construction, hybrid reasoning, and interactive visualization, the proposed framework offers a scalable, explainable, and user-centric foundation for intelligent knowledge assistants in enterprise environments.
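The abstract describes two mechanisms that a small example can make concrete: multi-hop reasoning over a unified knowledge graph, and query analysis that picks a retrieval strategy. The following is a minimal sketch under stated assumptions, not the authors' implementation. The node identifiers, edge labels, cue words, and the function names `build_graph`, `multi_hop`, and `route_query` are all hypothetical, and KBLam, DeepGraph, and fused processing are not modeled.

```python
import re
from collections import defaultdict, deque

# Hypothetical toy artifacts; every identifier and relation name here is invented.
EDGES = [
    ("commit:a1", "modifies", "file:auth.py"),
    ("pr:42", "contains", "commit:a1"),
    ("issue:JIRA-7", "resolved_by", "pr:42"),
]


def build_graph(edges):
    """Build an adjacency list; inverse edges let traversal run in both directions."""
    graph = defaultdict(list)
    for src, rel, dst in edges:
        graph[src].append((rel, dst))
        graph[dst].append((f"inverse:{rel}", src))
    return graph


def multi_hop(graph, start, goal, max_hops=3):
    """Breadth-first search for a shortest path of at most max_hops edges.

    Returns a list of (source, relation, target) triples, or None if no
    such path exists. The returned triples double as an explanation trace.
    """
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        if len(path) >= max_hops:
            continue  # do not expand beyond the hop budget
        for rel, nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt)]))
    return None


# Assumed cue words that suggest a relational (graph) question.
STRUCTURAL_CUES = {"which", "who", "linked", "resolved", "modified", "depends"}


def route_query(query):
    """Keyword heuristic standing in for query analysis: 'graph' or 'embedding'."""
    tokens = set(re.findall(r"[a-z]+", query.lower()))
    return "graph" if tokens & STRUCTURAL_CUES else "embedding"


if __name__ == "__main__":
    g = build_graph(EDGES)
    for hop in multi_hop(g, "issue:JIRA-7", "file:auth.py") or []:
        print(hop)
    print(route_query("Which PR resolved JIRA-7?"))
    print(route_query("Explain how token refresh works"))
```

Returning the traversed triples rather than only the final node is one simple way to keep an answer explainable, since the path from issue to pull request to commit to file can be shown to the user as the supporting evidence.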
Related papers
- Powering Job Search at Scale: LLM-Enhanced Query Understanding in Job Matching Systems [10.9341814749217]
We introduce a unified query understanding framework powered by a Large Language Model (LLM). Our approach jointly models the user query and contextual signals such as profile attributes to generate structured interpretations. The framework improves relevance quality in online A/B testing while significantly reducing system complexity.
arXiv Detail & Related papers (2025-08-19T21:35:43Z) - Cross-Granularity Hypergraph Retrieval-Augmented Generation for Multi-hop Question Answering [49.43814054718318]
Multi-hop question answering (MHQA) requires integrating knowledge scattered across multiple passages to derive the correct answer. Traditional retrieval-augmented generation (RAG) methods primarily focus on coarse-grained textual semantic similarity. We propose a novel RAG approach called HGRAG for MHQA that achieves cross-granularity integration of structural and semantic information via hypergraphs.
arXiv Detail & Related papers (2025-08-15T06:36:13Z) - LeanRAG: Knowledge-Graph-Based Generation with Semantic Aggregation and Hierarchical Retrieval [10.566901995776025]
LeanRAG is a framework that combines knowledge aggregation and retrieval strategies. It mitigates the substantial overhead associated with path retrieval on graphs and minimizes redundant information retrieval.
arXiv Detail & Related papers (2025-08-14T06:47:18Z) - Query-Aware Graph Neural Networks for Enhanced Retrieval-Augmented Generation [0.0]
We present a novel graph neural network architecture for retrieval-augmented generation (RAG). Our approach constructs per-episode knowledge graphs that capture both sequential and semantic relationships between text chunks. We introduce an Enhanced Graph Attention Network with query-guided pooling that dynamically focuses on relevant parts of the graph based on user queries.
arXiv Detail & Related papers (2025-07-25T19:42:27Z) - Mixture of Structural-and-Textual Retrieval over Text-rich Graph Knowledge Bases [78.62158923194153]
Text-rich Graph Knowledge Bases (TG-KBs) have become increasingly crucial for answering queries by providing textual and structural knowledge. We propose a Mixture of Structural-and-Textual Retrieval (MoR) to retrieve these two types of knowledge via a Planning-Reasoning-Organizing framework.
arXiv Detail & Related papers (2025-02-27T17:42:52Z) - CAISSON: Concept-Augmented Inference Suite of Self-Organizing Neural Networks [0.0]
We present CAISSON, a novel hierarchical approach to Retrieval-Augmented Generation (RAG). At its core, CAISSON leverages dual Self-Organizing Maps (SOMs) to create complementary organizational views of the document space. To evaluate CAISSON, we develop SynFAQA, a framework for generating synthetic financial analyst notes and question-answer pairs.
arXiv Detail & Related papers (2024-12-03T21:00:10Z) - Knowledge-Aware Query Expansion with Large Language Models for Textual and Relational Retrieval [49.42043077545341]
We propose a knowledge-aware query expansion framework, augmenting LLMs with structured document relations from a knowledge graph (KG). We leverage document texts as rich KG node representations and use document-based relation filtering for our Knowledge-Aware Retrieval (KAR).
arXiv Detail & Related papers (2024-10-17T17:03:23Z) - Improving Retrieval in Sponsored Search by Leveraging Query Context Signals [6.152499434499752]
We propose an approach to enhance query understanding by augmenting queries with rich contextual signals.
We use web search titles and snippets to ground queries in real-world information and utilize GPT-4 to generate query rewrites and explanations.
Our context-aware approach substantially outperforms context-free models.
arXiv Detail & Related papers (2024-07-19T14:28:53Z) - STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases [93.96463520716759]
We develop STaRK, a large-scale Semi-structure retrieval benchmark on Textual and Relational Knowledge Bases.
Our benchmark covers three domains: product search, academic paper search, and queries in precision medicine.
We design a novel pipeline to synthesize realistic user queries that integrate diverse relational information and complex textual properties.
arXiv Detail & Related papers (2024-04-19T22:54:54Z) - Learning Federated Neural Graph Databases for Answering Complex Queries from Distributed Knowledge Graphs [53.03085605769093]
We propose to learn Federated Neural Graph DataBase (FedNGDB), a pioneering systematic framework that empowers privacy-preserving reasoning over multi-source graph data. FedNGDB leverages federated learning to collaboratively learn graph representations across multiple sources, enriching relationships between entities, and improving the overall quality of graph data.
arXiv Detail & Related papers (2024-02-22T14:57:44Z) - UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph [89.98762327725112]
Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question.
We propose UniKGQA, a novel approach for the multi-hop KGQA task that unifies retrieval and reasoning in both model architecture and parameter learning.
arXiv Detail & Related papers (2022-12-02T04:08:09Z) - Tree-Augmented Cross-Modal Encoding for Complex-Query Video Retrieval [98.62404433761432]
The rapid growth of user-generated videos on the Internet has intensified the need for text-based video retrieval systems.
Traditional methods mainly favor the concept-based paradigm on retrieval with simple queries.
We propose a Tree-augmented Cross-modal Encoding method by jointly learning the linguistic structure of queries and the temporal representation of videos.
arXiv Detail & Related papers (2020-07-06T02:50:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.