SocraticKG: Knowledge Graph Construction via QA-Driven Fact Extraction
- URL: http://arxiv.org/abs/2601.10003v1
- Date: Thu, 15 Jan 2026 02:26:51 GMT
- Title: SocraticKG: Knowledge Graph Construction via QA-Driven Fact Extraction
- Authors: Sanghyeok Choi, Woosang Jeon, Kyuseok Yang, Taehyeong Kim
- Abstract summary: We propose an automated KG construction method that introduces question-answer pairs as a structured intermediate representation. SocraticKG captures contextual dependencies and implicit relational links typically lost in direct KG extraction pipelines.
- Score: 4.867319754310031
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Constructing Knowledge Graphs (KGs) from unstructured text provides a structured framework for knowledge representation and reasoning, yet current LLM-based approaches struggle with a fundamental trade-off: factual coverage often leads to relational fragmentation, while premature consolidation causes information loss. To address this, we propose SocraticKG, an automated KG construction method that introduces question-answer pairs as a structured intermediate representation to systematically unfold document-level semantics prior to triple extraction. By employing 5W1H-guided QA expansion, SocraticKG captures contextual dependencies and implicit relational links typically lost in direct KG extraction pipelines, providing explicit grounding in the source document that helps mitigate implicit reasoning errors. Evaluation on the MINE benchmark demonstrates that our approach effectively addresses the coverage-connectivity trade-off, achieving superior factual retention while maintaining high structural cohesion even as extracted knowledge volume substantially expands. These results highlight that QA-mediated semantic scaffolding plays a critical role in structuring semantics prior to KG extraction, enabling more coherent and reliable graph construction in subsequent stages.
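The abstract's idea of using question-answer pairs as an intermediate layer before triple extraction can be pictured with a small sketch. This is not the authors' implementation: the `QAPair` and `Triple` types, the `ASPECT_TO_RELATION` mapping, the toy sentence, and the component count used as a rough stand-in for structural cohesion are all illustrative assumptions, and the hand-written QA pairs stand in for what an LLM would generate.

```python
from collections import defaultdict
from dataclasses import dataclass

# The six 5W1H aspects named in the abstract.
FIVE_W_ONE_H = ("who", "what", "when", "where", "why", "how")


@dataclass(frozen=True)
class QAPair:
    """Hypothetical intermediate record: one question-answer pair."""
    aspect: str    # one of FIVE_W_ONE_H
    question: str
    answer: str
    evidence: str  # source sentence that grounds the answer


@dataclass(frozen=True)
class Triple:
    """Hypothetical KG triple that remembers the question it came from."""
    head: str
    relation: str
    tail: str
    source_question: str


# Assumed toy mapping from aspect to relation name (not from the paper).
ASPECT_TO_RELATION = {
    "who": "built_by",
    "when": "built_in_year",
    "where": "located_in",
}


def qa_to_triples(subject, qa_pairs):
    """Turn QA pairs about one subject into triples that share that subject."""
    triples = []
    for qa in qa_pairs:
        relation = ASPECT_TO_RELATION.get(qa.aspect)
        if relation is None:
            continue  # aspect not covered by the toy mapping
        triples.append(Triple(subject, relation, qa.answer, qa.question))
    return triples


def count_components(triples):
    """Count connected components of the undirected entity graph."""
    adjacency = defaultdict(set)
    for t in triples:
        adjacency[t.head].add(t.tail)
        adjacency[t.tail].add(t.head)
    seen = set()
    components = 0
    for node in list(adjacency):
        if node in seen:
            continue
        components += 1
        stack = [node]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            stack.extend(adjacency[current] - seen)
    return components


# Made-up sentence and hand-written QA pairs, for illustration only.
SENTENCE = "Ada built the bridge in 1901 in Riverton."
toy_qa = [
    QAPair("who", "Who built the bridge?", "Ada", SENTENCE),
    QAPair("when", "When was the bridge built?", "1901", SENTENCE),
    QAPair("where", "Where is the bridge?", "Riverton", SENTENCE),
]
toy_triples = qa_to_triples("the bridge", toy_qa)
```

Because every triple in this toy case is anchored on the same subject, the resulting graph stays in a single component as more questions are answered, which is one simple way to read the coverage-connectivity trade-off the abstract describes. Each triple also carries its source question, a minimal form of grounding in the source text.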
Related papers
- Cumulative Path-Level Semantic Reasoning for Inductive Knowledge Graph Completion [9.623163073915741]
This paper proposes the Cumulative Path-Level Semantic Reasoning for inductive knowledge graph completion (CPSR) framework. CPSR simultaneously captures both the structural and semantic information of KGs to enhance the inductive KGC task.
arXiv Detail & Related papers (2026-01-09T08:34:05Z)
- StruProKGR: A Structural and Probabilistic Framework for Sparse Knowledge Graph Reasoning [68.58655814341996]
Sparse Knowledge Graphs (KGs) are commonly encountered in real-world applications, where knowledge is often incomplete or limited. We propose a Structural and Probabilistic framework named StruProKGR, tailored for efficient and interpretable reasoning on sparse KGs.
arXiv Detail & Related papers (2025-12-14T09:36:58Z)
- LINK-KG: LLM-Driven Coreference-Resolved Knowledge Graphs for Human Smuggling Networks [8.222584338135986]
Link-KG is a framework that integrates a three-stage, LLM-guided coreference resolution pipeline with downstream KG extraction. At the core of our approach is a type-specific Prompt Cache, which consistently tracks and resolves references across document chunks. Link-KG reduces average node duplication by 45.21% and noisy nodes by 32.22% compared to baseline methods.
arXiv Detail & Related papers (2025-10-30T13:39:08Z)
- Enrich-on-Graph: Query-Graph Alignment for Complex Reasoning with LLM Enriching [61.824094419641575]
Large Language Models (LLMs) struggle with hallucinations and factual errors in knowledge-intensive scenarios like knowledge graph question answering (KGQA). We attribute this to the semantic gap between structured knowledge graphs (KGs) and unstructured queries, caused by inherent differences in their focuses and structures. Existing methods usually employ resource-intensive, non-scalable reasoning on vanilla KGs, but overlook this gap. We propose a flexible framework, Enrich-on-Graph (EoG), which leverages LLMs' prior knowledge to enrich KGs and bridge the semantic gap between graphs and queries.
arXiv Detail & Related papers (2025-09-25T06:48:52Z)
- Cross-Granularity Hypergraph Retrieval-Augmented Generation for Multi-hop Question Answering [49.43814054718318]
Multi-hop question answering (MHQA) requires integrating knowledge scattered across multiple passages to derive the correct answer. Traditional retrieval-augmented generation (RAG) methods primarily focus on coarse-grained textual semantic similarity. We propose a novel RAG approach called HGRAG for MHQA that achieves cross-granularity integration of structural and semantic information via hypergraphs.
arXiv Detail & Related papers (2025-08-15T06:36:13Z)
- KG-Infused RAG: Augmenting Corpus-Based RAG with External Knowledge Graphs [58.12674907593879]
KG-Infused RAG is a framework that incorporates pre-existing large-scale knowledge graphs into RAG. KG-Infused RAG directly performs spreading activation over external KGs to retrieve relevant structured knowledge. Experiments show that KG-Infused RAG consistently outperforms vanilla RAG.
arXiv Detail & Related papers (2025-06-11T09:20:02Z)
- LKD-KGC: Domain-Specific KG Construction via LLM-driven Knowledge Dependency Parsing [9.502380540548497]
Knowledge Graphs (KGs) structure real-world entities and their relationships into triples, enhancing machine reasoning for various tasks. Recent approaches for knowledge graph construction based on large language models (LLMs) have proven efficient. We propose LKD-KGC, a novel framework for unsupervised domain-specific KG construction.
arXiv Detail & Related papers (2025-05-30T03:10:23Z)
- Knowledge Graph Completion with Relation-Aware Anchor Enhancement [50.50944396454757]
We propose a relation-aware anchor enhanced knowledge graph completion method (RAA-KGC). We first generate anchor entities within the relation-aware neighborhood of the head entity. Then, by pulling the query embedding towards the neighborhoods of the anchors, it is tuned to be more discriminative for target entity matching.
arXiv Detail & Related papers (2025-04-08T15:22:08Z)
- How to Mitigate Information Loss in Knowledge Graphs for GraphRAG: Leveraging Triple Context Restoration and Query-Driven Feedback [12.250007669492753]
This paper proposes the Triple Context Restoration and Query-driven Feedback framework. It reconstructs the textual context underlying each triple to mitigate information loss. It achieves a 29.1% improvement in Exact Match and a 15.5% improvement in F1 over its state-of-the-art GraphRAG competitors.
arXiv Detail & Related papers (2025-01-26T03:27:11Z)
- Ontology-grounded Automatic Knowledge Graph Construction by LLM under Wikidata schema [60.42231674887294]
We propose an ontology-grounded approach to Knowledge Graph (KG) construction using Large Language Models (LLMs) on a knowledge base. We ground generation of the KG with the authored ontology based on extracted relations to ensure consistency and interpretability. Our work presents a promising direction for a scalable KG construction pipeline with minimal human intervention that yields high-quality, human-interpretable KGs.
arXiv Detail & Related papers (2024-12-30T13:36:05Z)
- Context Graph [8.02985792541121]
We present a context graph reasoning (CGR$^3$) paradigm that leverages large language models (LLMs) to retrieve candidate entities and related contexts. Our experimental results demonstrate that CGR$^3$ significantly improves performance on KG completion (KGC) and KG question answering (KGQA) tasks.
arXiv Detail & Related papers (2024-06-17T02:59:19Z)
- SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction.
Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.