Wikontic: Constructing Wikidata-Aligned, Ontology-Aware Knowledge Graphs with Large Language Models
- URL: http://arxiv.org/abs/2512.00590v1
- Date: Sat, 29 Nov 2025 18:44:25 GMT
- Title: Wikontic: Constructing Wikidata-Aligned, Ontology-Aware Knowledge Graphs with Large Language Models
- Authors: Alla Chepurova, Aydar Bulatov, Yuri Kuratov, Mikhail Burtsev
- Abstract summary: Knowledge graphs (KGs) provide structured, verifiable grounding for large language models (LLMs). Current LLM-based systems commonly use KGs as auxiliary structures for text retrieval, leaving their intrinsic quality underexplored. We propose Wikontic, a multi-stage pipeline that constructs KGs from open-domain text by extracting candidate triplets with qualifiers.
- Score: 10.130178524819536
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge graphs (KGs) provide structured, verifiable grounding for large language models (LLMs), but current LLM-based systems commonly use KGs as auxiliary structures for text retrieval, leaving their intrinsic quality underexplored. In this work, we propose Wikontic, a multi-stage pipeline that constructs KGs from open-domain text by extracting candidate triplets with qualifiers, enforcing Wikidata-based type and relation constraints, and normalizing entities to reduce duplication. The resulting KGs are compact, ontology-consistent, and well-connected; on MuSiQue, the correct answer entity appears in 96% of generated triplets. On HotpotQA, our triplets-only setup achieves 76.0 F1, and on MuSiQue 59.8 F1, matching or surpassing several retrieval-augmented generation baselines that still require textual context. In addition, Wikontic attains state-of-the-art information-retention performance on the MINE-1 benchmark (86%), outperforming prior KG construction methods. Wikontic is also efficient at build time: KG construction uses less than 1,000 output tokens, about 3$\times$ fewer than AriGraph and $<$1/20 of GraphRAG. The proposed pipeline enhances the quality of the generated KG and offers a scalable solution for leveraging structured knowledge in LLMs.
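To make the three stages named in the abstract concrete (candidate triplet extraction with qualifiers, Wikidata-based type and relation constraints, and entity normalization), the following is a minimal sketch of such a construction loop. This is not the authors' implementation: the helper names, the toy ontology/type/alias tables, and the hard-coded "LLM output" are illustrative assumptions; the Wikidata property IDs (P50, P577, P585) are real, but the actual pipeline prompts an LLM and validates against the full Wikidata schema.

```python
# Minimal sketch of a Wikontic-style KG construction loop (illustrative assumptions only).
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Triplet:
    subject: str
    relation: str                         # ideally a Wikidata property ID, e.g. "P50" (author)
    obj: str
    qualifiers: tuple = field(default_factory=tuple)  # e.g. (("P585", "1965"),) for "point in time"

# Toy stand-ins for Wikidata-derived constraints and an entity-normalization table.
ONTOLOGY = {                              # relation -> (required subject type, required object type)
    "P50":  ("work", "human"),            # author
    "P577": ("work", "time"),             # publication date
}
ENTITY_TYPES = {"Dune": "work", "Frank Herbert": "human", "1965": "time"}
ALIASES = {"F. Herbert": "Frank Herbert"}

def extract_candidate_triplets(text: str) -> list[Triplet]:
    """Stand-in for the LLM extraction stage: returns candidate triplets with qualifiers."""
    return [
        Triplet("Dune", "P50", "F. Herbert"),
        Triplet("Dune", "P577", "1965"),
        Triplet("Dune", "P50", "1965"),   # type-inconsistent candidate that should be filtered out
    ]

def normalize(entity: str) -> str:
    """Map surface forms to a canonical entity to reduce duplication."""
    return ALIASES.get(entity, entity)

def satisfies_ontology(t: Triplet) -> bool:
    """Keep a triplet only if its subject/object types match the relation's constraints."""
    constraint = ONTOLOGY.get(t.relation)
    if constraint is None:
        return False
    subj_type, obj_type = constraint
    return ENTITY_TYPES.get(t.subject) == subj_type and ENTITY_TYPES.get(t.obj) == obj_type

def build_kg(text: str) -> list[Triplet]:
    """Extract -> normalize -> validate, yielding a compact, ontology-consistent triplet set."""
    normalized = [
        Triplet(normalize(t.subject), t.relation, normalize(t.obj), t.qualifiers)
        for t in extract_candidate_triplets(text)
    ]
    return [t for t in normalized if satisfies_ontology(t)]

if __name__ == "__main__":
    for triplet in build_kg("Dune is a 1965 novel by Frank Herbert."):
        print(triplet)
```

In this sketch, normalization runs before the ontology check; otherwise an alias such as "F. Herbert" would miss the type lookup and a valid fact would be dropped along with the genuinely inconsistent candidate.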
Related papers
- Can Knowledge-Graph-based Retrieval Augmented Generation Really Retrieve What You Need? [57.28763506780752]
GraphFlow is a framework that efficiently retrieves accurate and diverse knowledge required for real-world queries from text-rich KGs. It outperforms strong KG-RAG baselines, including GPT-4o, by 10% on average in hit rate and recall. It also shows strong generalization to unseen KGs, demonstrating its effectiveness and robustness.
arXiv Detail & Related papers (2025-10-18T17:06:49Z)
- Enrich-on-Graph: Query-Graph Alignment for Complex Reasoning with LLM Enriching [61.824094419641575]
Large Language Models (LLMs) struggle with hallucinations and factual errors in knowledge-intensive scenarios like knowledge graph question answering (KGQA). We attribute this to the semantic gap between structured knowledge graphs (KGs) and unstructured queries, caused by inherent differences in their focuses and structures. Existing methods usually employ resource-intensive, non-scalable reasoning on vanilla KGs, but overlook this gap. We propose a flexible framework, Enrich-on-Graph (EoG), which leverages LLMs' prior knowledge to enrich KGs and bridge the semantic gap between graphs and queries.
arXiv Detail & Related papers (2025-09-25T06:48:52Z)
- KG2QA: Knowledge Graph-enhanced Retrieval-augmented Generation for Communication Standards Question Answering [7.079181644378029]
KG2QA is a question answering framework that integrates fine-tuned large language models (LLMs) with a domain-specific knowledge graph (KG). We construct a high-quality dataset of 6,587 QA pairs from ITU-T recommendations and fine-tune Qwen2.5-7B-Instruct. In our KG-RAG pipeline, the fine-tuned LLM first retrieves relevant knowledge from the KG, enabling more accurate and factually grounded responses.
arXiv Detail & Related papers (2025-06-08T08:07:22Z)
- Explore-Construct-Filter: An Automated Framework for Rich and Reliable API Knowledge Graph Construction [16.67905950921807]
An API Knowledge Graph (API KG) is a structured network that models API entities and their relations. We propose an automated approach for API KG construction based on large language models (LLMs). Our method surpasses the state-of-the-art method, achieving a 25.2% improvement in F1 score.
arXiv Detail & Related papers (2025-02-19T03:51:31Z)
- KG-CF: Knowledge Graph Completion with Context Filtering under the Guidance of Large Language Models [55.39134076436266]
KG-CF is a framework tailored for ranking-based knowledge graph completion tasks. KG-CF leverages LLMs' reasoning abilities to filter out irrelevant contexts, achieving superior results on real-world datasets.
arXiv Detail & Related papers (2025-01-06T01:52:15Z)
- Ontology-grounded Automatic Knowledge Graph Construction by LLM under Wikidata schema [60.42231674887294]
We propose an ontology-grounded approach to Knowledge Graph (KG) construction using Large Language Models (LLMs) on a knowledge base. We ground the generation of the KG in the authored ontology based on extracted relations to ensure consistency and interpretability. Our work presents a promising direction for a scalable KG construction pipeline with minimal human intervention that yields high-quality, human-interpretable KGs.
arXiv Detail & Related papers (2024-12-30T13:36:05Z)
- Can LLMs be Good Graph Judge for Knowledge Graph Construction? [18.05473053134776]
We propose GraphJudge, a KG construction framework to address the aforementioned challenges. In this framework, we designed an entity-centric strategy to eliminate noisy information in the documents. We also fine-tuned an LLM as a graph judge to further enhance the quality of the generated KGs.
arXiv Detail & Related papers (2024-11-26T12:46:57Z)
- Graphusion: A RAG Framework for Knowledge Graph Construction with a Global Perspective [13.905336639352404]
This work introduces Graphusion, a zero-shot Knowledge Graph construction framework from free text. It contains three steps: in Step 1, we extract a list of seed entities using topic modeling so that the final KG includes the most relevant entities; in Step 2, we conduct candidate triplet extraction using LLMs; in Step 3, we design a novel fusion module that provides a global view of the extracted knowledge.
arXiv Detail & Related papers (2024-10-23T06:54:03Z)
- Generate-on-Graph: Treat LLM as both Agent and KG in Incomplete Knowledge Graph Question Answering [87.67177556994525]
We propose a training-free method called Generate-on-Graph (GoG) to generate new factual triples while exploring Knowledge Graphs (KGs).
GoG performs reasoning through a Thinking-Searching-Generating framework, which treats the LLM as both agent and KG in incomplete knowledge graph question answering (IKGQA).
arXiv Detail & Related papers (2024-04-23T04:47:22Z)
- KG-GPT: A General Framework for Reasoning on Knowledge Graphs Using Large Language Models [18.20425100517317]
We propose KG-GPT, a framework leveraging large language models for tasks employing knowledge graphs.
KG-GPT comprises three steps: Sentence Segmentation, Graph Retrieval, and Inference, aimed respectively at partitioning sentences, retrieving relevant graph components, and deriving logical conclusions.
We evaluate KG-GPT using KG-based fact verification and KGQA benchmarks, with the model showing competitive and robust performance, even outperforming several fully-supervised models.
arXiv Detail & Related papers (2023-10-17T12:51:35Z)
- BertNet: Harvesting Knowledge Graphs with Arbitrary Relations from Pretrained Language Models [65.51390418485207]
We propose a new approach for harvesting massive KGs of arbitrary relations from pretrained LMs.
With minimal input of a relation definition, the approach efficiently searches the vast entity-pair space to extract diverse and accurate knowledge.
We deploy the approach to harvest KGs of over 400 new relations from different LMs.
arXiv Detail & Related papers (2022-06-28T19:46:29Z)