Construct, Align, and Reason: Large Ontology Models for Enterprise Knowledge Management
- URL: http://arxiv.org/abs/2602.00029v1
- Date: Sun, 18 Jan 2026 05:30:03 GMT
- Title: Construct, Align, and Reason: Large Ontology Models for Enterprise Knowledge Management
- Authors: Yao Zhang, Hongyin Zhu,
- Abstract summary: Traditional knowledge graphs struggle with implicit relationship discovery and lack sufficient semantic understanding for complex question answering.<n>We introduce a unified construct--align--reason framework, the large ontology model (LOM)<n>We propose a unified three-stage training pipeline: ontology instruction fine-tuning to improve structural understanding; text-ontology grounding to strengthen node semantic encoding; and multi-task instruction tuning on ontology-language pairs with curriculum learning to enhance semantic reasoning.
- Score: 9.901440359781732
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Enterprise-scale knowledge management faces significant challenges in integrating multi-source heterogeneous data and enabling effective semantic reasoning. Traditional knowledge graphs often struggle with implicit relationship discovery and lack sufficient semantic understanding for complex question answering. To address these limitations, we introduce a unified construct--align--reason framework, the large ontology model (LOM). We first build a dual-layer enterprise ontology from structured databases and unstructured text, subsequently fusing these sources into a comprehensive enterprise ontology. To enable instruction-aligned reasoning, we propose a unified three-stage training pipeline: ontology instruction fine-tuning to improve structural understanding; text-ontology grounding to strengthen node semantic encoding; and multi-task instruction tuning on ontology-language pairs with curriculum learning to enhance semantic reasoning and generation. We also construct comprehensive training and evaluation datasets covering diverse ontology reasoning tasks. On this benchmark, our 4B-parameter LOM achieves 89.47% accuracy and outperforms DeepSeek-V3.2 on complex graph reasoning, indicating effective fusion of ontology structure and language.
Related papers
- SLiNT: Structure-aware Language Model with Injection and Contrastive Training for Knowledge Graph Completion [11.686307370683922]
Link prediction in knowledge graphs requires integrating structural information and semantic context to infer missing entities.<n>We propose SLiNT, a modular framework that injects knowledge-graph-derived structural context into a frozen backbone with lightweight LoRA-based adaptation for robust link prediction.<n>Experiments on WN18RR and FB15k-237 show that SLiNT achieves superior or competitive performance compared with both embedding-based and generation-based baselines.
arXiv Detail & Related papers (2025-09-08T10:36:49Z) - Knowledge Graph-Infused Fine-Tuning for Structured Reasoning in Large Language Models [41.59092188743925]
It proposes a fine-tuning algorithm framework based on knowledge graph injection.<n>It builds on pretrained language models and introduces structured graph information for auxiliary learning.<n>It demonstrates better semantic consistency and contextual logic modeling in scenarios involving structural reasoning and entity extraction.
arXiv Detail & Related papers (2025-08-20T04:52:12Z) - OneEval: Benchmarking LLM Knowledge-intensive Reasoning over Diverse Knowledge Bases [38.58409057214189]
textbftextscOneEval is a benchmark to assess the knowledge-intensive reasoning capabilities of Large Language Models (LLMs)<n>textscOneEval comprises 4,019 carefully curated instances and includes a challenging subset, textscOneEvaltextsubscriptHard, consisting of 1,285 particularly difficult cases.<n>We release the textscOneEval datasets, evaluation scripts, and baseline results publicly, accompanied by a leaderboard to facilitate ongoing advancements in structured knowledge reasoning.
arXiv Detail & Related papers (2025-06-14T17:16:05Z) - Beyond Chunking: Discourse-Aware Hierarchical Retrieval for Long Document Question Answering [51.7493726399073]
We present a discourse-aware hierarchical framework to enhance long document question answering.<n>The framework involves three key innovations: specialized discourse parsing for lengthy documents, LLM-based enhancement of discourse relation nodes, and structure-guided hierarchical retrieval.
arXiv Detail & Related papers (2025-05-26T14:45:12Z) - A Graph Perspective to Probe Structural Patterns of Knowledge in Large Language Models [52.52824699861226]
Large language models have been extensively studied as neural knowledge bases for their knowledge access, editability, reasoning, and explainability.<n>We quantify the knowledge of LLMs at both the triplet and entity levels, and analyze how it relates to graph structural properties such as node degree.
arXiv Detail & Related papers (2025-05-25T19:34:15Z) - StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization [94.31508613367296]
Retrieval-augmented generation (RAG) is a key means to effectively enhance large language models (LLMs)
We propose StructRAG, which can identify the optimal structure type for the task at hand, reconstruct original documents into this structured format, and infer answers based on the resulting structure.
Experiments show that StructRAG achieves state-of-the-art performance, particularly excelling in challenging scenarios.
arXiv Detail & Related papers (2024-10-11T13:52:44Z) - GIVE: Structured Reasoning of Large Language Models with Knowledge Graph Inspired Veracity Extrapolation [108.2008975785364]
Graph Inspired Veracity Extrapolation (GIVE) is a novel reasoning method that merges parametric and non-parametric memories to improve accurate reasoning with minimal external input.<n>GIVE guides the LLM agent to select the most pertinent expert data (observe), engage in query-specific divergent thinking (reflect), and then synthesize this information to produce the final output (speak)
arXiv Detail & Related papers (2024-10-11T03:05:06Z) - Contextualization Distillation from Large Language Model for Knowledge
Graph Completion [51.126166442122546]
We introduce the Contextualization Distillation strategy, a plug-in-and-play approach compatible with both discriminative and generative KGC frameworks.
Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments.
Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
arXiv Detail & Related papers (2024-01-28T08:56:49Z) - Unifying Structure Reasoning and Language Model Pre-training for Complex
Reasoning [26.811507121199323]
This paper proposes a unified learning framework that combines explicit structure reasoning and language pre-training to endow PLMs with the structure reasoning skill.
It first identifies several elementary structures within contexts to construct structured queries and performs step-by-step reasoning along the queries to identify the answer entity.
Experimental results on four datasets demonstrate that the proposed model achieves significant improvements in complex reasoning tasks involving diverse structures.
arXiv Detail & Related papers (2023-01-21T08:18:11Z) - Schema-aware Reference as Prompt Improves Data-Efficient Knowledge Graph
Construction [57.854498238624366]
We propose a retrieval-augmented approach, which retrieves schema-aware Reference As Prompt (RAP) for data-efficient knowledge graph construction.
RAP can dynamically leverage schema and knowledge inherited from human-annotated and weak-supervised data as a prompt for each sample.
arXiv Detail & Related papers (2022-10-19T16:40:28Z) - Joint Language Semantic and Structure Embedding for Knowledge Graph
Completion [66.15933600765835]
We propose to jointly embed the semantics in the natural language description of the knowledge triplets with their structure information.
Our method embeds knowledge graphs for the completion task via fine-tuning pre-trained language models.
Our experiments on a variety of knowledge graph benchmarks have demonstrated the state-of-the-art performance of our method.
arXiv Detail & Related papers (2022-09-19T02:41:02Z) - A Knowledge-Enhanced Adversarial Model for Cross-lingual Structured
Sentiment Analysis [31.05169054736711]
Cross-lingual structured sentiment analysis task aims to transfer the knowledge from source language to target one.
We propose a Knowledge-Enhanced Adversarial Model (textttKEAM) with both implicit distributed and explicit structural knowledge.
We conduct experiments on five datasets and compare textttKEAM with both the supervised and unsupervised methods.
arXiv Detail & Related papers (2022-05-31T03:07:51Z) - ERICA: Improving Entity and Relation Understanding for Pre-trained
Language Models via Contrastive Learning [97.10875695679499]
We propose a novel contrastive learning framework named ERICA in pre-training phase to obtain a deeper understanding of the entities and their relations in text.
Experimental results demonstrate that our proposed ERICA framework achieves consistent improvements on several document-level language understanding tasks.
arXiv Detail & Related papers (2020-12-30T03:35:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.