Large Language Models as Oracles for Ontology Alignment
- URL: http://arxiv.org/abs/2508.08500v1
- Date: Mon, 11 Aug 2025 22:16:20 GMT
- Title: Large Language Models as Oracles for Ontology Alignment
- Authors: Sviatoslav Lushnei, Dmytro Shumskyi, Severyn Shykula, Ernesto Jimenez-Ruiz, Artur d'Avila Garcez,
- Abstract summary: Ontology alignment plays a crucial role in integrating diverse data sources across domains. Human-in-the-loop alignment is essential in applications requiring very accurate mappings. We explore Large Language Models (LLMs) as an alternative to the domain expert.
- Score: 0.9786690381850356
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ontology alignment plays a crucial role in integrating diverse data sources across domains. A plethora of systems tackle the ontology alignment problem, yet challenges persist in producing high-quality correspondences among a set of input ontologies. A human-in-the-loop during the alignment process is essential in applications requiring very accurate mappings. User involvement is, however, expensive when dealing with large ontologies. In this paper, we explore the feasibility of using Large Language Models (LLMs) as an alternative to the domain expert. The use of the LLM focuses only on validating the subset of correspondences where an ontology alignment system is very uncertain. We have conducted an extensive evaluation over several matching tasks of the Ontology Alignment Evaluation Initiative (OAEI), analysing the performance of several state-of-the-art LLMs using different ontology-driven prompt templates. The LLM results are also compared against simulated Oracles with variable error rates.
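The validation workflow the abstract describes (ask an oracle only about the correspondences the matcher is uncertain about, and compare against a simulated oracle with a variable error rate) could be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the confidence thresholds, and the use of a gold label to drive the simulation are my own assumptions.

```python
import random

def simulated_oracle(is_correct: bool, error_rate: float, rng: random.Random) -> bool:
    """Return the oracle's verdict on a candidate mapping.

    With probability `error_rate` the oracle flips the true label,
    mimicking an imperfect domain expert (or an LLM standing in for one)."""
    verdict = is_correct
    if rng.random() < error_rate:
        verdict = not verdict
    return verdict

def validate_uncertain(candidates, low=0.4, high=0.7, error_rate=0.1, seed=0):
    """Keep confident mappings; send only the uncertain band to the oracle.

    `candidates` is a list of (mapping, system_confidence, gold_label)
    triples; the gold label is known here only to drive the simulation."""
    rng = random.Random(seed)
    accepted = []
    for mapping, conf, gold in candidates:
        if conf >= high:
            accepted.append(mapping)            # system is sure: accept directly
        elif conf > low and simulated_oracle(gold, error_rate, rng):
            accepted.append(mapping)            # uncertain: defer to the oracle
        # conf <= low: reject without spending an oracle call
    return accepted
```

The point of the thresholds is cost control: only mappings whose system confidence falls in the uncertain band trigger an (expensive) oracle query.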
Related papers
- Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs [78.09559830840595]
We present the first systematic study on quantizing diffusion-based language models. We identify the presence of activation outliers, characterized by abnormally large activation values. We implement state-of-the-art PTQ methods and conduct a comprehensive evaluation.
arXiv Detail & Related papers (2025-08-20T17:59:51Z) - Discrete Tokenization for Multimodal LLMs: A Comprehensive Survey [69.45421620616486]
This work presents the first structured taxonomy and analysis of discrete tokenization methods designed for large language models (LLMs). We categorize 8 representative VQ variants that span classical and modern paradigms and analyze their algorithmic principles, training dynamics, and integration challenges with LLM pipelines. We identify key challenges including codebook collapse, unstable gradient estimation, and modality-specific encoding constraints.
arXiv Detail & Related papers (2025-07-21T10:52:14Z) - Graft: Integrating the Domain Knowledge via Efficient Parameter Synergy for MLLMs [56.76586846269894]
Multimodal Large Language Models (MLLMs) have achieved success across various domains. Despite its importance, the study of knowledge sharing among domain-specific MLLMs remains largely underexplored. We propose a unified parameter integration framework that enables modular composition of expert capabilities.
arXiv Detail & Related papers (2025-06-30T15:07:41Z) - Evaluating Large Language Models for Real-World Engineering Tasks [75.97299249823972]
This paper introduces a curated database comprising over 100 questions derived from authentic, production-oriented engineering scenarios. Using this dataset, we evaluate four state-of-the-art Large Language Models (LLMs). Our results show that LLMs demonstrate strengths in basic temporal and structural reasoning but struggle significantly with abstract reasoning, formal modeling, and context-sensitive engineering logic.
arXiv Detail & Related papers (2025-05-12T14:05:23Z) - LLMs4Life: Large Language Models for Ontology Learning in Life Sciences [10.658387847149195]
Existing Large Language Models (LLMs) struggle to generate ontologies with multiple hierarchical levels, rich interconnections, and comprehensive coverage. We extend NeOn-GPT for ontology learning using LLMs with advanced prompt engineering techniques. Our evaluation shows the viability of LLMs for ontology learning in specialized domains, providing solutions to longstanding limitations in model performance and scalability.
arXiv Detail & Related papers (2024-12-02T23:31:52Z) - DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph [70.79413606968814]
We introduce Dynamic Evaluation of LLMs via Adaptive Reasoning Graph Evolvement (DARG) to dynamically extend current benchmarks with controlled complexity and diversity.
Specifically, we first extract the reasoning graphs of data points in current benchmarks and then perturb the reasoning graphs to generate novel testing data.
Such newly generated test samples can have different levels of complexity while maintaining linguistic diversity similar to the original benchmarks.
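The DARG pipeline summarized above (extract a reasoning graph, then perturb it to obtain harder test data) could be illustrated with a single perturbation step. This is a hedged sketch of one plausible perturbation, not the paper's method: the adjacency-dict representation and the edge-splitting operation are my own assumptions.

```python
def perturb_reasoning_graph(graph, edge, new_node):
    """Insert an intermediate step on one edge of a reasoning graph.

    `graph` maps each node to the list of nodes it supports; splitting
    the edge (u, v) into u -> new_node -> v adds one reasoning hop,
    increasing complexity while preserving the original conclusion."""
    u, v = edge
    g = {n: list(succ) for n, succ in graph.items()}   # copy, don't mutate input
    g[u] = [new_node if s == v else s for s in g[u]]   # redirect u's edge
    g[new_node] = [v]                                  # new node supports v
    return g
```

Repeating such perturbations with different edges and node contents yields test samples of controlled depth, in the spirit of the benchmark extension the summary describes.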
arXiv Detail & Related papers (2024-06-25T04:27:53Z) - Towards Complex Ontology Alignment using Large Language Models [1.3218260503808055]
Ontology alignment is a critical process in the Semantic Web for detecting relationships between different labels and content.
Recent advancements in Large Language Models (LLMs) present new opportunities for enhancing ontology engineering practices.
This paper investigates the application of LLM technologies to tackle the complex alignment challenge.
arXiv Detail & Related papers (2024-04-16T07:13:22Z) - Large language models as oracles for instantiating ontologies with domain-specific knowledge [0.0]
We propose a domain-independent approach to automatically instantiate ontologies with domain-specific knowledge. Our method queries the LLM multiple times and generates instances for classes and properties from its replies. Experimentally, our method achieves results up to five times higher than the state-of-the-art.
arXiv Detail & Related papers (2024-04-05T14:04:07Z) - Characterization of Large Language Model Development in the Datacenter [55.9909258342639]
Large Language Models (LLMs) have presented impressive performance across several transformative tasks.
However, it is non-trivial to efficiently utilize large-scale cluster resources to develop LLMs.
We present an in-depth characterization study of a six-month LLM development workload trace collected from our GPU datacenter Acme.
arXiv Detail & Related papers (2024-03-12T13:31:14Z) - Dividing the Ontology Alignment Task with Semantic Embeddings and Logic-based Modules [15.904000789557486]
This paper presents an approach that combines a neural embedding model and logic-based modules to accurately divide an input ontology matching task into smaller and more tractable tasks.
The results are encouraging and suggest that the proposed method is adequate in practice and can be integrated within the workflow of systems unable to cope with very large ontologies.
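The embedding-based division of a matching task could be sketched as a nearest-centroid partition: entities from both ontologies that land in the same cell of the embedding space form one subtask. This is a deliberately simplified stand-in (the paper combines embeddings with logic-based modules, which are omitted here); the fixed centroids and function name are my own assumptions.

```python
def divide_matching_task(src_vecs, tgt_vecs, centroids):
    """Split a large matching task into smaller subtasks.

    Each entity embedding (from the source or target ontology) is
    assigned to its nearest centroid; one subtask per centroid is
    returned as (source_indices, target_indices)."""
    def nearest(vec):
        # index of the centroid with the smallest squared distance
        return min(range(len(centroids)),
                   key=lambda k: sum((a - b) ** 2 for a, b in zip(vec, centroids[k])))

    tasks = [([], []) for _ in centroids]
    for i, v in enumerate(src_vecs):
        tasks[nearest(v)][0].append(i)
    for j, v in enumerate(tgt_vecs):
        tasks[nearest(v)][1].append(j)
    return tasks
```

Each subtask can then be handed to a matcher that would not scale to the full cross-product of the two input ontologies.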
arXiv Detail & Related papers (2020-02-25T14:44:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.