Meronymic Ontology Extraction via Large Language Models
- URL: http://arxiv.org/abs/2510.13839v2
- Date: Sun, 09 Nov 2025 22:44:44 GMT
- Title: Meronymic Ontology Extraction via Large Language Models
- Authors: Dekai Zhang, Simone Conia, Antonio Rago,
- Abstract summary: Ontologies have become essential in today's digital age as a way of organising the vast amount of readily available unstructured text. In this paper, we develop a fully-automated method of extracting meronymies from texts.
- Score: 16.08771514313186
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ontologies have become essential in today's digital age as a way of organising the vast amount of readily available unstructured text. In providing formal structure to this information, ontologies have immense value and application across various domains, e.g., e-commerce, where countless product listings necessitate proper product organisation. However, the manual construction of these ontologies is a time-consuming, expensive and laborious process. In this paper, we harness the recent advancements in large language models (LLMs) to develop a fully-automated method of extracting product ontologies, in the form of meronymies, from raw review texts. We demonstrate that the ontologies produced by our method surpass an existing, BERT-based baseline when evaluated using an LLM-as-a-judge. Our investigation provides the groundwork for LLMs to be used more generally in (product or otherwise) ontology extraction.
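To make the abstract's idea concrete, the following is a minimal sketch of LLM-based meronymy (part-whole) extraction. The prompt-free stub reply, the "whole -> part" output format, and both helper functions are illustrative assumptions, not the paper's actual pipeline.

```python
# Minimal sketch of meronymy (part-whole) extraction from an LLM reply.
# The line format "whole -> part" and the stubbed reply are assumptions,
# standing in for a real LLM call over raw review text.

def parse_meronym_pairs(reply: str) -> list[tuple[str, str]]:
    """Parse 'whole -> part' lines from an LLM reply into (whole, part) tuples."""
    pairs = []
    for line in reply.splitlines():
        if "->" in line:
            whole, part = (s.strip() for s in line.split("->", 1))
            if whole and part:
                pairs.append((whole, part))
    return pairs

def build_meronymy(pairs: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Group extracted parts under each whole, yielding a simple meronymy."""
    ontology: dict[str, set[str]] = {}
    for whole, part in pairs:
        ontology.setdefault(whole, set()).add(part)
    return ontology

# Stubbed reply standing in for an LLM's answer about a laptop review.
reply = "laptop -> battery\nlaptop -> keyboard\nkeyboard -> keycap"
meronymy = build_meronymy(parse_meronym_pairs(reply))
```

Deduplicating via sets reflects that the same part is typically mentioned across many reviews; a real pipeline would also need to normalise surface forms (e.g. "keys" vs "keyboard").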
Related papers
- From Prompt to Graph: Comparing LLM-Based Information Extraction Strategies in Domain-Specific Ontology Development [14.475791894420666]
Ontologies are essential for structuring domain knowledge, improving accessibility, sharing, and reuse. Traditional ontologies rely on manual annotation and conventional natural language processing (NLP) techniques. The rise of Large Language Models (LLMs) offers new possibilities for automating knowledge extraction. This study investigates three LLM-based approaches, namely a pre-trained LLM-driven method, an in-context learning (ICL) method and a fine-tuning method, to extract terms and relations from domain-specific texts.
arXiv Detail & Related papers (2026-01-31T12:50:23Z) - Scaling Beyond Context: A Survey of Multimodal Retrieval-Augmented Generation for Document Understanding [61.36285696607487]
Document understanding is critical for applications from financial analysis to scientific discovery. Current approaches, whether OCR-based pipelines feeding Large Language Models (LLMs) or native Multimodal LLMs (MLLMs), face key limitations. Retrieval-Augmented Generation (RAG) helps ground models in external data, but documents' multimodal nature, combining text, tables, charts, and layout, demands a more advanced paradigm: Multimodal RAG.
arXiv Detail & Related papers (2025-10-17T02:33:16Z) - LLM-based Triplet Extraction for Automated Ontology Generation in Software Engineering Standards [0.0]
Software engineering standards (SES) consist of long, unstructured text (with high noise) and paragraphs with domain-specific terms. This work proposes an open-source large language model (LLM)-assisted approach to relation triplet extraction (RTE) for SES.
arXiv Detail & Related papers (2025-08-29T17:14:54Z) - Beyond Isolated Dots: Benchmarking Structured Table Construction as Deep Knowledge Extraction [80.88654868264645]
AOE (Arranged and Organized Extraction) is a benchmark designed to evaluate the ability of large language models to comprehend fragmented documents. AOE includes 11 carefully crafted tasks across three diverse domains, requiring models to generate context-specific schemas tailored to varied input queries. Results show that even the most advanced models struggle significantly.
arXiv Detail & Related papers (2025-07-22T06:37:51Z) - Evaluating Large Language Models for Real-World Engineering Tasks [75.97299249823972]
This paper introduces a curated database comprising over 100 questions derived from authentic, production-oriented engineering scenarios. Using this dataset, we evaluate four state-of-the-art Large Language Models (LLMs). Our results show that LLMs demonstrate strengths in basic temporal and structural reasoning but struggle significantly with abstract reasoning, formal modeling, and context-sensitive engineering logic.
arXiv Detail & Related papers (2025-05-12T14:05:23Z) - End-to-End Ontology Learning with Large Language Models [11.755755139228219]
Large language models (LLMs) have been applied to solve various subtasks of ontology learning.
We address this gap with OLLM, a general and scalable method for building the taxonomic backbone of an ontology from scratch.
In contrast to standard metrics, our metrics use deep learning techniques to define more robust structural distance measures between graphs.
Our model can be effectively adapted to new domains, like arXiv, needing only a small number of training examples.
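As a toy illustration of what a "taxonomic backbone" is (OLLM's actual construction and its graph-distance metrics are far more involved), the sketch below assembles subclass edges into a parent-to-children map and identifies the root concepts; the example edges are invented for illustration.

```python
# Toy sketch: a taxonomic backbone represented as (parent, child) subclass edges.
# The edges below are illustrative; they are not OLLM's output.

def backbone(edges: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Build a parent -> children adjacency map from subclass edges."""
    tree: dict[str, list[str]] = {}
    for parent, child in edges:
        tree.setdefault(parent, []).append(child)
    return tree

def roots(edges: list[tuple[str, str]]) -> set[str]:
    """Concepts that never appear as a child are the taxonomy's roots."""
    parents = {p for p, _ in edges}
    children = {c for _, c in edges}
    return parents - children

edges = [("science", "physics"), ("science", "biology"), ("physics", "optics")]
tree = backbone(edges)
top = roots(edges)
```

Representing the taxonomy as a plain edge list is what makes graph-based structural distance measures, of the kind the abstract mentions, applicable for evaluation.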
arXiv Detail & Related papers (2024-10-31T02:52:39Z) - Automating Intervention Discovery from Scientific Literature: A Progressive Ontology Prompting and Dual-LLM Framework [56.858564736806414]
This paper proposes a novel framework leveraging large language models (LLMs) to identify interventions in scientific literature. Our approach successfully identified 2,421 interventions from a corpus of 64,177 research articles in the speech-language pathology domain.
arXiv Detail & Related papers (2024-08-20T16:42:23Z) - Large language models as oracles for instantiating ontologies with domain-specific knowledge [0.0]
We propose a domain-independent approach to automatically instantiate ontologies with domain-specific knowledge. Our method queries the LLM multiple times and generates instances for classes and properties from its replies. Experimentally, our method achieves performance that is up to five times higher than the state-of-the-art.
arXiv Detail & Related papers (2024-04-05T14:04:07Z) - LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges.
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on roofline model.
This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
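The roofline framework the survey introduces reduces to one formula: attainable throughput is the minimum of peak compute and memory bandwidth times arithmetic intensity (FLOPs per byte moved). A small sketch, with hardware numbers chosen purely for illustration:

```python
# Roofline model: attainable FLOP/s is capped either by peak compute or by
# memory bandwidth times arithmetic intensity (FLOPs per byte).
# The hardware figures below are illustrative placeholders, not real specs.

def attainable_flops(peak_flops: float, bandwidth: float, intensity: float) -> float:
    """Attainable FLOP/s for a kernel with the given arithmetic intensity."""
    return min(peak_flops, bandwidth * intensity)

PEAK = 100e12  # 100 TFLOP/s peak compute (placeholder)
BW = 2e12      # 2 TB/s memory bandwidth (placeholder)

# Ridge point: the intensity at which a kernel stops being memory-bound.
ridge = PEAK / BW  # 50 FLOPs/byte here

low = attainable_flops(PEAK, BW, 1.0)     # memory-bound regime
high = attainable_flops(PEAK, BW, 200.0)  # compute-bound regime
```

This is the sense in which such a framework "identifies the bottlenecks": autoregressive LLM decoding typically has low arithmetic intensity, placing it on the memory-bound side of the ridge point.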
arXiv Detail & Related papers (2024-02-26T07:33:05Z) - Large Language Models for Generative Information Extraction: A Survey [89.71273968283616]
Large Language Models (LLMs) have demonstrated remarkable capabilities in text understanding and generation.
We present an extensive overview by categorizing these works in terms of various IE subtasks and techniques.
We empirically analyze the most advanced methods and discover the emerging trend of IE tasks with LLMs.
arXiv Detail & Related papers (2023-12-29T14:25:22Z) - Automatic Product Ontology Extraction from Textual Reviews [12.235907063179278]
We show that the ontologies generated by our method outperform hand-crafted ontologies (WordNet) and those extracted by existing methods (Text2Onto and COMET) in several, diverse settings.
Our method is better able to determine recommended products based on their reviews, as an alternative to using Amazon's standard score aggregations.
arXiv Detail & Related papers (2021-05-23T16:06:38Z) - Progressive Generation of Long Text with Pretrained Language Models [83.62523163717448]
Large-scale language models (LMs) pretrained on massive corpora of text, such as GPT-2, are powerful open-domain text generators.
It is still challenging for such models to generate coherent long passages of text, especially when the models are fine-tuned to the target domain on a small corpus.
We propose a simple but effective method of generating text in a progressive manner, inspired by generating images from low to high resolution.
arXiv Detail & Related papers (2020-06-28T21:23:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.