Do LLMs Dream of Ontologies?
- URL: http://arxiv.org/abs/2401.14931v2
- Date: Sun, 02 Feb 2025 13:51:02 GMT
- Title: Do LLMs Dream of Ontologies?
- Authors: Marco Bombieri, Paolo Fiorini, Simone Paolo Ponzetto, Marco Rospocher
- Abstract summary: Large Language Models (LLMs) have demonstrated remarkable performance across diverse natural language processing tasks.
This paper investigates the extent to which general-purpose LLMs correctly reproduce concept identifier (ID)-label associations from publicly available resources.
- Score: 13.776194387957617
- Abstract: Large Language Models (LLMs) have demonstrated remarkable performance across diverse natural language processing tasks, yet their ability to memorize structured knowledge remains underexplored. In this paper, we investigate the extent to which general-purpose pre-trained LLMs retain and correctly reproduce concept identifier (ID)-label associations from publicly available ontologies. We conduct a systematic evaluation across multiple ontological resources, including the Gene Ontology, Uberon, Wikidata, and ICD-10, using LLMs such as Pythia-12B, Gemini-1.5-Flash, GPT-3.5, and GPT-4. Our findings reveal that only a small fraction of ontological concepts is accurately memorized, with GPT-4 demonstrating the highest performance. To understand why certain concepts are memorized more effectively than others, we analyze the relationship between memorization accuracy and concept popularity on the Web. Our results indicate a strong correlation between the frequency of a concept's occurrence online and the likelihood of accurately retrieving its ID from the label. This suggests that LLMs primarily acquire such knowledge through indirect textual exposure rather than directly from structured ontological resources. Furthermore, we introduce new metrics to quantify prediction invariance, demonstrating that the stability of model responses across variations in prompt language and temperature settings can serve as a proxy for estimating memorization robustness.
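The evaluation described in the abstract can be pictured as a simple probe: give the model an ontology label, ask it for the corresponding concept ID, score exact matches, and measure prediction invariance as the stability of the answer across sampling settings. Below is a minimal sketch of that idea, assuming an OpenAI-style chat client, a toy pair of Gene Ontology terms, and an illustrative prompt template; it is not the paper's exact protocol, and all names and prompts here are assumptions for illustration.

```python
# Minimal sketch of an ID-label memorization probe and an invariance proxy.
# Assumptions (not from the paper): an OpenAI-style client, a toy set of
# Gene Ontology label/ID pairs, and an illustrative prompt template.
from collections import Counter
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Tiny illustrative gold set; a real experiment would load full ontologies
# such as the Gene Ontology, Uberon, Wikidata, or ICD-10.
GOLD = {
    "mitochondrion": "GO:0005739",
    "nucleus": "GO:0005634",
}

def ask_for_id(label: str, temperature: float = 0.0) -> str:
    """Ask the model to map an ontology label to its concept ID."""
    resp = client.chat.completions.create(
        model="gpt-4",
        temperature=temperature,
        messages=[{
            "role": "user",
            "content": f"Return only the Gene Ontology ID for the term '{label}'.",
        }],
    )
    return resp.choices[0].message.content.strip()

def memorization_accuracy(gold: dict[str, str]) -> float:
    """Fraction of labels whose predicted ID exactly matches the gold ID."""
    hits = sum(ask_for_id(label) == concept_id for label, concept_id in gold.items())
    return hits / len(gold)

def prediction_invariance(label: str, temperatures=(0.0, 0.7, 1.0)) -> float:
    """Stability proxy: share of queries returning the modal answer across
    temperature settings (1.0 = fully stable)."""
    answers = [ask_for_id(label, t) for t in temperatures]
    modal_count = Counter(answers).most_common(1)[0][1]
    return modal_count / len(answers)

if __name__ == "__main__":
    print("memorization accuracy:", memorization_accuracy(GOLD))
    print("invariance('nucleus'):", prediction_invariance("nucleus"))
```

The same invariance computation can be repeated with the prompt translated into different languages, mirroring the paper's use of prompt-language variation as a second axis of stability.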
Related papers
- Analyzing Memorization in Large Language Models through the Lens of Model Attribution [11.295483963637217]
Large Language Models (LLMs) are prevalent in modern applications but often memorize training data, leading to privacy breaches and copyright issues.
We investigate memorization from an architectural lens by analyzing how attention modules at different layers impact its memorization and generalization.
arXiv Detail & Related papers (2025-01-09T09:00:32Z) - Detecting Memorization in Large Language Models [0.0]
Large language models (LLMs) have achieved impressive results in natural language processing but are prone to memorizing portions of their training data.
Traditional methods for detecting memorization rely on output probabilities or loss functions.
We introduce an analytical method that precisely detects memorization by examining neuron activations within the LLM.
arXiv Detail & Related papers (2024-12-02T00:17:43Z) - Undesirable Memorization in Large Language Models: A Survey [5.659933808910005]
We present a Systematization of Knowledge (SoK) on the topic of memorization in Large Language Models (LLMs).
Memorization is the effect that a model tends to store and reproduce phrases or passages from the training data.
We discuss the metrics and methods used to measure memorization, followed by an analysis of the factors that contribute to the memorization phenomenon.
arXiv Detail & Related papers (2024-10-03T16:34:46Z) - Large Language Models are Limited in Out-of-Context Knowledge Reasoning [65.72847298578071]
Large Language Models (LLMs) possess extensive knowledge and strong capabilities in performing in-context reasoning.
This paper focuses on a significant aspect of out-of-context reasoning: Out-of-Context Knowledge Reasoning (OCKR), which combines multiple pieces of knowledge to infer new knowledge.
arXiv Detail & Related papers (2024-06-11T15:58:59Z) - RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition [78.97487780589574]
Multimodal Large Language Models (MLLMs) struggle to classify fine-grained categories.
This paper introduces a Retrieving And Ranking augmented method for MLLMs.
Our proposed approach not only addresses the inherent limitations in fine-grained recognition but also preserves the model's comprehensive knowledge base.
arXiv Detail & Related papers (2024-03-20T17:59:55Z) - FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition [56.76951887823882]
Large language models (LLMs) are primarily evaluated by overall performance on various text understanding and generation tasks.
We present FAC$^2$E, a framework for Fine-grAined and Cognition-grounded LLMs' Capability Evaluation.
arXiv Detail & Related papers (2024-02-29T21:05:37Z) - Improving Language Models Meaning Understanding and Consistency by Learning Conceptual Roles from Dictionary [65.268245109828]
Non-human-like behaviour of contemporary pre-trained language models (PLMs) is a leading cause of their diminished trustworthiness.
A striking phenomenon is the generation of inconsistent predictions, which produces logically contradictory results.
We propose a practical approach that alleviates the inconsistent-behaviour issue by improving the PLMs' awareness of conceptual roles.
arXiv Detail & Related papers (2023-10-24T06:15:15Z) - EpiK-Eval: Evaluation for Language Models as Epistemic Models [16.485951373967502]
We introduce EpiK-Eval, a novel question-answering benchmark tailored to evaluate LLMs' proficiency in formulating a coherent and consistent knowledge representation from segmented narratives.
We argue that these shortcomings stem from the intrinsic nature of prevailing training objectives.
The findings from this study offer insights for developing more robust and reliable LLMs.
arXiv Detail & Related papers (2023-10-23T21:15:54Z) - Exploring Memorization in Fine-tuned Language Models [53.52403444655213]
We conduct the first comprehensive analysis to explore language models' memorization during fine-tuning across tasks.
Our studies with open-source and our own fine-tuned LMs across various tasks indicate that memorization varies strongly across different fine-tuning tasks.
We provide an intuitive explanation of this task disparity via sparse coding theory and unveil a strong correlation between memorization and attention score distribution.
arXiv Detail & Related papers (2023-10-10T15:41:26Z) - Quantifying and Analyzing Entity-level Memorization in Large Language Models [4.59914731734176]
Large language models (LLMs) have been proven capable of memorizing their training data.
Privacy risks arising from memorization have attracted increasing attention.
We propose a fine-grained, entity-level definition to quantify memorization with conditions and metrics closer to real-world scenarios.
arXiv Detail & Related papers (2023-08-30T03:06:47Z) - Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks [90.11273439036455]
Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks.
We propose Knowledge-Augmented Reasoning Distillation (KARD), a novel method that fine-tunes small LMs to generate rationales from LLMs with augmented knowledge retrieved from an external knowledge base.
We empirically show that KARD significantly improves the performance of small T5 and GPT models on challenging knowledge-intensive reasoning datasets.
arXiv Detail & Related papers (2023-05-28T13:00:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.