Do LLMs Dream of Ontologies?
- URL: http://arxiv.org/abs/2401.14931v1
- Date: Fri, 26 Jan 2024 15:10:23 GMT
- Title: Do LLMs Dream of Ontologies?
- Authors: Marco Bombieri, Paolo Fiorini, Simone Paolo Ponzetto, Marco Rospocher
- Abstract summary: Large language models (LLMs) have recently revolutionized automated text understanding and generation.
This paper investigates whether and to what extent general-purpose pre-trained LLMs have memorized information from known ontologies.
- Score: 15.049502693786698
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have recently revolutionized automated text
understanding and generation. The performance of these models relies on the
high number of parameters of the underlying neural architectures, which allows
LLMs to memorize part of the vast quantity of data seen during the training.
This paper investigates whether and to what extent general-purpose pre-trained
LLMs have memorized information from known ontologies. Our results show that
LLMs partially know ontologies: they can, and do indeed, memorize concepts from
ontologies mentioned in the text, but the level of memorization of their
concepts seems to vary proportionally to their popularity on the Web, the
primary source of their training material. We additionally propose new metrics
to estimate the degree of memorization of ontological information in LLMs by
measuring the consistency of the output produced across different prompt
repetitions, query languages, and degrees of determinism.
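As a rough illustration of the consistency idea behind such metrics, the sketch below repeats the same ontology-identifier query several times and scores how often the model agrees with its own majority answer; the query interface (query_fn), the prompt wording, and the agreement-plus-correctness score are assumptions made for the sketch, not the paper's actual protocol or formulas.

```python
from collections import Counter

def consistency_score(query_fn, prompt, n_repetitions=10):
    """Fraction of repetitions agreeing with the most frequent answer.

    query_fn is a hypothetical callable that sends `prompt` to an LLM and
    returns its answer (e.g. an ontology concept identifier).
    """
    answers = [query_fn(prompt) for _ in range(n_repetitions)]
    majority_answer, count = Counter(answers).most_common(1)[0]
    return count / n_repetitions, majority_answer

def memorization_estimate(query_fn, concept_label, gold_id, n_repetitions=10):
    """Combine self-consistency with correctness against the known ontology ID."""
    prompt = f"What is the ontology identifier of the concept '{concept_label}'?"
    consistency, answer = consistency_score(query_fn, prompt, n_repetitions)
    return {"consistency": consistency, "majority_answer": answer, "correct": answer == gold_id}

# Toy usage with a deterministic stub standing in for a real LLM call:
print(memorization_estimate(lambda p: "GO:0008150", "biological_process", "GO:0008150"))
```

A higher agreement fraction across repetitions (and across query languages or sampling temperatures, which this toy version ignores) would be read as stronger evidence of memorization.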
Related papers
- Analyzing Memorization in Large Language Models through the Lens of Model Attribution [11.295483963637217]
Large Language Models (LLMs) are prevalent in modern applications but often memorize training data, leading to privacy breaches and copyright issues.
We investigate memorization from an architectural lens by analyzing how attention modules at different layers impact memorization and generalization.
arXiv Detail & Related papers (2025-01-09T09:00:32Z)
- Detecting Memorization in Large Language Models [0.0]
Large language models (LLMs) have achieved impressive results in natural language processing but are prone to memorizing portions of their training data.
Traditional methods for detecting memorization rely on output probabilities or loss functions.
We introduce an analytical method that precisely detects memorization by examining neuron activations within the LLM.
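A minimal sketch of the general idea of examining activations: record intermediate outputs with forward hooks and summarize them per layer. The toy model, the hooks, and the mean-magnitude statistic are illustrative assumptions, not the paper's detector.

```python
import torch
import torch.nn as nn

# Toy stand-in for an LLM block; a real analysis would hook the layers of a pretrained model.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 16))

activations = {}

def make_hook(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

for i, layer in enumerate(model):
    layer.register_forward_hook(make_hook(f"layer_{i}"))

def activation_signature(x):
    """Run a forward pass and summarize per-layer activation magnitudes."""
    activations.clear()
    with torch.no_grad():
        model(x)
    return {name: out.abs().mean().item() for name, out in activations.items()}

# Contrasting signatures for text the model has seen verbatim vs. novel text is the
# kind of comparison an activation-based memorization detector builds on.
print(activation_signature(torch.randn(1, 16)))
```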
arXiv Detail & Related papers (2024-12-02T00:17:43Z)
- Understanding Ranking LLMs: A Mechanistic Analysis for Information Retrieval [20.353393773305672]
We employ a probing-based analysis to examine neuron activations in ranking LLMs.
Our study spans a broad range of feature categories, including lexical signals, document structure, query-document interactions, and complex semantic representations.
Our findings offer crucial insights for developing more transparent and reliable retrieval systems.
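Probing in this sense typically means fitting a simple classifier to predict a feature of interest from frozen activations; the synthetic activations and the logistic-regression probe below are an illustrative stand-in, not the paper's setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-ins for hidden activations of query-document pairs (n_pairs x hidden_dim) and a
# binary feature to probe for, e.g. "the query term appears verbatim in the document".
activations = rng.normal(size=(1000, 64))
feature = (activations[:, 0] + 0.5 * activations[:, 1] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(activations, feature, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# High held-out accuracy suggests the feature is linearly decodable from the activations.
print("probe accuracy:", probe.score(X_test, y_test))
```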
arXiv Detail & Related papers (2024-10-24T08:20:10Z)
- Undesirable Memorization in Large Language Models: A Survey [5.659933808910005]
We present a Systematization of Knowledge (SoK) on the topic of memorization in Large Language Models (LLMs).
Memorization is the effect that a model tends to store and reproduce phrases or passages from the training data.
We discuss the metrics and methods used to measure memorization, followed by an analysis of the factors that contribute to the memorization phenomenon.
arXiv Detail & Related papers (2024-10-03T16:34:46Z)
- Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data [76.90128359866462]
We introduce an extended concept of memorization, distributional memorization, which measures the correlation between the output probabilities and the pretraining data frequency.
This study demonstrates that memorization plays a larger role in simpler, knowledge-intensive tasks, while generalization is the key for harder, reasoning-based tasks.
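A minimal sketch of that kind of measurement, with made-up numbers and Spearman rank correlation as one reasonable choice of statistic (not necessarily the paper's):

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical per-example quantities: how often the relevant fact occurs in the
# pretraining corpus, and the probability the model assigns to the correct answer.
pretraining_frequency = np.array([3, 120, 5, 980, 45, 7, 2100, 15])
model_output_probability = np.array([0.02, 0.41, 0.05, 0.77, 0.30, 0.04, 0.88, 0.12])

# A strong positive rank correlation is the signature of distributional memorization:
# the model is most confident exactly where the pretraining data was most frequent.
rho, p_value = spearmanr(pretraining_frequency, model_output_probability)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3g})")
```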
arXiv Detail & Related papers (2024-07-20T21:24:40Z)
- Scaling Laws for Fact Memorization of Large Language Models [67.94080978627363]
We analyze the scaling laws for Large Language Models' fact knowledge and their behavior when memorizing different types of facts.
We find that LLMs' fact-knowledge capacity scales linearly with model size and follows a negative exponential law with training epochs.
Our findings reveal the capacity and characteristics of LLMs' fact knowledge learning, which provide directions for LLMs' fact knowledge augmentation.
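Read literally, that wording suggests capacity grows roughly linearly with parameter count and saturates exponentially with epochs; the functional form and coefficients below are purely illustrative assumptions, not the paper's fitted law.

```python
import math

def illustrative_fact_capacity(n_params, epochs, a=1e-3, k=0.5):
    """Toy capacity curve: linear in model size, saturating (negative exponential) in epochs.

    Only one possible reading of the summary's wording; not the paper's fitted scaling law.
    """
    return a * n_params * (1.0 - math.exp(-k * epochs))

for epochs in (1, 2, 4, 8):
    print(epochs, round(illustrative_fact_capacity(n_params=1_000_000, epochs=epochs)))
```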
arXiv Detail & Related papers (2024-06-22T03:32:09Z)
- Large Language Models are Limited in Out-of-Context Knowledge Reasoning [65.72847298578071]
Large Language Models (LLMs) possess extensive knowledge and strong capabilities in performing in-context reasoning.
This paper focuses on a significant aspect of out-of-context reasoning: Out-of-Context Knowledge Reasoning (OCKR), which combines multiple pieces of knowledge to infer new knowledge.
arXiv Detail & Related papers (2024-06-11T15:58:59Z)
- A Multi-Perspective Analysis of Memorization in Large Language Models [10.276594755936529]
Large Language Models (LLMs) show unprecedented performance in various fields.
LLMs can reproduce the very content used to train them.
This research comprehensively discusses memorization from various perspectives.
arXiv Detail & Related papers (2024-05-19T15:00:50Z)
- RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition [78.97487780589574]
Multimodal Large Language Models (MLLMs) excel at classifying fine-grained categories.
This paper introduces a Retrieving And Ranking augmented method for MLLMs.
Our proposed approach not only addresses the inherent limitations in fine-grained recognition but also preserves the model's comprehensive knowledge base.
arXiv Detail & Related papers (2024-03-20T17:59:55Z)
- Small Models are LLM Knowledge Triggers on Medical Tabular Prediction [39.78560996984352]
We propose SERSAL, a general self-prompting method based on synergy learning with small models.
We show that SERSAL attains substantial improvement compared to linguistic prompting methods.
arXiv Detail & Related papers (2024-03-03T17:35:52Z)
- FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition [56.76951887823882]
Large language models (LLMs) are primarily evaluated by overall performance on various text understanding and generation tasks.
We present FAC$^2$E, a framework for Fine-grAined and Cognition-grounded LLMs' Capability Evaluation.
arXiv Detail & Related papers (2024-02-29T21:05:37Z)
- Rethinking Interpretability in the Era of Large Language Models [76.1947554386879]
Large language models (LLMs) have demonstrated remarkable capabilities across a wide array of tasks.
The capability to explain in natural language allows LLMs to expand the scale and complexity of patterns that can be given to a human.
These new capabilities raise new challenges, such as hallucinated explanations and immense computational costs.
arXiv Detail & Related papers (2024-01-30T17:38:54Z)
- Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes to out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z)
- Enabling Large Language Models to Learn from Rules [99.16680531261987]
We are inspired by the observation that humans can learn new tasks or knowledge in another way: by learning from rules.
We propose rule distillation, which first uses the strong in-context abilities of LLMs to extract knowledge from textual rules.
Our experiments show that making LLMs learn from rules by our method is much more efficient than example-based learning in both sample size and generalization ability.
arXiv Detail & Related papers (2023-11-15T11:42:41Z)
- SoK: Memorization in General-Purpose Large Language Models [25.448127387943053]
Large Language Models (LLMs) are advancing at a remarkable pace, with myriad applications under development.
LLMs can memorize short secrets in the training data, but can also memorize concepts like facts or writing styles that can be expressed in text in many different ways.
We propose a taxonomy for memorization in LLMs that covers verbatim text, facts, ideas and algorithms, writing styles, distributional properties, and alignment goals.
arXiv Detail & Related papers (2023-10-24T14:25:53Z)
- Improving Language Models Meaning Understanding and Consistency by Learning Conceptual Roles from Dictionary [65.268245109828]
Non-human-like behaviour of contemporary pre-trained language models (PLMs) is a leading factor undermining their trustworthiness.
A striking phenomenon is the generation of inconsistent predictions, which produces contradictory results.
We propose a practical approach that alleviates the inconsistent-behaviour issue by improving PLMs' awareness of conceptual roles.
arXiv Detail & Related papers (2023-10-24T06:15:15Z)
- EpiK-Eval: Evaluation for Language Models as Epistemic Models [16.485951373967502]
We introduce EpiK-Eval, a novel question-answering benchmark tailored to evaluate LLMs' proficiency in formulating a coherent and consistent knowledge representation from segmented narratives.
We argue that these shortcomings stem from the intrinsic nature of prevailing training objectives.
The findings from this study offer insights for developing more robust and reliable LLMs.
arXiv Detail & Related papers (2023-10-23T21:15:54Z)
- Exploring Memorization in Fine-tuned Language Models [53.52403444655213]
We conduct the first comprehensive analysis to explore language models' memorization during fine-tuning across tasks.
Our studies with open-sourced and our own fine-tuned LMs across various tasks indicate that memorization presents a strong disparity among different fine-tuning tasks.
We provide an intuitive explanation of this task disparity via sparse coding theory and unveil a strong correlation between memorization and attention score distribution.
arXiv Detail & Related papers (2023-10-10T15:41:26Z)
- Quantifying and Analyzing Entity-level Memorization in Large Language Models [4.59914731734176]
Large language models (LLMs) have been proven capable of memorizing their training data.
Privacy risks arising from memorization have attracted increasing attention.
We propose a fine-grained, entity-level definition to quantify memorization with conditions and metrics closer to real-world scenarios.
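One way such an entity-level quantification could look in code, with a hypothetical generate_fn interface and a simple containment criterion standing in for the paper's actual conditions and metrics:

```python
def entity_memorization_rate(generate_fn, examples):
    """Fraction of entity-centric prompts whose continuation reproduces the training text.

    generate_fn: hypothetical callable mapping a prompt string to the model's continuation.
    examples: iterable of (entity, prefix, training_continuation) tuples.
    """
    examples = list(examples)
    hits = 0
    for entity, prefix, training_continuation in examples:
        continuation = generate_fn(prefix)
        # A stricter or softer criterion (edit distance, longest common substring) could be used.
        if training_continuation.strip() and training_continuation.strip() in continuation:
            hits += 1
    return hits / len(examples) if examples else 0.0
```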
arXiv Detail & Related papers (2023-08-30T03:06:47Z)
- Give Us the Facts: Enhancing Large Language Models with Knowledge Graphs for Fact-aware Language Modeling [34.59678835272862]
ChatGPT, a representative large language model (LLM), has gained considerable attention due to its powerful emergent abilities.
This paper proposes to enhance LLMs with knowledge graphs, developing knowledge graph-enhanced large language models (KGLLMs).
KGLLMs provide a solution to enhance LLMs' factual reasoning ability, opening up new avenues for LLM research.
arXiv Detail & Related papers (2023-06-20T12:21:06Z)
- Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks [90.11273439036455]
Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks.
We propose Knowledge-Augmented Reasoning Distillation (KARD), a novel method that fine-tunes small LMs to generate rationales from LLMs with augmented knowledge retrieved from an external knowledge base.
We empirically show that KARD significantly improves the performance of small T5 and GPT models on the challenging knowledge-intensive reasoning datasets.
arXiv Detail & Related papers (2023-05-28T13:00:00Z)
- RET-LLM: Towards a General Read-Write Memory for Large Language Models [53.288356721954514]
RET-LLM is a novel framework that equips large language models with a general write-read memory unit.
Inspired by Davidsonian semantics theory, we extract and save knowledge in the form of triplets.
Our framework exhibits robust performance in handling temporal-based question answering tasks.
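A minimal sketch of a write-read triplet memory in that spirit; the class name, exact-match lookup, and wildcard query are illustrative assumptions rather than the RET-LLM implementation.

```python
class TripletMemory:
    """Write-read memory that stores knowledge as (subject, relation, object) triplets."""

    def __init__(self):
        self.triplets = []

    def write(self, subject, relation, obj):
        self.triplets.append((subject, relation, obj))

    def read(self, subject=None, relation=None, obj=None):
        """Return stored triplets matching the non-None fields (None acts as a wildcard)."""
        return [
            t for t in self.triplets
            if (subject is None or t[0] == subject)
            and (relation is None or t[1] == relation)
            and (obj is None or t[2] == obj)
        ]

memory = TripletMemory()
memory.write("Alice", "works_for", "Acme")
print(memory.read(subject="Alice"))  # -> [("Alice", "works_for", "Acme")]
```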
arXiv Detail & Related papers (2023-05-23T17:53:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.