Think or Remember? Detecting and Directing LLMs Towards Memorization or Generalization
- URL: http://arxiv.org/abs/2412.18497v1
- Date: Tue, 24 Dec 2024 15:28:56 GMT
- Title: Think or Remember? Detecting and Directing LLMs Towards Memorization or Generalization
- Authors: Yi-Fu Fu, Yu-Chieh Tu, Tzu-Ling Cheng, Cheng-Yu Lin, Yi-Ting Yang, Heng-Yi Liu, Keng-Te Liao, Da-Cheng Juan, Shou-De Lin,
- Abstract summary: We explore the foundational mechanisms of memorization and generalization in Large Language Models (LLMs)
Our investigation serves as a case study leveraging specially designed datasets and experimental-scale LLMs.
- Score: 10.11351265687309
- License:
- Abstract: In this paper, we explore the foundational mechanisms of memorization and generalization in Large Language Models (LLMs), inspired by the functional specialization observed in the human brain. Our investigation serves as a case study leveraging specially designed datasets and experimental-scale LLMs to lay the groundwork for understanding these behaviors. Specifically, we aim to first enable LLMs to exhibit both memorization and generalization by training with the designed dataset, then (a) examine whether LLMs exhibit neuron-level spatial differentiation for memorization and generalization, (b) predict these behaviors using model internal representations, and (c) steer the behaviors through inference-time interventions. Our findings reveal that neuron-wise differentiation of memorization and generalization is observable in LLMs, and targeted interventions can successfully direct their behavior.
Related papers
- Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search [57.28671084993782]
Large language models (LLMs) have demonstrated remarkable reasoning capabilities across diverse domains.
Recent studies have shown that increasing test-time computation enhances LLMs' reasoning capabilities.
We propose a two-stage training paradigm: 1) a small-scale format tuning stage to internalize the COAT reasoning format and 2) a large-scale self-improvement stage leveraging reinforcement learning.
arXiv Detail & Related papers (2025-02-04T17:26:58Z) - Analyzing Memorization in Large Language Models through the Lens of Model Attribution [11.295483963637217]
Large Language Models (LLMs) are prevalent in modern applications but often memorize training data, leading to privacy breaches and copyright issues.
We investigate memorization from an architectural lens by analyzing how attention modules at different layers impact its memorization and generalization.
arXiv Detail & Related papers (2025-01-09T09:00:32Z) - Detecting Memorization in Large Language Models [0.0]
Large language models (LLMs) have achieved impressive results in natural language processing but are prone to memorizing portions of their training data.
Traditional methods for detecting memorization rely on output probabilities or loss functions.
We introduce an analytical method that precisely detects memorization by examining neuron activations within the LLM.
arXiv Detail & Related papers (2024-12-02T00:17:43Z) - Mind Scramble: Unveiling Large Language Model Psychology Via Typoglycemia [27.650551131885152]
Research into large language models (LLMs) has shown promise in addressing complex tasks in the physical world.
Studies suggest that powerful LLMs, like GPT-4, are beginning to exhibit human-like cognitive abilities.
arXiv Detail & Related papers (2024-10-02T15:47:25Z) - Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making.
Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations.
Our experiments on novel Design for Manufacturing tasks show both improved task performance as well as improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z) - Understanding Memorisation in LLMs: Dynamics, Influencing Factors, and Implications [14.818820873377303]
We study whether and to what extent large language models (LLMs) have memorised training data.
We create an experimental framework that is based on repeatedly exposing LLMs to random strings.
We identify factors that make some strings easier to memorise than others, and we identify the role of local prefixes and global context in memorisation.
arXiv Detail & Related papers (2024-07-27T14:00:21Z) - Characterizing Truthfulness in Large Language Model Generations with
Local Intrinsic Dimension [63.330262740414646]
We study how to characterize and predict the truthfulness of texts generated from large language models (LLMs)
We suggest investigating internal activations and quantifying LLM's truthfulness using the local intrinsic dimension (LID) of model activations.
arXiv Detail & Related papers (2024-02-28T04:56:21Z) - Exploring the Cognitive Knowledge Structure of Large Language Models: An
Educational Diagnostic Assessment Approach [50.125704610228254]
Large Language Models (LLMs) have not only exhibited exceptional performance across various tasks, but also demonstrated sparks of intelligence.
Recent studies have focused on assessing their capabilities on human exams and revealed their impressive competence in different domains.
We conduct an evaluation using MoocRadar, a meticulously annotated human test dataset based on Bloom taxonomy.
arXiv Detail & Related papers (2023-10-12T09:55:45Z) - Exploring Memorization in Fine-tuned Language Models [53.52403444655213]
We conduct the first comprehensive analysis to explore language models' memorization during fine-tuning across tasks.
Our studies with open-sourced and our own fine-tuned LMs across various tasks indicate that memorization presents a strong disparity among different fine-tuning tasks.
We provide an intuitive explanation of this task disparity via sparse coding theory and unveil a strong correlation between memorization and attention score distribution.
arXiv Detail & Related papers (2023-10-10T15:41:26Z) - An Empirical Study of Catastrophic Forgetting in Large Language Models During Continual Fine-tuning [70.48605869773814]
Catastrophic forgetting (CF) is a phenomenon that occurs in machine learning when a model forgets previously learned information.
This study empirically evaluates the forgetting phenomenon in large language models during continual instruction tuning.
arXiv Detail & Related papers (2023-08-17T02:53:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.