Memorization in Fine-Tuned Large Language Models
- URL: http://arxiv.org/abs/2507.21009v2
- Date: Tue, 05 Aug 2025 08:32:40 GMT
- Title: Memorization in Fine-Tuned Large Language Models
- Authors: Danil Savine
- Abstract summary: This study investigates the mechanisms and factors influencing memorization in fine-tuned large language models (LLMs). We examine how different aspects of the fine-tuning process affect a model's propensity to memorize training data, using the PHEE dataset of pharmacovigilance events.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study investigates the mechanisms and factors influencing memorization in fine-tuned large language models (LLMs), with a focus on the medical domain due to its privacy-sensitive nature. We examine how different aspects of the fine-tuning process affect a model's propensity to memorize training data, using the PHEE dataset of pharmacovigilance events. Our research employs two main approaches: a membership inference attack to detect memorized data, and a generation task with prompted prefixes to assess verbatim reproduction. We analyze the impact of adapting different weight matrices in the transformer architecture, the relationship between perplexity and memorization, and the effect of increasing the rank in low-rank adaptation (LoRA) fine-tuning. Key findings include: (1) Value and Output matrices contribute more significantly to memorization compared to Query and Key matrices; (2) Lower perplexity in the fine-tuned model correlates with increased memorization; (3) Higher LoRA ranks lead to increased memorization, but with diminishing returns at higher ranks. These results provide insights into the trade-offs between model performance and privacy risks in fine-tuned LLMs. Our findings have implications for developing more effective and responsible strategies for adapting large language models while managing data privacy concerns.
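The abstract names two measurements: a membership inference attack and a prefix-prompted generation task, both of which it relates to perplexity. The sketch below illustrates the general idea only. It assumes a common perplexity-ratio heuristic (fine-tuned model versus base model) and a simple count of matching leading tokens; the abstract does not say which attack or reproduction metric the paper uses, so the function names, the ratio score, and the threshold value are all hypothetical.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity is exp of the mean negative log-likelihood per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def membership_score(finetuned_logprobs: Sequence[float],
                     base_logprobs: Sequence[float]) -> float:
    """Hypothetical score: base perplexity divided by fine-tuned perplexity.

    A value well above 1 means the fine-tuned model finds the sample much
    more predictable than the base model does, which this heuristic treats
    as evidence that the sample was in the fine-tuning set.
    """
    return perplexity(base_logprobs) / perplexity(finetuned_logprobs)


def predict_member(finetuned_logprobs: Sequence[float],
                   base_logprobs: Sequence[float],
                   threshold: float = 1.5) -> bool:
    """Flag a sample as a likely member; the threshold is an arbitrary assumption."""
    return membership_score(finetuned_logprobs, base_logprobs) > threshold


def verbatim_match_length(generated: Sequence[int],
                          reference: Sequence[int]) -> int:
    """Count leading tokens of a continuation that match the true suffix exactly."""
    matched = 0
    for gen_tok, ref_tok in zip(generated, reference):
        if gen_tok != ref_tok:
            break
        matched += 1
    return matched
```

In practice the per-token log-probabilities would come from scoring each sample with both models, and the threshold would be calibrated on samples known to be outside the training set.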
Related papers
- Efficient Machine Unlearning via Influence Approximation [75.31015485113993]
Influence-based unlearning has emerged as a prominent approach to estimate the impact of individual training samples on model parameters without retraining.
This paper establishes a theoretical link between memorizing (incremental learning) and forgetting (unlearning).
We introduce the Influence Approximation Unlearning algorithm for efficient machine unlearning from the incremental perspective.
arXiv Detail & Related papers (2025-07-31T05:34:27Z)
- Extending Memorization Dynamics in Pythia Models from Instance-Level Insights [8.476099189609565]
This paper presents a detailed analysis of memorization in the Pythia model family across varying scales and training steps.
Using granular metrics, we examine how model architecture, data characteristics, and perturbations influence memorization patterns.
arXiv Detail & Related papers (2025-06-14T03:02:42Z)
- Memorization or Interpolation ? Detecting LLM Memorization through Input Perturbation Analysis [8.725781605542675]
Large Language Models (LLMs) achieve remarkable performance through training on massive datasets.
LLMs can exhibit concerning behaviors such as verbatim reproduction of training data rather than true generalization.
This paper introduces PEARL, a novel approach for detecting memorization in LLMs.
arXiv Detail & Related papers (2025-05-05T20:42:34Z)
- Exploring Training and Inference Scaling Laws in Generative Retrieval [50.82554729023865]
Generative retrieval reformulates retrieval as an autoregressive generation task, where large language models generate target documents directly from a query.
We systematically investigate training and inference scaling laws in generative retrieval, exploring how model size, training data scale, and inference-time compute jointly influence performance.
arXiv Detail & Related papers (2025-03-24T17:59:03Z)
- Efficient Multi-Agent System Training with Data Influence-Oriented Tree Search [59.75749613951193]
We propose Data Influence-oriented Tree Search (DITS) to guide both tree search and data selection.
By leveraging influence scores, we effectively identify the most impactful data for system improvement.
We derive influence score estimation methods tailored for non-differentiable metrics.
arXiv Detail & Related papers (2025-02-02T23:20:16Z)
- On Adversarial Robustness of Language Models in Transfer Learning [13.363850350446869]
We show that transfer learning, while improving standard performance metrics, often leads to increased vulnerability to adversarial attacks.
Our findings demonstrate that larger models exhibit greater resilience to this phenomenon, suggesting a complex interplay between model size, architecture, and adaptation methods.
arXiv Detail & Related papers (2024-12-29T15:55:35Z)
- Do Influence Functions Work on Large Language Models? [10.463762448166714]
Influence functions are important for quantifying the impact of individual training data points on a model's predictions.
We evaluate influence functions across multiple tasks and find that they consistently perform poorly in most settings.
arXiv Detail & Related papers (2024-09-30T06:50:18Z)
- Understanding Memorisation in LLMs: Dynamics, Influencing Factors, and Implications [14.818820873377303]
We study whether and to what extent large language models (LLMs) have memorised training data.
We create an experimental framework that is based on repeatedly exposing LLMs to random strings.
We identify factors that make some strings easier to memorise than others, and we identify the role of local prefixes and global context in memorisation.
arXiv Detail & Related papers (2024-07-27T14:00:21Z)
- Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data [76.90128359866462]
We introduce an extended concept of memorization, distributional memorization, which measures the correlation between the output probabilities and the pretraining data frequency.
This study demonstrates that memorization plays a larger role in simpler, knowledge-intensive tasks, while generalization is the key for harder, reasoning-based tasks.
arXiv Detail & Related papers (2024-07-20T21:24:40Z)
- To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [50.31589712761807]
Large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs.
We investigate the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting.
We also examine the key factors contributing to multi-epoch degradation, finding that significant factors include dataset size, model parameters, and training objectives.
arXiv Detail & Related papers (2023-05-22T17:02:15Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amount of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)