The Landscape of Memorization in LLMs: Mechanisms, Measurement, and Mitigation
- URL: http://arxiv.org/abs/2507.05578v1
- Date: Tue, 08 Jul 2025 01:30:46 GMT
- Title: The Landscape of Memorization in LLMs: Mechanisms, Measurement, and Mitigation
- Authors: Alexander Xiong, Xuandong Zhao, Aneesh Pappu, Dawn Song
- Abstract summary: Large Language Models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks, yet they also exhibit memorization of their training data. This paper synthesizes recent studies and investigates the landscape of memorization, the factors influencing it, and methods for its detection and mitigation.
- Score: 97.0658685969199
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks, yet they also exhibit memorization of their training data. This phenomenon raises critical questions about model behavior, privacy risks, and the boundary between learning and memorization. Addressing these concerns, this paper synthesizes recent studies and investigates the landscape of memorization, the factors influencing it, and methods for its detection and mitigation. We explore key drivers, including training data duplication, training dynamics, and fine-tuning procedures that influence data memorization. In addition, we examine methodologies such as prefix-based extraction, membership inference, and adversarial prompting, assessing their effectiveness in detecting and measuring memorized content. Beyond technical analysis, we also explore the broader implications of memorization, including its legal and ethical dimensions. Finally, we discuss mitigation strategies, including data cleaning, differential privacy, and post-training unlearning, while highlighting open challenges in balancing the minimization of harmful memorization with model utility. This paper provides a comprehensive overview of the current state of research on LLM memorization across technical, privacy, and performance dimensions, identifying critical directions for future work.
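The detection methodologies named in the abstract, prefix-based extraction and membership inference in particular, reduce to fairly small measurement loops. Below is a minimal, illustrative sketch using the Hugging Face transformers API; the model name ("gpt2"), the prefix/suffix convention, and the verbatim-match criterion are assumptions for illustration, not the survey's exact protocol.

```python
# Illustrative sketch (not the survey's protocol):
# (1) prefix-based extraction: prompt with a prefix from a suspected training
#     document and check whether greedy decoding reproduces the true suffix;
# (2) a loss-based membership signal: unusually low per-token loss on a text
#     is weak evidence that it was seen during training.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # assumed; any causal LM with a compatible tokenizer works
tok = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()

def prefix_extraction_hit(prefix: str, true_suffix: str, max_new_tokens: int = 50) -> bool:
    """True if greedy decoding from `prefix` reproduces `true_suffix` verbatim."""
    inputs = tok(prefix, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=max_new_tokens,
                             do_sample=False, pad_token_id=tok.eos_token_id)
    generated = tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    return generated.strip().startswith(true_suffix.strip())

def avg_token_loss(text: str) -> float:
    """Average negative log-likelihood per token; lower is a (weak) membership signal."""
    ids = tok(text, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        return model(ids, labels=ids).loss.item()
```

A suffix reproduced verbatim from a fixed-length prefix is the usual operational notion of extractable memorization in this literature, while a low loss score on its own is a much weaker membership signal and is typically calibrated against reference models or perturbed copies of the text.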
Related papers
- SoK: Machine Unlearning for Large Language Models [14.88062383081161]
Large language model (LLM) unlearning has become a critical topic in machine learning.
We propose a new taxonomy based on the intention of unlearning.
arXiv Detail & Related papers (2025-06-10T20:30:39Z)
- Mitigating Memorization in LLMs using Activation Steering [3.5782765808288475]
Memorization of training data by Large Language Models (LLMs) poses significant risks, including privacy leaks and the regurgitation of copyrighted content.
Activation steering, a technique that directly intervenes in model activations, has emerged as a promising approach for manipulating LLMs.
arXiv Detail & Related papers (2025-03-08T03:37:07Z)
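As a concrete anchor for the activation-steering entry above, here is a minimal sketch of the general idea, assuming a Hugging Face causal LM: a steering vector is added to the output of one transformer block through a forward hook during generation. The layer index, scale, and random placeholder vector are illustrative assumptions; in practice the vector is typically derived from contrasts between activations (e.g., on memorized versus paraphrased continuations) rather than sampled at random.

```python
# Hypothetical sketch: add a steering vector to one GPT-2 block via a forward
# hook. Layer index, scale, and the vector itself are placeholders, not the
# cited paper's construction.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

layer_idx, scale = 6, 4.0                              # assumed intervention point and strength
steer = torch.randn(model.config.hidden_size)
steer = steer / steer.norm()                           # unit-norm placeholder direction

def add_steering(module, inputs, output):
    # GPT-2 blocks return a tuple whose first element is the hidden states.
    steered = output[0] + scale * steer.to(output[0].dtype)
    return (steered,) + output[1:]

handle = model.transformer.h[layer_idx].register_forward_hook(add_steering)
try:
    ids = tok("My social security number is", return_tensors="pt")
    out = model.generate(**ids, max_new_tokens=20, do_sample=False,
                         pad_token_id=tok.eos_token_id)
    print(tok.decode(out[0], skip_special_tokens=True))
finally:
    handle.remove()                                    # restore the unsteered model
```

The appeal of this family of interventions is that the base weights stay untouched: the hook can be enabled only when regurgitation risk is high and removed afterwards.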
- Exploring Knowledge Boundaries in Large Language Models for Retrieval Judgment [56.87031484108484]
Large Language Models (LLMs) are increasingly recognized for their practical applications.
Retrieval-Augmented Generation (RAG) tackles this challenge and has shown a significant impact on LLMs.
By minimizing retrieval requests that yield neutral or harmful results, we can effectively reduce both time and computational costs.
arXiv Detail & Related papers (2024-11-09T15:12:28Z)
- Undesirable Memorization in Large Language Models: A Survey [5.659933808910005]
Memorization refers to a model's tendency to store and reproduce phrases from its training data.
This paper provides a taxonomy of the literature on LLM memorization, exploring it across three dimensions: granularity, retrievability, and desirability.
We conclude our survey by identifying potential research topics for the near future, including methods to balance privacy and performance.
arXiv Detail & Related papers (2024-10-03T16:34:46Z)
- Extracting Training Data from Document-Based VQA Models [67.1470112451617]
Vision-Language Models (VLMs) have made remarkable progress in document-based Visual Question Answering (i.e., responding to queries about the contents of an input document provided as an image).
We show these models can memorise responses for training samples and regurgitate them even when the relevant visual information has been removed.
This includes Personally Identifiable Information repeated once in the training set, indicating these models could divulge sensitive information and therefore pose a privacy risk.
arXiv Detail & Related papers (2024-07-11T17:44:41Z)
- Recent Advances in Federated Learning Driven Large Language Models: A Survey on Architecture, Performance, and Security [24.969739515876515]
Federated Learning (FL) offers a promising paradigm for training Large Language Models (LLMs) in a decentralized manner while preserving data privacy and minimizing communication overhead.
We review a range of strategies enabling unlearning in federated LLMs, including perturbation-based methods, model decomposition, and incremental retraining.
This survey identifies critical research directions toward developing secure, adaptable, and high-performing federated LLM systems for real-world deployment.
arXiv Detail & Related papers (2024-06-14T08:40:58Z)
- Memorization in deep learning: A survey [26.702878179026754]
Recent investigations have uncovered an interesting phenomenon in which Deep Neural Networks (DNNs) tend to memorize specific details from examples rather than learning general patterns.
This raises critical questions about the nature of generalization in DNNs and their susceptibility to security breaches.
We present a systematic framework to organize memorization definitions based on the generalization and security/privacy domains.
arXiv Detail & Related papers (2024-06-06T09:17:40Z)
- The Frontier of Data Erasure: Machine Unlearning for Large Language Models [56.26002631481726]
Large Language Models (LLMs) are foundational to AI advancements.
LLMs pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted information.
Machine unlearning emerges as a cutting-edge solution to mitigate these concerns.
arXiv Detail & Related papers (2024-03-23T09:26:15Z)
- Exploring Memorization in Fine-tuned Language Models [53.52403444655213]
We conduct the first comprehensive analysis to explore language models' memorization during fine-tuning across tasks.
Our studies with open-source and our own fine-tuned LMs across various tasks indicate that memorization varies strongly across fine-tuning tasks.
We provide an intuitive explanation of this task disparity via sparse coding theory and unveil a strong correlation between memorization and attention score distribution.
arXiv Detail & Related papers (2023-10-10T15:41:26Z)
- Identifying and Mitigating Privacy Risks Stemming from Language Models: A Survey [43.063650238194384]
Large Language Models (LLMs) have shown greatly enhanced performance in recent years, attributed to increased size and extensive training data.
Training data memorization in machine learning models scales with model size, which is particularly concerning for LLMs.
Memorized text sequences have the potential to be directly leaked from LLMs, posing a serious threat to data privacy.
arXiv Detail & Related papers (2023-09-27T15:15:23Z)
- Machine Unlearning: Solutions and Challenges [21.141664917477257]
Machine learning models may inadvertently memorize sensitive, unauthorized, or malicious data, posing risks of privacy breaches, security vulnerabilities, and performance degradation.
To address these issues, machine unlearning has emerged as a critical technique to selectively remove specific training data points' influence on trained models.
This paper provides a comprehensive taxonomy and analysis of the solutions in machine unlearning.
arXiv Detail & Related papers (2023-08-14T10:45:51Z)
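As a concrete anchor for the machine-unlearning entry above, the sketch below shows one of the simplest recipes in this literature: gradient ascent on a forget set. The model, learning rate, step count, and placeholder forget set are assumptions for illustration; the cited survey covers far more refined formulations.

```python
# Hypothetical sketch: unlearning by gradient ascent on a "forget" set.
# Naive ascent quickly degrades general capability, so practical methods add
# a retain-set objective or a KL penalty against the original model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

forget_texts = ["<memorized sequence to remove>"]      # placeholder forget set

model.train()
for _ in range(3):                                      # a few passes over the forget set
    for text in forget_texts:
        ids = tok(text, return_tensors="pt")["input_ids"]
        loss = model(ids, labels=ids).loss
        (-loss).backward()                              # ascend the loss on forget examples
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
        optimizer.step()
        optimizer.zero_grad()
```

Whether such a procedure actually removes the influence of the targeted data, rather than merely suppressing its surface form, is one of the open evaluation questions this line of work grapples with.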
- Semantics-Preserved Distortion for Personal Privacy Protection in Information Management [65.08939490413037]
This paper suggests a linguistically-grounded approach to distort texts while maintaining semantic integrity.
We present two distinct frameworks for semantic-preserving distortion: a generative approach and a substitutive approach.
We also explore privacy protection in a specific medical information management scenario, showing our method effectively limits sensitive data memorization.
arXiv Detail & Related papers (2022-01-04T04:01:05Z)