Related papers: Skewed Memorization in Large Language Models: Quantification and Decomposition

Related papers

Characterizing Memorization in Diffusion Language Models: Generalized Extraction and Sampling Effects [17.220195638215507]
Diffusion language models (DLMs) have emerged as a competitive alternative to autoregressive language models (ARMs)<n>DLMs exhibit substantially lower memorization-based leakage of personally identifiable information (PII) compared to ARMs.
arXiv Detail & Related papers (2026-03-02T19:03:32Z)
How and Why LLMs Generalize: A Fine-Grained Analysis of LLM Reasoning from Cognitive Behaviors to Low-Level Patterns [51.02752099869218]
Large Language Models (LLMs) display strikingly different generalization behaviors.<n>We introduce a novel benchmark that decomposes reasoning into atomic core skills.<n>We show that RL-tuned models maintain more stable behavioral profiles and resist collapse in reasoning skills, whereas SFT models exhibit sharper drift and overfit to surface patterns.
arXiv Detail & Related papers (2025-12-30T08:16:20Z)
Retracing the Past: LLMs Emit Training Data When They Get Lost [18.852558767604823]
memorization of training data in large language models poses significant privacy and copyright concerns.<n>This paper introduces Confusion-Inducing Attacks (CIA), a principled framework for extracting memorized data.
arXiv Detail & Related papers (2025-10-27T03:48:24Z)
The Landscape of Memorization in LLMs: Mechanisms, Measurement, and Mitigation [97.0658685969199]
Large Language Models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks, yet they also exhibit memorization of their training data.<n>This paper synthesizes recent studies and investigates the landscape of memorization, the factors influencing it, and methods for its detection and mitigation.
arXiv Detail & Related papers (2025-07-08T01:30:46Z)
Low-Perplexity LLM-Generated Sequences and Where To Find Them [0.0]
We introduce a systematic approach centered on analyzing low-perplexity sequences - high-probability text spans generated by the model.<n>Our pipeline reliably extracts such long sequences across diverse topics while avoiding degeneration, then traces them back to their sources in the training data.<n>For those that do match, we quantify the distribution of occurrences across source documents, highlighting the scope and nature of verbatim recall.
arXiv Detail & Related papers (2025-07-02T15:58:51Z)
Memorization or Interpolation ? Detecting LLM Memorization through Input Perturbation Analysis [8.725781605542675]
Large Language Models (LLMs) achieve remarkable performance through training on massive datasets.<n>LLMs can exhibit concerning behaviors such as verbatim reproduction of training data rather than true generalization.<n>This paper introduces PEARL, a novel approach for detecting memorization in LLMs.
arXiv Detail & Related papers (2025-05-05T20:42:34Z)
Detecting Memorization in Large Language Models [0.0]
Large language models (LLMs) have achieved impressive results in natural language processing but are prone to memorizing portions of their training data.<n>Traditional methods for detecting memorization rely on output probabilities or loss functions.<n>We introduce an analytical method that precisely detects memorization by examining neuron activations within the LLM.
arXiv Detail & Related papers (2024-12-02T00:17:43Z)
A Geometric Framework for Understanding Memorization in Generative Models [11.263296715798374]
Recent work has shown that deep generative models can be capable of memorizing and reproducing training datapoints when deployed. These findings call into question the usability of generative models, especially in light of the legal and privacy risks brought about by memorization. We propose the manifold memorization hypothesis (MMH), a geometric framework which leverages the manifold hypothesis into a clear language in which to reason about memorization.
arXiv Detail & Related papers (2024-10-31T18:09:01Z)
Measuring memorization through probabilistic discoverable extraction [19.4511858341881]
Large language models (LLMs) are susceptible to memorizing training data. Current methods to measure memorization rates of LLMs rely on single-sequence greedy sampling.
arXiv Detail & Related papers (2024-10-25T11:37:04Z)
Undesirable Memorization in Large Language Models: A Survey [5.659933808910005]
We present a Systematization of Knowledge (SoK) on the topic of memorization in Large Language Models (LLMs) Memorization is the effect that a model tends to store and reproduce phrases or passages from the training data. We discuss the metrics and methods used to measure memorization, followed by an analysis of the factors that contribute to memorization phenomenon.
arXiv Detail & Related papers (2024-10-03T16:34:46Z)
Demystifying Verbatim Memorization in Large Language Models [67.49068128909349]
Large Language Models (LLMs) frequently memorize long sequences verbatim, often with serious legal and privacy implications. We develop a framework to study verbatim memorization in a controlled setting by continuing pre-training from Pythia checkpoints with injected sequences. We find that (1) non-trivial amounts of repetition are necessary for verbatim memorization to happen; (2) later (and presumably better) checkpoints are more likely to memorize verbatim sequences, even for out-of-distribution sequences.
arXiv Detail & Related papers (2024-07-25T07:10:31Z)
Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data [76.90128359866462]
We introduce an extended concept of memorization, distributional memorization, which measures the correlation between the output probabilities and the pretraining data frequency.<n>We show that memorization plays a larger role in simpler, knowledge-intensive tasks, while generalization is the key for harder, reasoning-based tasks.
arXiv Detail & Related papers (2024-07-20T21:24:40Z)
ReCaLL: Membership Inference via Relative Conditional Log-Likelihoods [56.073335779595475]
We propose ReCaLL (Relative Conditional Log-Likelihood), a novel membership inference attack (MIA) ReCaLL examines the relative change in conditional log-likelihoods when prefixing target data points with non-member context. We conduct comprehensive experiments and show that ReCaLL achieves state-of-the-art performance on the WikiMIA dataset.
arXiv Detail & Related papers (2024-06-23T00:23:13Z)
Cycles of Thought: Measuring LLM Confidence through Stable Explanations [53.15438489398938]
Large language models (LLMs) can reach and even surpass human-level accuracy on a variety of benchmarks, but their overconfidence in incorrect responses is still a well-documented failure mode. We propose a framework for measuring an LLM's uncertainty with respect to the distribution of generated explanations for an answer.
arXiv Detail & Related papers (2024-06-05T16:35:30Z)
Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning [61.2224355547598]
Open-sourcing of large language models (LLMs) accelerates application development, innovation, and scientific progress. Our investigation exposes a critical oversight in this belief. By deploying carefully designed demonstrations, our research demonstrates that base LLMs could effectively interpret and execute malicious instructions.
arXiv Detail & Related papers (2024-04-16T13:22:54Z)
Exploring Memorization in Fine-tuned Language Models [53.52403444655213]
We conduct the first comprehensive analysis to explore language models' memorization during fine-tuning across tasks. Our studies with open-sourced and our own fine-tuned LMs across various tasks indicate that memorization presents a strong disparity among different fine-tuning tasks. We provide an intuitive explanation of this task disparity via sparse coding theory and unveil a strong correlation between memorization and attention score distribution.
arXiv Detail & Related papers (2023-10-10T15:41:26Z)
Mitigating Approximate Memorization in Language Models via Dissimilarity Learned Policy [0.0]
Large Language models (LLMs) are trained on large amounts of data. LLMs showed to memorize parts of the training data and emit those data verbatim when an adversary prompts appropriately.
arXiv Detail & Related papers (2023-05-02T15:53:28Z)
Exploring Memorization in Adversarial Training [58.38336773082818]
We investigate the memorization effect in adversarial training (AT) for promoting a deeper understanding of capacity, convergence, generalization, and especially robust overfitting. We propose a new mitigation algorithm motivated by detailed memorization analyses.
arXiv Detail & Related papers (2021-06-03T05:39:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.