Emergent and Predictable Memorization in Large Language Models
- URL: http://arxiv.org/abs/2304.11158v2
- Date: Wed, 31 May 2023 19:09:45 GMT
- Title: Emergent and Predictable Memorization in Large Language Models
- Authors: Stella Biderman and USVSN Sai Prashanth and Lintang Sutawika and
Hailey Schoelkopf and Quentin Anthony and Shivanshu Purohit and Edward Raff
- Abstract summary: Memorization, or the tendency of large language models to output entire sequences from their training data verbatim, is a key concern for safely deploying language models.
We seek to predict which sequences will be memorized before a large model's full train-time by extrapolating the memorization behavior of lower-compute trial runs.
We provide further novel discoveries on the distribution of memorization scores across models and data.
- Score: 23.567027014457775
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Memorization, or the tendency of large language models (LLMs) to output
entire sequences from their training data verbatim, is a key concern for safely
deploying language models. In particular, it is vital to minimize a model's
memorization of sensitive datapoints such as those containing personally
identifiable information (PII). The prevalence of such undesirable memorization
can pose issues for model trainers, and may even require discarding an
otherwise functional model. We therefore seek to predict which sequences will
be memorized before a large model's full train-time by extrapolating the
memorization behavior of lower-compute trial runs. We measure memorization of
the Pythia model suite and plot scaling laws for forecasting memorization,
allowing us to provide equi-compute recommendations to maximize the reliability
(recall) of such predictions. We additionally provide further novel discoveries
on the distribution of memorization scores across models and data. We release
all code and data necessary to reproduce the results in this paper at
https://github.com/EleutherAI/pythia
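The memorization measurement described in the abstract can be approximated with a short script. The sketch below is a minimal illustration, assuming the Hugging Face transformers library and a small Pythia checkpoint; the 32-token prompt / 32-token continuation split, the model name, and the example text are stand-in assumptions, not the authors' exact pipeline (see the paper's repository above for the real code).
```python
# Minimal sketch: per-sequence memorization score, i.e. the fraction of
# greedily generated continuation tokens that match the true continuation.
# Assumes Hugging Face transformers; model name, prompt/continuation lengths,
# and the example token IDs are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "EleutherAI/pythia-160m"  # any Pythia checkpoint could be used
PROMPT_LEN = 32                        # tokens given as the prompt
CONT_LEN = 32                          # tokens scored for memorization

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.eval()

def memorization_score(token_ids: list[int]) -> float:
    """Greedily decode CONT_LEN tokens from the first PROMPT_LEN tokens and
    return the fraction that match the true continuation."""
    prompt = torch.tensor([token_ids[:PROMPT_LEN]])
    true_cont = token_ids[PROMPT_LEN:PROMPT_LEN + CONT_LEN]
    with torch.no_grad():
        out = model.generate(
            prompt,
            max_new_tokens=CONT_LEN,
            do_sample=False,                        # greedy decoding
            pad_token_id=tokenizer.eos_token_id,
        )
    generated = out[0, PROMPT_LEN:].tolist()
    matches = sum(int(g == t) for g, t in zip(generated, true_cont))
    return matches / CONT_LEN

# Usage: token_ids would normally come from the training corpus (the Pile);
# here we tokenize a stand-in string long enough to supply 64 tokens.
example_ids = tokenizer("The quick brown fox jumps over the lazy dog. " * 20)["input_ids"]
print(memorization_score(example_ids))
```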
Related papers
- A Geometric Framework for Understanding Memorization in Generative Models [11.263296715798374]
Recent work has shown that deep generative models can memorize and reproduce training datapoints when deployed.
These findings call into question the usability of generative models, especially in light of the legal and privacy risks brought about by memorization.
We propose the manifold memorization hypothesis (MMH), a geometric framework that uses the manifold hypothesis to provide a clear language in which to reason about memorization.
arXiv Detail & Related papers (2024-10-31T18:09:01Z)
- Adaptive Pre-training Data Detection for Large Language Models via Surprising Tokens [1.2549198550400134]
Large language models (LLMs) are extensively used, but there are concerns regarding privacy, security, and copyright due to their opaque training data.
Current solutions to this problem leverage techniques explored in machine learning privacy, such as Membership Inference Attacks (MIAs).
We propose an adaptive pre-training data detection method that alleviates this reliance and effectively amplifies identification.
arXiv Detail & Related papers (2024-07-30T23:43:59Z)
- Demystifying Verbatim Memorization in Large Language Models [67.49068128909349]
Large Language Models (LLMs) frequently memorize long sequences verbatim, often with serious legal and privacy implications.
We develop a framework to study verbatim memorization in a controlled setting by continuing pre-training from Pythia checkpoints with injected sequences.
We find that (1) non-trivial amounts of repetition are necessary for verbatim memorization to happen; (2) later (and presumably better) checkpoints are more likely to memorize verbatim sequences, even for out-of-distribution sequences.
arXiv Detail & Related papers (2024-07-25T07:10:31Z)
- Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data [76.90128359866462]
We introduce an extended concept of memorization, distributional memorization, which measures the correlation between the output probabilities and the pretraining data frequency.
This study demonstrates that memorization plays a larger role in simpler, knowledge-intensive tasks, while generalization is the key for harder, reasoning-based tasks.
arXiv Detail & Related papers (2024-07-20T21:24:40Z)
- Causal Estimation of Memorisation Profiles [58.20086589761273]
Understanding memorisation in language models has practical and societal implications.
Memorisation is the causal effect of training with an instance on the model's ability to predict that instance.
This paper proposes a new, principled, and efficient method to estimate memorisation based on the difference-in-differences design from econometrics.
arXiv Detail & Related papers (2024-06-06T17:59:09Z)
- Scalable Extraction of Training Data from (Production) Language Models [93.7746567808049]
This paper studies extractable memorization: training data that an adversary can efficiently extract by querying a machine learning model without prior knowledge of the training dataset.
We show an adversary can extract gigabytes of training data from open-source language models like Pythia or GPT-Neo, semi-open models like LLaMA or Falcon, and closed models like ChatGPT.
arXiv Detail & Related papers (2023-11-28T18:47:03Z)
- Quantifying and Analyzing Entity-level Memorization in Large Language Models [4.59914731734176]
Large language models (LLMs) have been proven capable of memorizing their training data.
Privacy risks arising from memorization have attracted increasing attention.
We propose a fine-grained, entity-level definition to quantify memorization with conditions and metrics closer to real-world scenarios.
arXiv Detail & Related papers (2023-08-30T03:06:47Z)
- Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confidence Estimation [56.57532238195446]
We propose a method named Ethicist for targeted training data extraction.
To elicit memorization, we tune soft prompt embeddings while keeping the model fixed.
We show that Ethicist significantly improves the extraction performance on a recently proposed public benchmark.
arXiv Detail & Related papers (2023-07-10T08:03:41Z)
- Quantifying Memorization Across Neural Language Models [61.58529162310382]
Large language models (LMs) have been shown to memorize parts of their training data, and when prompted appropriately, they will emit the memorized data verbatim.
This is undesirable because memorization violates privacy (exposing user data), degrades utility (repeated easy-to-memorize text is often low quality), and hurts fairness (some texts are memorized over others).
We describe three log-linear relationships that quantify the degree to which LMs emit memorized training data.
arXiv Detail & Related papers (2022-02-15T18:48:31Z)
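To make the log-linear relationships mentioned in the last entry concrete, a simple least-squares fit against the logarithm of model size suffices; the sketch below uses placeholder parameter counts and memorized fractions invented purely for illustration, not measurements from any of the papers listed here.
```python
# Illustrative sketch of fitting a log-linear relationship between model size
# and the fraction of memorized sequences. The data points are placeholders
# chosen only to demonstrate the fitting procedure.
import numpy as np

model_params = np.array([70e6, 160e6, 410e6, 1.4e9, 2.8e9])      # placeholder sizes
memorized_frac = np.array([0.008, 0.011, 0.015, 0.021, 0.026])   # placeholder fractions

# Fit memorized_frac ~= slope * log10(params) + intercept
slope, intercept = np.polyfit(np.log10(model_params), memorized_frac, deg=1)
print(f"slope = {slope:.4f} per decade of parameters, intercept = {intercept:.4f}")

# Extrapolate to a larger, hypothetical model size.
predicted = slope * np.log10(12e9) + intercept
print(f"predicted memorized fraction at 12B parameters: {predicted:.4f}")
```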