Sequence-Level Leakage Risk of Training Data in Large Language Models
- URL: http://arxiv.org/abs/2412.11302v2
- Date: Mon, 03 Feb 2025 19:53:16 GMT
- Title: Sequence-Level Leakage Risk of Training Data in Large Language Models
- Authors: Trishita Tiwari, G. Edward Suh
- Abstract summary: Per-sequence extraction probabilities provide finer-grained information than has been studied in prior work.
We perform this study on two pre-trained models, Llama and OPT, trained on the Common Crawl and The Pile respectively.
- Score: 7.600279942640982
- License:
- Abstract: This work performs an analysis of sequence-level probabilities for quantifying the risk of training data extraction from Large Language Models (LLMs). Per-sequence extraction probabilities provide finer-grained information than has been studied in prior work. We re-analyze the effects of decoding schemes, model sizes, prefix lengths, partial sequence leakages, and token positions to uncover new insights that were not possible in previous works due to their choice of metrics. We perform this study on two pre-trained models, Llama and OPT, trained on the Common Crawl and The Pile respectively. We discover that 1) Extraction Rate, the predominant metric used in prior quantification work, underestimates the threat of leakage of training data in randomized LLMs by as much as 2.14X. 2) Although on average, larger models and longer prefixes can extract more data, this is not true for a substantial portion of individual sequences. 30.4-41.5% of our sequences are easier to extract with either shorter prefixes or smaller models. 3) Contrary to previous beliefs, partial leakage in commonly used decoding schemes like top-k and top-p is not easier than leaking verbatim training data. 4) Extracting later tokens in a sequence is as much as 10.12X easier than extracting earlier tokens. The insights gained from our analysis shed light on the nature of memorization of training data on a per-sequence basis.
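The core quantity here, a per-sequence extraction probability, can be estimated empirically by repeatedly sampling continuations of a known training prefix under a randomized decoding scheme and counting verbatim reproductions of the suffix. The sketch below illustrates that idea with a HuggingFace-style causal LM; the checkpoint, prefix/suffix split, sample count, and top-k/top-p settings are illustrative assumptions, not the paper's exact experimental setup.

```python
# Minimal sketch: estimate a per-sequence extraction probability under
# randomized decoding (top-k / top-p sampling), i.e. the fraction of sampled
# continuations that reproduce the true training suffix verbatim.
# Checkpoint, split point, and sampling settings are illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-1.3b"  # stand-in; not necessarily the paper's checkpoint
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

@torch.no_grad()
def extraction_probability(prefix_ids, suffix_ids, n_samples=100, top_k=40, top_p=0.95):
    """Fraction of sampled continuations equal to the true suffix."""
    hits = 0
    for _ in range(n_samples):
        out = model.generate(
            prefix_ids,
            do_sample=True, top_k=top_k, top_p=top_p,
            max_new_tokens=suffix_ids.shape[1],
            pad_token_id=tok.eos_token_id,
        )
        continuation = out[0, prefix_ids.shape[1]:]
        hits += int(continuation.shape == suffix_ids[0].shape
                    and torch.equal(continuation, suffix_ids[0]))
    return hits / n_samples

# Example: split a sequence believed to be in the training data into a
# prefix (the attacker's prompt) and a target suffix.
text = "... a sequence known or suspected to appear in the training corpus ..."
ids = tok(text, return_tensors="pt").input_ids
half = ids.shape[1] // 2
prefix_ids, suffix_ids = ids[:, :half], ids[:, half:]
print(extraction_probability(prefix_ids, suffix_ids))
```

Averaging this estimate over many sequences recovers an aggregate rate, but keeping it per-sequence is what enables the finer-grained comparisons across decoding schemes, model sizes, and prefix lengths described in the abstract.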
Related papers
- How Much Do Code Language Models Remember? An Investigation on Data Extraction Attacks before and after Fine-tuning [2.3759432635713895]
We attack both pre-trained and fine-tuned code language models to investigate the extent of data extractability.
Fine-tuning requires fewer resources and is increasingly used by both small and large entities for its effectiveness on specialized data.
Data carriers and licensing information are the data most likely to be memorized by pre-trained and fine-tuned models, while the latter is the most likely to be forgotten after fine-tuning.
arXiv Detail & Related papers (2025-01-29T09:17:30Z) - Training on the Benchmark Is Not All You Need [52.01920740114261]
We propose a simple and effective data leakage detection method based on the contents of multiple-choice options.
Our method is able to work under black-box conditions without access to model training data or weights.
We evaluate the degree of data leakage of 31 mainstream open-source LLMs on four benchmark datasets.
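The summary above does not spell out the scoring rule, but one common way to probe leakage through multiple-choice options is to compare the model's likelihood of the options in their published order against random permutations of the same options; a strong preference for the published ordering hints that the benchmark item was seen during training. The sketch below is only an illustration of that general idea under this assumption, not the cited paper's procedure, and the checkpoint name is a placeholder.

```python
# Hedged sketch: an option-order probe for benchmark leakage. If a model
# assigns a markedly higher likelihood to the options in their published
# order than to shuffled orders, the item may have been memorized.
# This illustrates the general idea only, not the cited paper's exact method.
import random
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/opt-1.3b")  # illustrative checkpoint
model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b").eval()

@torch.no_grad()
def avg_logprob(text):
    ids = tok(text, return_tensors="pt").input_ids
    return -model(ids, labels=ids).loss.item()  # mean per-token log-likelihood

def order_sensitivity(question, options, n_perms=10):
    def render(opts):
        return question + "\n" + "\n".join(f"{chr(65 + i)}. {o}" for i, o in enumerate(opts))
    original = avg_logprob(render(options))
    shuffled = []
    for _ in range(n_perms):
        perm = options[:]
        random.shuffle(perm)
        shuffled.append(avg_logprob(render(perm)))
    # A large positive gap suggests the published ordering is "familiar".
    return original - sum(shuffled) / n_perms

print(order_sensitivity("Which planet is largest?",
                        ["Jupiter", "Saturn", "Earth", "Mars"]))
```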
arXiv Detail & Related papers (2024-09-03T11:09:44Z) - Adaptive Pre-training Data Detection for Large Language Models via Surprising Tokens [1.2549198550400134]
Large language models (LLMs) are extensively used, but there are concerns regarding privacy, security, and copyright due to their opaque training data.
Current solutions to this problem leverage techniques explored in machine learning privacy, such as Membership Inference Attacks (MIAs).
We propose an adaptive pre-training data detection method which alleviates this reliance and effectively amplifies identification.
arXiv Detail & Related papers (2024-07-30T23:43:59Z) - Generalization v.s. Memorization: Tracing Language Models' Capabilities Back to Pretraining Data [76.90128359866462]
We introduce an extended concept of memorization, distributional memorization, which measures the correlation between the output probabilities and the pretraining data frequency.
We show that memorization plays a larger role in simpler, knowledge-intensive tasks, while generalization is the key for harder, reasoning-based tasks.
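As a toy illustration of what such a distributional-memorization measure captures, one can rank-correlate corpus frequencies of target outputs with the probabilities a model assigns to them; the counts and probabilities below are made-up placeholders, and the paper's own corpus-frequency estimation pipeline is not reproduced.

```python
# Toy illustration of distributional memorization: rank-correlate how often
# each target appears in the pretraining corpus with the probability the
# model assigns to it. Numbers below are hypothetical placeholders.
from scipy.stats import spearmanr

corpus_freq = [12_400, 350, 9_800, 75, 2_100]   # hypothetical n-gram counts
model_prob  = [0.62, 0.08, 0.55, 0.03, 0.21]    # hypothetical P(answer | prompt)

rho, pval = spearmanr(corpus_freq, model_prob)
print(f"distributional memorization (Spearman rho): {rho:.2f} (p={pval:.3f})")
```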
arXiv Detail & Related papers (2024-07-20T21:24:40Z) - Alpaca against Vicuna: Using LLMs to Uncover Memorization of LLMs [61.04246774006429]
We introduce a black-box prompt optimization method that uses an attacker LLM agent to uncover higher levels of memorization in a victim agent.
We observe that our instruction-based prompts generate outputs with 23.7% higher overlap with training data compared to the baseline prefix-suffix measurements.
Our findings show that instruction-tuned models can expose pre-training data as much as their base models, if not more, and that using instructions proposed by other LLMs can open a new avenue for automated attacks.
arXiv Detail & Related papers (2024-03-05T19:32:01Z) - Ethicist: Targeted Training Data Extraction Through Loss Smoothed Soft Prompting and Calibrated Confidence Estimation [56.57532238195446]
We propose a method named Ethicist for targeted training data extraction.
To elicit memorization, we tune soft prompt embeddings while keeping the model fixed.
We show that Ethicist significantly improves the extraction performance on a recently proposed public benchmark.
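The prompt-tuning step named in this summary, tuning soft prompt embeddings while the model stays fixed, can be sketched as follows. The checkpoint, prompt length, and optimizer settings are illustrative, and Ethicist's loss smoothing and calibrated confidence estimation are omitted; this is a generic soft-prompt-tuning sketch, not the paper's implementation.

```python
# Hedged sketch of soft-prompt tuning with a frozen LM: only the prepended
# soft embeddings receive gradients, trained to make a known training suffix
# likely given its prefix. Loss smoothing / confidence calibration omitted.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/opt-1.3b")  # illustrative checkpoint
model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b").eval()
for p in model.parameters():
    p.requires_grad_(False)  # the model itself stays fixed

n_soft = 20
dim = model.get_input_embeddings().embedding_dim
soft_prompt = torch.nn.Parameter(torch.randn(1, n_soft, dim) * 0.02)
optimizer = torch.optim.Adam([soft_prompt], lr=1e-3)

def tuning_step(prefix_ids, suffix_ids):
    """One gradient step on the soft prompt; the loss covers only suffix tokens."""
    ids = torch.cat([prefix_ids, suffix_ids], dim=1)
    token_embeds = model.get_input_embeddings()(ids)
    inputs_embeds = torch.cat([soft_prompt, token_embeds], dim=1)
    labels = torch.full((1, inputs_embeds.shape[1]), -100)  # -100 = ignored position
    labels[0, n_soft + prefix_ids.shape[1]:] = suffix_ids[0]
    loss = model(inputs_embeds=inputs_embeds, labels=labels).loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```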
arXiv Detail & Related papers (2023-07-10T08:03:41Z) - CodeGen2: Lessons for Training LLMs on Programming and Natural Languages [116.74407069443895]
We unify encoder and decoder-based models into a single prefix-LM.
For learning methods, we explore the claim of a "free lunch" hypothesis.
For data distributions, we explore the effect of mixing programming and natural languages and of multi-epoch training on model performance.
arXiv Detail & Related papers (2023-05-03T17:55:25Z) - Emergent and Predictable Memorization in Large Language Models [23.567027014457775]
Memorization, or the tendency of large language models to output entire sequences from their training data verbatim, is a key concern for safely deploying language models.
We seek to predict which sequences will be memorized before a large model is fully trained by extrapolating the memorization behavior of lower-compute trial runs.
We provide further novel discoveries on the distribution of memorization scores across models and data.
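A common way to score the memorization referred to here is to prompt the model with the first k tokens of a training sequence and measure how much of its greedy continuation matches the true next tokens. The sketch below shows such a score; the cross-scale extrapolation the summary describes is left out, and the checkpoint name is a placeholder.

```python
# Hedged sketch of a per-sequence memorization score: prompt with the first
# k training tokens and measure the fraction of greedily generated tokens
# that match the true continuation. Scale/checkpoint extrapolation omitted.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/opt-1.3b")  # illustrative checkpoint
model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b").eval()

@torch.no_grad()
def memorization_score(prompt_ids, true_continuation_ids):
    out = model.generate(
        prompt_ids,
        do_sample=False,  # greedy decoding
        max_new_tokens=true_continuation_ids.shape[1],
        pad_token_id=tok.eos_token_id,
    )
    generated = out[0, prompt_ids.shape[1]:]
    true = true_continuation_ids[0]
    n = min(generated.shape[0], true.shape[0])
    return (generated[:n] == true[:n]).float().mean().item()
```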
arXiv Detail & Related papers (2023-04-21T17:58:31Z) - Bag of Tricks for Training Data Extraction from Language Models [98.40637430115204]
We investigate and benchmark tricks for improving training data extraction using a publicly available dataset.
The experimental results show that several previously overlooked tricks can be crucial to the success of training data extraction.
arXiv Detail & Related papers (2023-02-09T06:46:42Z) - Numeracy from Literacy: Data Science as an Emergent Skill from Large Language Models [0.0]
Large language models (LLMs) such as OpenAI's ChatGPT and GPT-3 offer unique testbeds for exploring the translation challenges of turning literacy into numeracy.
Previous publicly available transformer models, released eighteen months earlier and 1000 times smaller, failed at basic arithmetic.
This work examines whether next-token prediction can extend beyond sentence completion into actual numerical understanding.
arXiv Detail & Related papers (2023-01-31T03:14:57Z)