Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training
- URL: http://arxiv.org/abs/2502.15680v1
- Date: Fri, 21 Feb 2025 18:59:14 GMT
- Title: Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training
- Authors: Jaydeep Borkar, Matthew Jagielski, Katherine Lee, Niloofar Mireshghallah, David A. Smith, Christopher A. Choquette-Choo
- Abstract summary: We find that the amount and ease of PII memorization is a dynamic property of a model that evolves throughout training pipelines. We characterize three such novel phenomena: (1) similar-appearing PII seen later in training can elicit memorization of earlier-seen sequences in what we call assisted memorization.
- Score: 19.119349775283556
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Due to the sensitive nature of personally identifiable information (PII), its owners may have the authority to control its inclusion or request its removal from large language model (LLM) training. Beyond this, PII may be added or removed from training datasets due to evolving dataset curation techniques, because they were newly scraped for retraining, or because they were included in a new downstream fine-tuning stage. We find that the amount and ease of PII memorization is a dynamic property of a model that evolves throughout training pipelines and depends on commonly altered design choices. We characterize three such novel phenomena: (1) similar-appearing PII seen later in training can elicit memorization of earlier-seen sequences in what we call assisted memorization, and this is a significant factor (in our settings, up to 1/3); (2) adding PII can increase memorization of other PII significantly (in our settings, as much as $\approx\!7.5\times$); and (3) removing PII can lead to other PII being memorized. Model creators should consider these first- and second-order privacy risks when training models to avoid the risk of new PII regurgitation.
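The memorization dynamics the abstract describes can be approximated with a simple extraction probe: prompt each training checkpoint with the context that preceded a known PII string and check whether greedy decoding reproduces it. A minimal sketch along these lines is below; the checkpoint paths and PII records are hypothetical placeholders, not the paper's actual setup.

```python
# Minimal sketch: track PII memorization across training checkpoints by prompting
# with the context that preceded each PII string and checking whether greedy
# decoding reproduces it verbatim. Paths and records are hypothetical.
from transformers import AutoModelForCausalLM, AutoTokenizer

CHECKPOINTS = ["ckpts/step-01000", "ckpts/step-02000", "ckpts/step-03000"]  # assumed local paths
PII_RECORDS = [  # hypothetical (prefix, PII) pairs drawn from the training data
    {"prefix": "Contact Jane Doe at ", "pii": "jane.doe@example.com"},
]

def extraction_rate(model, tokenizer, records, max_new_tokens=20):
    """Fraction of PII strings reproduced verbatim under greedy decoding."""
    hits = 0
    for rec in records:
        inputs = tokenizer(rec["prefix"], return_tensors="pt")
        out = model.generate(**inputs, max_new_tokens=max_new_tokens,
                             do_sample=False, pad_token_id=tokenizer.eos_token_id)
        completion = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                                      skip_special_tokens=True)
        hits += int(rec["pii"] in completion)
    return hits / len(records)

for path in CHECKPOINTS:
    tok = AutoTokenizer.from_pretrained(path)
    model = AutoModelForCausalLM.from_pretrained(path)
    print(path, extraction_rate(model, tok, PII_RECORDS))
```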
Related papers
- ReCIT: Reconstructing Full Private Data from Gradient in Parameter-Efficient Fine-Tuning of Large Language Models [45.10098466182961]
ReCIT is a novel privacy attack that achieves recovery of full private data from PEFT gradients with high fidelity.
It consistently outperforms state-of-the-art memorization and inversion-based attacks across different PEFT paradigms.
arXiv Detail & Related papers (2025-04-29T09:23:19Z) - R.R.: Unveiling LLM Training Privacy through Recollection and Ranking [17.12953978321457]
Large Language Models (LLMs) pose significant privacy risks, potentially leaking training data due to implicit memorization. We propose R.R. (Recollect and Rank), a novel two-step privacy-stealing attack that enables attackers to reconstruct PII entities from scrubbed training data. Experiments across three popular PII datasets demonstrate that R.R. achieves better PII identification performance than baselines.
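A rough sketch of the generate-then-rank idea described above; the ranking criterion used here (the model's loss on each candidate) is an assumption for illustration, not necessarily the paper's exact scoring function.

```python
# Sketch of a two-step "recollect then rank" PII reconstruction probe. The
# ranking score below (language-model loss on prompt + candidate) is assumed
# for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def recollect(prompt, n=8):
    """Step 1: sample candidate continuations for a scrubbed context."""
    ids = tok(prompt, return_tensors="pt").input_ids
    outs = model.generate(ids, do_sample=True, top_k=50, num_return_sequences=n,
                          max_new_tokens=10, pad_token_id=tok.eos_token_id)
    return [tok.decode(o[ids.shape[1]:], skip_special_tokens=True) for o in outs]

def rank(prompt, candidates):
    """Step 2: rank candidates by the model's loss on prompt + candidate."""
    scores = []
    for c in candidates:
        ids = tok(prompt + c, return_tensors="pt").input_ids
        with torch.no_grad():
            loss = model(ids, labels=ids).loss.item()
        scores.append((loss, c))
    return sorted(scores)

print(rank("The patient's name is", recollect("The patient's name is"))[:3])
```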
arXiv Detail & Related papers (2025-02-18T09:05:59Z) - Mitigating Unintended Memorization with LoRA in Federated Learning for LLMs [30.13601588296921]
Federated learning (FL) is a popular paradigm for collaborative training which avoids direct data exposure between clients. It is nonetheless possible for adversarial and honest-but-curious clients to recover training data of other participants simply through targeted prompting. We demonstrate that a popular and simple fine-tuning strategy, low-rank adaptation (LoRA), reduces memorization during FL by up to a factor of 10.
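A minimal sketch of applying LoRA to a causal LM with the peft library; the rank, scaling factor, and target modules are illustrative choices, and the federated orchestration (client updates, aggregation) is omitted.

```python
# Minimal sketch: wrap a causal LM with LoRA adapters via the peft library.
# Hyperparameters and target modules are illustrative, not the paper's settings.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("gpt2")
config = LoraConfig(
    r=8,                        # low-rank dimension
    lora_alpha=16,              # scaling factor
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the low-rank adapters are trainable
```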
arXiv Detail & Related papers (2025-02-07T17:04:39Z) - Learn while Unlearn: An Iterative Unlearning Framework for Generative Language Models [49.043599241803825]
Iterative Contrastive Unlearning (ICU) framework consists of three core components.
A Knowledge Unlearning Induction module removes specific knowledge through an unlearning loss.
A Contrastive Learning Enhancement module preserves the model's expressive capabilities against the pure unlearning goal.
An Iterative Unlearning Refinement module dynamically assesses the unlearning extent on specific data pieces and makes iterative updates.
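A compact sketch of what an iterative unlearning loop of this shape might look like; the specific losses and stopping rule below (gradient ascent on the forget set, a retention loss standing in for the contrastive term, and a loss threshold for deciding an example is unlearned) are assumptions for illustration, not the ICU objectives themselves.

```python
# Sketch of an iterative unlearning loop: ascend on forget data, descend on
# retain data, and refine until a forget example's loss exceeds a threshold.
# All design choices here are illustrative assumptions.
import torch

def unlearning_step(model, forget_batch, retain_batch, optimizer, alpha=1.0):
    forget_loss = model(**forget_batch, labels=forget_batch["input_ids"]).loss
    retain_loss = model(**retain_batch, labels=retain_batch["input_ids"]).loss
    # Negative sign ascends on the forget set (unlearning induction); the retain
    # term stands in for the capability-preservation objective.
    loss = -forget_loss + alpha * retain_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return forget_loss.item()

def is_unlearned(model, forget_batch, threshold=8.0):
    # Iterative refinement: treat an example as unlearned once its loss
    # (log-perplexity) exceeds an assumed threshold.
    with torch.no_grad():
        return model(**forget_batch, labels=forget_batch["input_ids"]).loss.item() > threshold
```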
arXiv Detail & Related papers (2024-07-25T07:09:35Z) - Extracting Training Data from Document-Based VQA Models [67.1470112451617]
Vision-Language Models (VLMs) have made remarkable progress in document-based Visual Question Answering (i.e., responding to queries about the contents of an input document provided as an image).
We show these models can memorise responses for training samples and regurgitate them even when the relevant visual information has been removed.
This includes Personally Identifiable Information repeated once in the training set, indicating these models could divulge sensitive information and therefore pose a privacy risk.
arXiv Detail & Related papers (2024-07-11T17:44:41Z) - Decouple knowledge from parameters for plug-and-play language modeling [77.5601135412186]
We introduce PlugLM, a pre-training model with differentiable plug-in memory (DPM).
The key intuition is to decouple the knowledge storage from model parameters with an editable and scalable key-value memory.
PlugLM obtains a 3.95 F1 improvement on average across four domains without any in-domain pre-training.
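A toy illustration of the key-value plug-in memory idea: knowledge lives in an editable (keys, values) table rather than in model weights, and a hidden state retrieves it by attending over the keys. This sketches the general mechanism only; it is not PlugLM's actual architecture.

```python
# Toy key-value plug-in memory: knowledge is stored in editable buffers that can
# be swapped at deployment time without retraining the model weights.
import torch
import torch.nn.functional as F

class PluginMemory(torch.nn.Module):
    def __init__(self, num_entries, dim):
        super().__init__()
        # Buffers, not parameters: the memory is decoupled from trained weights.
        self.register_buffer("keys", torch.randn(num_entries, dim))
        self.register_buffer("values", torch.randn(num_entries, dim))

    def forward(self, hidden):                         # hidden: (batch, dim)
        attn = F.softmax(hidden @ self.keys.T, dim=-1)  # (batch, num_entries)
        return attn @ self.values                      # retrieved knowledge vectors

memory = PluginMemory(num_entries=1000, dim=64)
out = memory(torch.randn(2, 64))                       # (2, 64), differentiable
```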
arXiv Detail & Related papers (2023-05-19T10:01:55Z) - Adaptive Cross Batch Normalization for Metric Learning [75.91093210956116]
Metric learning is a fundamental problem in computer vision.
We show that it is equally important to ensure that the accumulated embeddings are up to date.
In particular, it is necessary to circumvent the representational drift between the accumulated embeddings and the feature embeddings at the current training iteration.
arXiv Detail & Related papers (2023-03-30T03:22:52Z) - PIVOT: Prompting for Video Continual Learning [50.80141083993668]
We introduce PIVOT, a novel method that leverages extensive knowledge in pre-trained models from the image domain.
Our experiments show that PIVOT improves state-of-the-art methods by a significant 27% on the 20-task ActivityNet setup.
arXiv Detail & Related papers (2022-12-09T13:22:27Z) - Quantifying Memorization Across Neural Language Models [61.58529162310382]
Large language models (LMs) have been shown to memorize parts of their training data, and when prompted appropriately, they will emit the memorized data verbatim.
This is undesirable because memorization violates privacy (exposing user data), degrades utility (repeated easy-to-memorize text is often low quality), and hurts fairness (some texts are memorized over others).
We describe three log-linear relationships that quantify the degree to which LMs emit memorized training data.
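A minimal check in the spirit of the extractability definition commonly used in this line of work: prompt the model with a prefix of a training sequence and count it as memorized if greedy decoding reproduces the continuation exactly. The model and token lengths below are placeholders.

```python
# Check whether a training sequence is extractable: split into prefix and
# continuation, prompt with the prefix, and compare the greedy continuation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def is_memorized(text, prefix_tokens=50, continuation_tokens=50):
    ids = tok(text, return_tensors="pt").input_ids[0]
    prefix = ids[:prefix_tokens].unsqueeze(0)
    target = ids[prefix_tokens:prefix_tokens + continuation_tokens]
    out = model.generate(prefix, max_new_tokens=len(target), do_sample=False,
                         pad_token_id=tok.eos_token_id)
    # Memorized iff the greedy continuation matches the training text exactly.
    return torch.equal(out[0][prefix_tokens:], target)
```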
arXiv Detail & Related papers (2022-02-15T18:48:31Z) - Automated PII Extraction from Social Media for Raising Privacy Awareness: A Deep Transfer Learning Approach [6.806025738284367]
Internet users have been exposing an increasing amount of Personally Identifiable Information (PII) on social media.
In this study, we propose the Deep Transfer Learning for PII Extraction (DTL-PIIE) framework to address limitations of existing PII extraction approaches.
Our framework can facilitate various applications, such as PII misuse prediction and privacy risk assessment.
arXiv Detail & Related papers (2021-11-11T19:32:05Z)
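A generic sketch of the entity-tagging step behind PII extraction from free text, using an off-the-shelf token-classification model; DTL-PIIE's transfer-learning components are not reproduced here, and the checkpoint name is simply a commonly available NER model, not the one used in the paper.

```python
# Tag PII-like entities in social media text with a generic NER model.
from transformers import pipeline

ner = pipeline("token-classification",
               model="dslim/bert-base-NER",       # common public NER checkpoint
               aggregation_strategy="simple")

post = "Just moved! Reach me at 555-0142, I'm John Smith from Boston."
for ent in ner(post):
    # Entity types such as PER (person) and LOC (location) are candidate PII;
    # pattern-based rules would still be needed for phone numbers, emails, etc.
    print(ent["entity_group"], ent["word"], round(ent["score"], 3))
```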