Training Data Leakage Analysis in Language Models
- URL: http://arxiv.org/abs/2101.05405v2
- Date: Mon, 22 Feb 2021 23:53:08 GMT
- Title: Training Data Leakage Analysis in Language Models
- Authors: Huseyin A. Inan, Osman Ramadan, Lukas Wutschitz, Daniel Jones, Victor
Rühle, James Withers, Robert Sim
- Abstract summary: We introduce a methodology for identifying the user content in the training data that could be leaked under a strong and realistic threat model.
We propose two metrics to quantify user-level data leakage by measuring a model's ability to produce unique sentence fragments within training data.
- Score: 6.843491191969066
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in neural network based language models have led to
successful deployments of such models, improving user experience in various
applications. It has been demonstrated that the strong performance of language
models comes with the ability to memorize rare training samples, which poses
serious privacy threats when the model is trained on confidential user content.
In this work, we introduce a methodology for identifying the user content in
the training data that could be leaked under a strong and realistic threat
model. We propose two metrics to quantify user-level data leakage by measuring
a model's ability to produce unique sentence fragments within training data.
Our metrics further enable comparing different models trained on the same data
in terms of privacy. We demonstrate our approach through extensive numerical
studies on both RNN and Transformer based models. We further illustrate how the
proposed metrics can be utilized to investigate the efficacy of mitigations
like differentially private training or API hardening.
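The abstract's core idea, measuring how often a model reproduces unique sentence fragments from its training data, can be illustrated with a minimal sketch. The `leakage_rate` function, the greedy-completion interface, and the toy fragments below are illustrative assumptions, not the paper's exact metric definitions.

```python
# Minimal leakage probe: count how often a model exactly completes
# unique (user-specific) training fragments. A real evaluation would
# query a trained LM; here a toy dict stands in for greedy decoding.

def leakage_rate(complete, fragments):
    """Fraction of (prefix, continuation) pairs the model reproduces verbatim."""
    leaked = sum(1 for prefix, cont in fragments if complete(prefix) == cont)
    return leaked / len(fragments)

# Toy stand-in for a trained LM's greedy continuation function.
memorized = {
    "Alice's SSN is": " 123-45-6789",  # unique, user-specific fragment
    "the weather is": " nice today",   # common phrase, not user-specific
}

def toy_complete(prefix):
    return memorized.get(prefix, " <unk>")

# Unique fragments drawn from (hypothetical) training data.
fragments = [
    ("Alice's SSN is", " 123-45-6789"),
    ("Bob's password is", " hunter2"),
]
print(leakage_rate(toy_complete, fragments))  # 0.5 -> half the fragments leak
```

A lower rate under the same fragment set would indicate less memorization, which is how such a score enables the privacy comparison of different models trained on the same data.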
Related papers
- Transferable Post-training via Inverse Value Learning [83.75002867411263]
  We propose modeling changes at the logits level during post-training using a separate neural network (i.e., the value network).
  After training this network on a small base model using demonstrations, it can be seamlessly integrated with other pre-trained models during inference.
  We demonstrate that the resulting value network has broad transferability across pre-trained models of different parameter sizes.
  arXiv Detail & Related papers (2024-10-28T13:48:43Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
  We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
  We investigate whether it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
  arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Assessing Privacy Risks in Language Models: A Case Study on Summarization Tasks [65.21536453075275]
  We focus on the summarization task and investigate the membership inference (MI) attack.
  We exploit text similarity and the model's resistance to document modifications as potential MI signals.
  We discuss several safeguards for training summarization models to protect against MI attacks, and the inherent trade-off between privacy and utility.
  arXiv Detail & Related papers (2023-10-20T05:44:39Z)
- Recovering from Privacy-Preserving Masking with Large Language Models [14.828717714653779]
  We use large language models (LLMs) to suggest substitutes for masked tokens.
  We show that models trained on the obfuscated corpora achieve performance comparable to those trained on the original data.
  arXiv Detail & Related papers (2023-09-12T16:39:41Z)
- Tools for Verifying Neural Models' Training Data [29.322899317216407]
  "Proof-of-Training-Data" allows a model trainer to convince a verifier of the training data that produced a set of model weights.
  We show experimentally that our verification procedures can catch a wide variety of attacks.
  arXiv Detail & Related papers (2023-07-02T23:27:00Z)
- TRAK: Attributing Model Behavior at Scale [79.56020040993947]
  We present TRAK (Tracing with the Randomly-projected After Kernel), a data attribution method that is both effective and computationally tractable for large-scale, differentiable models.
  arXiv Detail & Related papers (2023-03-24T17:56:22Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
  Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
  This creates a barrier to fusing knowledge across individual models to yield a better single model.
  We propose a dataless knowledge fusion method that merges models in their parameter space.
  arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Q-LSTM Language Model -- Decentralized Quantum Multilingual Pre-Trained Language Model for Privacy Protection [6.0038761646405225]
  Large-scale language models are trained on a massive amount of natural language data that might encode or reflect our private information.
  Malicious agents can reverse engineer the training data even if data sanitization and differential privacy algorithms were involved in the pre-training process.
  We propose a decentralized training framework to address privacy concerns in training large-scale language models.
  arXiv Detail & Related papers (2022-10-06T21:29:17Z)
- Leveraging Adversarial Examples to Quantify Membership Information Leakage [30.55736840515317]
  We develop a novel approach to address the problem of membership inference in pattern recognition models.
  We argue that this quantity reflects the likelihood of belonging to the training data.
  Our method performs comparably to, or even outperforms, state-of-the-art strategies.
  arXiv Detail & Related papers (2022-03-17T19:09:38Z)
- How much pretraining data do language models need to learn syntax? [12.668478784932878]
  Transformer-based pretrained language models achieve outstanding results in many well-known NLU benchmarks.
  We study the impact of pretraining data size on the models' knowledge using RoBERTa.
  arXiv Detail & Related papers (2021-09-07T15:51:39Z)
- Data Augmentation for Spoken Language Understanding via Pretrained Language Models [113.56329266325902]
  Training of spoken language understanding (SLU) models often faces the problem of data scarcity.
  We put forward a data augmentation method using pretrained language models to boost the variability and accuracy of generated utterances.
  arXiv Detail & Related papers (2020-04-29T04:07:12Z)
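Several of the papers above study membership inference. As background, the simplest MI baseline thresholds the model's loss on a candidate sample: training members tend to have lower loss. The function and toy losses below are a generic, illustrative baseline, not the method of any specific paper listed here.

```python
# Loss-threshold membership inference baseline: predict "training member"
# for samples whose loss falls below a chosen threshold. Real attacks
# calibrate the threshold (e.g., via shadow models); the values here are toys.

def mi_predict(losses, threshold):
    """Return membership predictions: True = predicted training member."""
    return [loss < threshold for loss in losses]

# Toy per-sample losses: members tend to score lower than non-members.
member_losses = [0.4, 0.6, 0.5]
nonmember_losses = [2.1, 1.8, 2.5]
preds = mi_predict(member_losses + nonmember_losses, threshold=1.0)
print(preds)  # [True, True, True, False, False, False]
```

Stronger attacks replace the raw loss with richer signals, e.g., the text-similarity and perturbation-robustness signals or the adversarial-perturbation magnitude mentioned in the summaries above.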
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.