Training Data Leakage Analysis in Language Models
- URL: http://arxiv.org/abs/2101.05405v2
- Date: Mon, 22 Feb 2021 23:53:08 GMT
- Title: Training Data Leakage Analysis in Language Models
- Authors: Huseyin A. Inan, Osman Ramadan, Lukas Wutschitz, Daniel Jones, Victor Rühle, James Withers, Robert Sim
- Abstract summary: We introduce a methodology that investigates identifying the user content in the training data that could be leaked under a strong and realistic threat model.
We propose two metrics to quantify user-level data leakage by measuring a model's ability to produce unique sentence fragments within training data.
- Score: 6.843491191969066
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in neural network based language models lead to successful
deployments of such models, improving user experience in various applications.
It has been demonstrated that strong performance of language models comes along
with the ability to memorize rare training samples, which poses serious privacy
threats in case the model is trained on confidential user content. In this
work, we introduce a methodology that investigates identifying the user content
in the training data that could be leaked under a strong and realistic threat
model. We propose two metrics to quantify user-level data leakage by measuring
a model's ability to produce unique sentence fragments within training data.
Our metrics further enable comparing different models trained on the same data
in terms of privacy. We demonstrate our approach through extensive numerical
studies on both RNN and Transformer based models. We further illustrate how the
proposed metrics can be utilized to investigate the efficacy of mitigations
like differentially private training or API hardening.
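
To make the idea of "unique sentence fragments" concrete, here is a minimal Python sketch. It is an illustration only and not the paper's metric definitions: the word-level trigram fragments, the whitespace tokenization, the function names (ngrams, unique_fragments, leakage_report), and the two counts it reports are all assumptions made for this example. It treats a fragment as unique when it occurs in the training text of exactly one user, then checks which of those fragments show up in text sampled from a model.

```python
from collections import defaultdict


def ngrams(tokens, n):
    """Return all contiguous n-token fragments of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def unique_fragments(user_texts, n=3):
    """Map each user to the n-grams found in that user's text and no one else's."""
    owners = defaultdict(set)
    for user, texts in user_texts.items():
        for text in texts:
            for gram in ngrams(text.lower().split(), n):
                owners[gram].add(user)
    unique = defaultdict(set)
    for gram, users in owners.items():
        if len(users) == 1:
            unique[next(iter(users))].add(gram)
    return dict(unique)


def leakage_report(user_texts, generated_texts, n=3):
    """Count user-unique fragments that reappear in model output (hypothetical metric)."""
    unique = unique_fragments(user_texts, n)
    produced = set()
    for text in generated_texts:
        produced.update(ngrams(text.lower().split(), n))
    per_user = {}
    for user, frags in unique.items():
        hit = frags & produced
        if hit:
            per_user[user] = hit
    return {
        "leaked_fragments": sum(len(f) for f in per_user.values()),
        "affected_users": len(per_user),
        "per_user": per_user,
    }


# Toy data, invented for this example; a real study would sample from the model.
user_texts = {
    "alice": ["my access code is 4417 blue"],
    "bob": ["see you at the meeting tomorrow", "see you at the office"],
}
generated_texts = ["my access code is 4417 blue", "thanks for the update today"]

report = leakage_report(user_texts, generated_texts)
print(report["leaked_fragments"], report["affected_users"])
```

Because both counts are computed against the same training data, they could in principle be compared across models trained on that data, which is the kind of comparison the abstract describes. A common phrase can still look unique in a small corpus, so a real analysis would need a more careful notion of uniqueness.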

Related papers
- Deep Contrastive Unlearning for Language Models [9.36216515987051]
 We propose a machine unlearning framework, named Deep Contrastive Unlearning for fine-Tuning (DeepCUT) language models.
Our proposed model achieves machine unlearning by directly optimizing the latent space of a model.
 arXiv  Detail & Related papers  (2025-03-19T04:58:45Z)
- Transferable Post-training via Inverse Value Learning [83.75002867411263]
 We propose modeling changes at the logits level during post-training using a separate neural network (i.e., the value network).
After training this network on a small base model using demonstrations, this network can be seamlessly integrated with other pre-trained models during inference.
We demonstrate that the resulting value network has broad transferability across pre-trained models of different parameter sizes.
 arXiv  Detail & Related papers  (2024-10-28T13:48:43Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
 We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
 arXiv  Detail & Related papers  (2023-10-26T17:59:46Z)
- Assessing Privacy Risks in Language Models: A Case Study on Summarization Tasks [65.21536453075275]
 We focus on the summarization task and investigate the membership inference (MI) attack.
We exploit text similarity and the model's resistance to document modifications as potential MI signals.
We discuss several safeguards for training summarization models to protect against MI attacks and discuss the inherent trade-off between privacy and utility.
 arXiv  Detail & Related papers  (2023-10-20T05:44:39Z)
- Recovering from Privacy-Preserving Masking with Large Language Models [14.828717714653779]
 We use large language models (LLMs) to suggest substitutes of masked tokens.
We show that models trained on the obfuscation corpora are able to achieve comparable performance with the ones trained on the original data.
 arXiv  Detail & Related papers  (2023-09-12T16:39:41Z)
- Tools for Verifying Neural Models' Training Data [29.322899317216407]
 "Proof-of-Training-Data" allows a model trainer to convince a Verifier of the training data that produced a set of model weights.
We show experimentally that our verification procedures can catch a wide variety of attacks.
 arXiv  Detail & Related papers  (2023-07-02T23:27:00Z)
- TRAK: Attributing Model Behavior at Scale [79.56020040993947]
 We present TRAK (Tracing with the Randomly-projected After Kernel), a data attribution method that is both effective and computationally tractable for large-scale, differentiable models.
 arXiv  Detail & Related papers  (2023-03-24T17:56:22Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
 Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
 arXiv  Detail & Related papers  (2022-12-19T20:46:43Z)
- Q-LSTM Language Model -- Decentralized Quantum Multilingual Pre-Trained Language Model for Privacy Protection [6.0038761646405225]
 Large-scale language models are trained on a massive amount of natural language data that might encode or reflect our private information.
 Malicious agents can reverse engineer the training data even if data sanitation and differential privacy algorithms were involved in the pre-training process.
We propose a decentralized training framework to address privacy concerns in training large-scale language models.
 arXiv  Detail & Related papers  (2022-10-06T21:29:17Z)
- Leveraging Adversarial Examples to Quantify Membership Information Leakage [30.55736840515317]
 We develop a novel approach to address the problem of membership inference in pattern recognition models.
We argue that this quantity reflects the likelihood of belonging to the training data.
Our method performs comparably to or even outperforms state-of-the-art strategies.
 arXiv  Detail & Related papers  (2022-03-17T19:09:38Z)
- How much pretraining data do language models need to learn syntax? [12.668478784932878]
 Transformers-based pretrained language models achieve outstanding results in many well-known NLU benchmarks.
We study the impact of pretraining data size on the knowledge of the models using RoBERTa.
 arXiv  Detail & Related papers  (2021-09-07T15:51:39Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
 We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
 arXiv  Detail & Related papers  (2020-10-24T11:55:28Z)
- Data Augmentation for Spoken Language Understanding via Pretrained Language Models [113.56329266325902]
 Training of spoken language understanding (SLU) models often faces the problem of data scarcity.
We put forward a data augmentation method using pretrained language models to boost the variability and accuracy of generated utterances.
 arXiv  Detail & Related papers  (2020-04-29T04:07:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.