Real-Time Privacy Risk Measurement with Privacy Tokens for Gradient Leakage
- URL: http://arxiv.org/abs/2502.02913v4
- Date: Wed, 12 Feb 2025 04:59:16 GMT
- Title: Real-Time Privacy Risk Measurement with Privacy Tokens for Gradient Leakage
- Authors: Jiayang Meng, Tao Huang, Hong Chen, Xin Shi, Qingyu Huang, Chen Hou
- Abstract summary: The widespread deployment of deep learning models in privacy-sensitive domains has amplified concerns regarding privacy risks.
We propose the concept of privacy tokens, which are derived directly from private gradients during training.
Privacy tokens offer valuable insights into the extent of private information leakage from training data.
We employ Mutual Information (MI) as a robust metric to quantify the relationship between training data and gradients.
- Score: 15.700803673467641
- License:
- Abstract: The widespread deployment of deep learning models in privacy-sensitive domains has amplified concerns regarding privacy risks, particularly those stemming from gradient leakage during training. Current privacy assessments primarily rely on post-training attack simulations. However, these methods are inherently reactive, unable to encompass all potential attack scenarios, and often based on idealized adversarial assumptions. These limitations underscore the need for proactive approaches to privacy risk assessment during the training process. To address this gap, we propose the concept of privacy tokens, which are derived directly from private gradients during training. Privacy tokens encapsulate gradient features and, when combined with data features, offer valuable insights into the extent of private information leakage from training data, enabling real-time measurement of privacy risks without relying on adversarial attack simulations. Additionally, we employ Mutual Information (MI) as a robust metric to quantify the relationship between training data and gradients, providing precise and continuous assessments of privacy leakage throughout the training process. Extensive experiments validate our framework, demonstrating the effectiveness of privacy tokens and MI in identifying and quantifying privacy risks. This proactive approach marks a significant advancement in privacy monitoring, promoting the safer deployment of deep learning models in sensitive applications.
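No implementation accompanies this summary; the sketch below is a minimal, heavily simplified proxy for the MI component only (toy model and data, inputs and per-sample gradients reduced to scalar norms, a plug-in histogram MI estimator), not the paper's privacy-token construction.
```python
# Minimal sketch (not the authors' code): estimate a proxy for the mutual
# information between per-sample training inputs and their gradients.
# Both quantities are reduced to scalar summaries (norms) and MI is estimated
# with a simple 2-D histogram plug-in estimator. The model, data, and the
# reduction to scalars are illustrative assumptions, not the paper's design.
import numpy as np
import torch
import torch.nn as nn

def histogram_mi(x, y, bins=32):
    """Plug-in MI estimate (in nats) from a 2-D histogram of two scalar arrays."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nonzero = pxy > 0
    return float((pxy[nonzero] * np.log(pxy[nonzero] / (px @ py)[nonzero])).sum())

# Toy model and data (stand-ins for the private training set and network).
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
loss_fn = nn.CrossEntropyLoss()
inputs = torch.randn(256, 20)
labels = torch.randint(0, 2, (256,))

input_summary, grad_summary = [], []
for x, y in zip(inputs, labels):
    model.zero_grad()
    loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
    loss.backward()
    grad = torch.cat([p.grad.flatten() for p in model.parameters()])
    input_summary.append(x.norm().item())    # scalar summary of the sample
    grad_summary.append(grad.norm().item())  # scalar summary of its gradient

mi_estimate = histogram_mi(np.array(input_summary), np.array(grad_summary))
print(f"proxy MI(data, gradient) ≈ {mi_estimate:.3f} nats")
```
The paper's privacy tokens encapsulate gradient features rather than a single norm; this scalar reduction only illustrates the kind of real-time measurement loop that runs alongside training without simulating an attack.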
Related papers
- Enforcing Demographic Coherence: A Harms Aware Framework for Reasoning about Private Data Release [14.939460540040459]
We introduce demographic coherence, a condition inspired by privacy attacks that we argue is necessary for data privacy.
Our framework focuses on confidence-rated predictors, which can in turn be distilled from almost any data-informed process.
We prove that every differentially private data release is also demographically coherent, and that there are demographically coherent algorithms which are not differentially private.
arXiv Detail & Related papers (2025-02-04T20:42:30Z)
- Privacy in Fine-tuning Large Language Models: Attacks, Defenses, and Future Directions [11.338466798715906]
Fine-tuning Large Language Models (LLMs) can achieve state-of-the-art performance across various domains.
This paper provides a comprehensive survey of privacy challenges associated with fine-tuning LLMs.
We highlight vulnerabilities to various privacy attacks, including membership inference, data extraction, and backdoor attacks.
arXiv Detail & Related papers (2024-12-21T06:41:29Z)
- Activity Recognition on Avatar-Anonymized Datasets with Masked Differential Privacy [64.32494202656801]
Privacy-preserving computer vision is an important emerging problem in machine learning and artificial intelligence.
We present an anonymization pipeline that replaces sensitive human subjects in video datasets with synthetic avatars within context.
We also propose MaskDP to protect non-anonymized but privacy-sensitive background information.
arXiv Detail & Related papers (2024-10-22T15:22:53Z)
- PrivacyLens: Evaluating Privacy Norm Awareness of Language Models in Action [54.11479432110771]
PrivacyLens is a novel framework designed to extend privacy-sensitive seeds into expressive vignettes and further into agent trajectories.
We instantiate PrivacyLens with a collection of privacy norms grounded in privacy literature and crowdsourced seeds.
State-of-the-art LMs, like GPT-4 and Llama-3-70B, leak sensitive information in 25.68% and 38.69% of cases, even when prompted with privacy-enhancing instructions.
arXiv Detail & Related papers (2024-08-29T17:58:38Z)
- Preserving Privacy in Large Language Models: A Survey on Current Threats and Solutions [12.451936012379319]
Large Language Models (LLMs) represent a significant advancement in artificial intelligence, finding applications across various domains.
Their reliance on massive internet-sourced training datasets raises notable privacy concerns.
Certain application-specific scenarios may require fine-tuning these models on private data.
arXiv Detail & Related papers (2024-08-10T05:41:19Z)
- Collection, usage and privacy of mobility data in the enterprise and public administrations [55.2480439325792]
Security measures such as anonymization are needed to protect individuals' privacy.
Within our study, we conducted expert interviews to gain insights into practices in the field.
We survey privacy-enhancing methods in use, which generally do not comply with state-of-the-art standards of differential privacy.
arXiv Detail & Related papers (2024-07-04T08:29:27Z)
- Initialization Matters: Privacy-Utility Analysis of Overparameterized Neural Networks [72.51255282371805]
We prove a privacy bound for the KL divergence between model distributions on worst-case neighboring datasets.
We find that this KL privacy bound is largely determined by the expected squared gradient norm with respect to the model parameters during training (a schematic version of such a bound is sketched after this entry).
arXiv Detail & Related papers (2023-10-31T16:13:22Z)
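The bound is only stated informally in this summary. As a hedged illustration (a standard argument for noisy gradient descent with Gaussian noise, not the cited paper's theorem), the chain rule for KL divergence over the training trajectory combined with the KL formula for equal-covariance Gaussians gives, for neighboring datasets D and D':
```latex
% Schematic only: textbook noisy-GD bound, not the cited paper's exact result.
% g_t is the full-batch gradient at step t, eta the step size,
% sigma^2 the variance of the injected Gaussian noise.
\[
  \theta_{t+1} = \theta_t - \eta\, g_t(D) + \mathcal{N}(0, \sigma^2 I)
\]
\[
  \mathrm{KL}\big(\theta_T(D)\,\big\|\,\theta_T(D')\big)
  \;\le\;
  \sum_{t=0}^{T-1} \mathbb{E}\!\left[\frac{\eta^2\,\lVert g_t(D)-g_t(D')\rVert^2}{2\sigma^2}\right]
\]
```
Since neighboring datasets differ in a single record out of n, the gradient difference is at most 2/n times the largest per-sample gradient norm, which is how expected squared gradient norms enter such a bound.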
- PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind).
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z)
- How Do Input Attributes Impact the Privacy Loss in Differential Privacy? [55.492422758737575]
We study the connection between the per-subject norm in DP neural networks and individual privacy loss.
We introduce a novel metric termed the Privacy Loss-Input Susceptibility (PLIS), which allows one to apportion a subject's privacy loss to their input attributes (a rough per-sample gradient-norm sketch follows this entry).
arXiv Detail & Related papers (2022-11-18T11:39:03Z)
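The per-subject gradient norm referenced above is the quantity that DP-SGD clips and noises. The sketch below is an illustration under placeholder assumptions (toy model and data, and a simple input-saliency proxy for attribute-level attribution), not the paper's PLIS construction.
```python
# Illustration only, not the PLIS metric from the cited paper: compute each
# sample's parameter-gradient norm (the quantity DP-SGD clips) and, as a crude
# proxy for attributing influence to input attributes, the magnitude of the
# loss gradient with respect to each input feature. Model and data are toys.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 2))
loss_fn = nn.CrossEntropyLoss()
inputs = torch.randn(32, 10)
labels = torch.randint(0, 2, (32,))

for i, (x, y) in enumerate(zip(inputs, labels)):
    x = x.clone().requires_grad_(True)  # track gradients w.r.t. the input too
    model.zero_grad()
    loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
    loss.backward()

    # Per-sample parameter-gradient norm: how strongly this subject pulls on
    # the shared model update (what DP-SGD would clip).
    param_grad_norm = torch.cat(
        [p.grad.flatten() for p in model.parameters()]).norm().item()

    # Per-attribute sensitivity |d loss / d x_j|: a simple stand-in for
    # apportioning that influence across the subject's input attributes.
    attribute_sensitivity = x.grad.abs().detach()

    if i < 3:  # print a few examples
        top = torch.topk(attribute_sensitivity, k=3)
        print(f"sample {i}: grad norm {param_grad_norm:.3f}, "
              f"most sensitive attributes {top.indices.tolist()}")
```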
- On the Privacy Effect of Data Enhancement via the Lens of Memorization [20.63044895680223]
We propose to investigate privacy from a new perspective called memorization.
Through the lens of memorization, we find that previously deployed membership inference attacks (MIAs) produce misleading results, as they are less likely to identify the samples with higher privacy risks.
We demonstrate that the generalization gap and privacy leakage are less correlated than previous results suggested.
arXiv Detail & Related papers (2022-08-17T13:02:17Z)
- Systematic Evaluation of Privacy Risks of Machine Learning Models [41.017707772150835]
We show that prior work on membership inference attacks may severely underestimate the privacy risks.
We first propose to benchmark membership inference privacy risks by improving existing non-neural network based inference attacks.
We then introduce a new approach for fine-grained privacy analysis by formulating and deriving a new metric called the privacy risk score (a toy illustration follows this entry).
arXiv Detail & Related papers (2020-03-24T00:53:53Z)
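As a toy illustration of a per-sample privacy risk score, the sketch below estimates the posterior probability that a sample was a training member given the model's loss on it, using empirical loss histograms from known members and non-members. The loss-histogram construction is an assumption in the spirit of the metric, not the paper's released code.
```python
# Minimal sketch (assumptions, not the cited paper's code): a loss-based
# per-sample privacy risk score, i.e. P(member | loss), estimated from
# empirical loss histograms of known member and non-member populations.
import numpy as np

def privacy_risk_scores(member_losses, nonmember_losses, target_losses,
                        bins=50, prior_member=0.5):
    """P(member | loss) for each target loss, via a binned Bayes estimate."""
    edges = np.histogram_bin_edges(
        np.concatenate([member_losses, nonmember_losses]), bins=bins)
    p_loss_member, _ = np.histogram(member_losses, bins=edges, density=True)
    p_loss_nonmember, _ = np.histogram(nonmember_losses, bins=edges, density=True)
    idx = np.clip(np.digitize(target_losses, edges) - 1, 0, bins - 1)
    num = p_loss_member[idx] * prior_member
    den = num + p_loss_nonmember[idx] * (1.0 - prior_member)
    return num / np.maximum(den, 1e-12)  # avoid division by zero in empty bins

# Toy usage: members tend to have lower loss than non-members.
rng = np.random.default_rng(0)
member_losses = rng.gamma(shape=1.0, scale=0.3, size=5000)
nonmember_losses = rng.gamma(shape=2.0, scale=0.6, size=5000)
scores = privacy_risk_scores(member_losses, nonmember_losses,
                             target_losses=np.array([0.05, 0.5, 2.0]))
print(scores)  # higher score -> higher inferred membership (privacy) risk
```
Samples whose scores are close to 1 are the ones a loss-based membership inference adversary identifies most reliably, which is the fine-grained risk this benchmark aims to expose.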
This list is automatically generated from the titles and abstracts of the papers indexed on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.