Differentially Private Representation for NLP: Formal Guarantee and An
Empirical Study on Privacy and Fairness
- URL: http://arxiv.org/abs/2010.01285v1
- Date: Sat, 3 Oct 2020 05:58:32 GMT
- Title: Differentially Private Representation for NLP: Formal Guarantee and An
Empirical Study on Privacy and Fairness
- Authors: Lingjuan Lyu, Xuanli He, Yitong Li
- Abstract summary: It has been demonstrated that the hidden representations learned by a deep model can encode private information about the input.
We propose Differentially Private Neural Representation (DPNR) to preserve the privacy of the extracted representation from text.
- Score: 38.90014773292902
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It has been demonstrated that the hidden representations learned by a deep
model can encode private information about the input and hence can be exploited
to recover such information with reasonable accuracy. To address this issue, we propose a
novel approach called Differentially Private Neural Representation (DPNR) to
preserve the privacy of the extracted representation from text. DPNR utilises
Differential Privacy (DP) to provide a formal privacy guarantee. Further, we
show that masking words via dropout can further enhance privacy. To maintain the
utility of the learned representation, we integrate the DP-noisy representation
into a robust training process to derive a robust target model, which also
improves model fairness across various demographic variables. Experimental
results on benchmark datasets under various parameter settings demonstrate that
DPNR largely reduces privacy leakage without significantly sacrificing the main
task performance.
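The abstract describes the mechanism concretely enough to sketch. As background, a randomized mechanism $M$ is $\epsilon$-differentially private if $\Pr[M(D) \in S] \le e^{\epsilon}\Pr[M(D') \in S]$ for all neighboring inputs $D, D'$ and all output sets $S$. Below is a minimal, hypothetical PyTorch sketch, not the authors' code, of the two ingredients the abstract names: word-level dropout masking followed by a Laplace mechanism on a clipped sentence representation. The function and parameter names (dpnr_representation, clip_norm, dropout_p) are assumptions for illustration.

```python
import torch

def dpnr_representation(embeddings, epsilon, clip_norm=1.0, dropout_p=0.2):
    # embeddings: (batch, seq_len, dim) token embeddings from any text encoder.
    # 1) Word dropout: mask whole tokens at random to hide private words.
    keep = (torch.rand(embeddings.shape[:2], device=embeddings.device)
            > dropout_p).float().unsqueeze(-1)
    masked = embeddings * keep

    # 2) Pool to a fixed-size sentence representation.
    rep = masked.mean(dim=1)

    # 3) Clip the L1 norm so the Laplace mechanism's sensitivity is bounded.
    l1 = rep.abs().sum(dim=-1, keepdim=True).clamp(min=1e-12)
    rep = rep * torch.clamp(clip_norm / l1, max=1.0)

    # 4) Laplace mechanism: noise scale = sensitivity / epsilon.
    noise = torch.distributions.Laplace(0.0, clip_norm / epsilon).sample(rep.shape)
    return rep + noise.to(rep.device)
```

Clipping bounds how much any single input can move the representation in L1 norm, so a Laplace scale of clip_norm / epsilon yields an $\epsilon$-DP output (up to a constant factor depending on the adjacency convention); the dropout mask adds word-level protection on top of the formal guarantee.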
Related papers
- Differentially Private Fine-Tuning of Diffusion Models [22.454127503937883]
The integration of Differential Privacy with diffusion models (DMs) presents a promising yet challenging frontier.
Recent developments in this field have highlighted the potential for generating high-quality synthetic data by pre-training on public data.
We propose a strategy optimized for private diffusion models, which minimizes the number of trainable parameters to enhance the privacy-utility trade-off.
arXiv Detail & Related papers (2024-06-03T14:18:04Z)
- Probing the Transition to Dataset-Level Privacy in ML Models Using an Output-Specific and Data-Resolved Privacy Profile [23.05994842923702]
We study a privacy metric that quantifies the extent to which a model trained on a dataset using a Differential Privacy mechanism is "covered" by each of the distributions resulting from training on neighboring datasets.
We show that the privacy profile can be used to probe an observed transition to indistinguishability that takes place in the neighboring distributions as $\epsilon$ decreases.
arXiv Detail & Related papers (2023-06-27T20:39:07Z)
- How Do Input Attributes Impact the Privacy Loss in Differential Privacy? [55.492422758737575]
We study the connection between the per-subject norm in DP neural networks and individual privacy loss.
We introduce a novel metric termed the Privacy Loss-Input Susceptibility (PLIS) which allows one to apportion the subject's privacy loss to their input attributes.
arXiv Detail & Related papers (2022-11-18T11:39:03Z)
- Fair NLP Models with Differentially Private Text Encoders [1.7434507809930746]
We propose FEDERATE, an approach that combines ideas from differential privacy and adversarial training to learn private text representations.
We empirically evaluate the trade-off between the privacy of the representations and the fairness and accuracy of the downstream model on four NLP datasets.
arXiv Detail & Related papers (2022-05-12T14:58:38Z)
- Just Fine-tune Twice: Selective Differential Privacy for Large Language Models [69.66654761324702]
We propose a simple yet effective just-fine-tune-twice privacy mechanism to achieve SDP for large Transformer-based language models.
Experiments show that our models achieve strong performance while staying robust to the canary insertion attack.
arXiv Detail & Related papers (2022-04-15T22:36:55Z)
- Federated Deep Learning with Bayesian Privacy [28.99404058773532]
Federated learning (FL) aims to protect data privacy by cooperatively learning a model without sharing private data among users.
Homomorphic encryption (HE) based methods provide secure privacy protections but suffer from extremely high computational and communication overheads.
Deep learning with Differential Privacy (DP) has been implemented as a practical learning algorithm at a manageable cost in complexity; a minimal DP-SGD sketch is given after this list.
arXiv Detail & Related papers (2021-09-27T12:48:40Z)
- Partial sensitivity analysis in differential privacy [58.730520380312676]
We investigate the impact of each input feature on the individual's privacy loss.
We experimentally evaluate our approach on queries over private databases.
We also explore our findings in the context of neural network training on synthetic data.
arXiv Detail & Related papers (2021-09-22T08:29:16Z)
- PEARL: Data Synthesis via Private Embeddings and Adversarial Reconstruction Learning [1.8692254863855962]
We propose a new framework for synthesizing data using deep generative models in a differentially private manner.
Within our framework, sensitive data are sanitized with rigorous privacy guarantees in a one-shot fashion.
Our proposal has theoretical guarantees of performance, and empirical evaluations on multiple datasets show that our approach outperforms other methods at reasonable levels of privacy.
arXiv Detail & Related papers (2021-06-08T18:00:01Z)
- Private Reinforcement Learning with PAC and Regret Guarantees [69.4202374491817]
We design privacy-preserving exploration policies for episodic reinforcement learning (RL).
We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP).
We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds, and enjoys a JDP guarantee.
arXiv Detail & Related papers (2020-09-18T20:18:35Z)
- RDP-GAN: A Rényi-Differential Privacy based Generative Adversarial Network [75.81653258081435]
Generative adversarial network (GAN) has attracted increasing attention recently owing to its impressive ability to generate realistic samples with high privacy protection.
However, when GANs are applied to sensitive or private training examples, such as medical or financial records, they may still divulge individuals' sensitive and private information.
We propose a Rényi-differentially private GAN (RDP-GAN), which achieves differential privacy (DP) in a GAN by carefully adding random noise to the value of the loss function during training.
arXiv Detail & Related papers (2020-07-04T09:51:02Z)
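The Federated Deep Learning with Bayesian Privacy entry above refers to DP training of deep models, which typically follows the DP-SGD recipe of Abadi et al. (2016): clip each example's gradient, then add Gaussian noise before the parameter update. Below is a minimal, hypothetical PyTorch sketch of that recipe, not code from any paper listed here; the name dp_sgd_step and its parameters are assumptions.

```python
import torch

def dp_sgd_step(model, loss_fn, xs, ys, optimizer, clip_norm=1.0, noise_mult=1.0):
    # Accumulator for the clipped per-example gradients.
    summed = [torch.zeros_like(p) for p in model.parameters()]

    # 1) Compute and clip each example's gradient to bound its sensitivity.
    for x, y in zip(xs, ys):
        model.zero_grad()
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        loss.backward()
        grads = [p.grad.detach().clone() for p in model.parameters()]
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (clip_norm / (total_norm + 1e-12)).clamp(max=1.0)
        for s, g in zip(summed, grads):
            s.add_(g * scale)

    # 2) Add Gaussian noise calibrated to the clipping bound, average, step.
    for p, s in zip(model.parameters(), summed):
        noise = torch.randn_like(s) * noise_mult * clip_norm
        p.grad = (s + noise) / len(xs)
    optimizer.step()
```

In practice one would rely on a library such as Opacus, which vectorizes the per-example gradient computation and tracks the cumulative privacy budget; this sketch omits privacy accounting entirely.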
This list is automatically generated from the titles and abstracts of the papers on this site.