Differentially Private Representation for NLP: Formal Guarantee and An
Empirical Study on Privacy and Fairness
- URL: http://arxiv.org/abs/2010.01285v1
- Date: Sat, 3 Oct 2020 05:58:32 GMT
- Title: Differentially Private Representation for NLP: Formal Guarantee and An
Empirical Study on Privacy and Fairness
- Authors: Lingjuan Lyu, Xuanli He, Yitong Li
- Abstract summary: It has been demonstrated that the hidden representations learned by a deep model can encode private information about the input.
We propose Differentially Private Neural Representation (DPNR) to preserve the privacy of representations extracted from text.
- Score: 38.90014773292902
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It has been demonstrated that the hidden representations learned by a deep model
can encode private information about the input and can therefore be exploited to recover
such information with reasonable accuracy. To address this issue, we propose a
novel approach called Differentially Private Neural Representation (DPNR) to
preserve the privacy of representations extracted from text. DPNR utilises
Differential Privacy (DP) to provide a formal privacy guarantee. Further, we
show that masking words via dropout can further enhance privacy. To maintain
the utility of the learned representation, we integrate the DP-noisy representation
into a robust training process to derive a robust target model, which also
benefits model fairness across various demographic variables. Experimental
results on benchmark datasets under various parameter settings demonstrate that
DPNR largely reduces privacy leakage without significantly sacrificing main-task
performance.
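A minimal sketch of the two ingredients the abstract describes, word masking via dropout and DP noise on the extracted representation, assuming a Laplace mechanism on an L1-clipped sentence vector; the toy encoder, clipping bound, and noise calibration below are illustrative assumptions, not the authors' released implementation.

```python
import torch

def word_dropout(token_ids, drop_prob=0.1, unk_id=0):
    # Randomly replace tokens with an UNK id -- the word-masking step the
    # abstract describes; drop_prob and unk_id are illustrative assumptions.
    mask = torch.rand(token_ids.shape) < drop_prob
    return token_ids.masked_fill(mask, unk_id)

def dp_noisy_representation(rep, epsilon=1.0, clip_bound=1.0):
    # Clip each representation to L1 norm <= clip_bound, then add Laplace
    # noise; with this clipping, changing the input moves the clipped
    # representation by at most 2 * clip_bound in L1, so the noise scale is
    # set to 2 * clip_bound / epsilon (an illustrative calibration).
    l1 = rep.abs().sum(dim=-1, keepdim=True).clamp(min=1e-12)
    rep = rep * torch.clamp(clip_bound / l1, max=1.0)
    scale = 2.0 * clip_bound / epsilon
    noise = torch.distributions.Laplace(0.0, scale).sample(rep.shape)
    return rep + noise

# Toy usage with an embedding-bag "encoder" standing in for a real text encoder.
vocab_size, dim = 1000, 64
encoder = torch.nn.EmbeddingBag(vocab_size, dim)
tokens = torch.randint(1, vocab_size, (8, 20))      # a batch of 8 short sentences
noisy_rep = dp_noisy_representation(encoder(word_dropout(tokens)))
print(noisy_rep.shape)                              # torch.Size([8, 64])
```

The robust training of the target model on these noisy representations, which the abstract credits for maintaining utility and fairness, is omitted from this sketch.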
Related papers
- Masked Differential Privacy [64.32494202656801]
We propose an effective approach called masked differential privacy (DP), which allows for controlling sensitive regions where differential privacy is applied.
Our method operates selectively on the data, allowing non-sensitive temporal regions to be left without DP, or differential privacy to be combined with other privacy techniques within a data sample (see the sketch after this entry).
arXiv Detail & Related papers (2024-10-22T15:22:53Z)
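A minimal sketch of the selective mechanism summarized in the masked-DP entry above, assuming image-like samples and a binary sensitivity mask: Laplace noise is added only inside the masked region and the rest of the sample is left untouched. The mask, noise scale, and data shape are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def masked_dp(sample, sensitive_mask, epsilon=1.0, sensitivity=1.0, rng=None):
    # Apply Laplace noise only where sensitive_mask == 1, leaving the
    # non-sensitive regions of the sample unperturbed (illustrative; the
    # paper's region definition and privacy accounting are not reproduced).
    rng = rng if rng is not None else np.random.default_rng()
    noise = rng.laplace(0.0, sensitivity / epsilon, size=sample.shape)
    return sample + noise * sensitive_mask

rng = np.random.default_rng(0)
frame = rng.uniform(size=(64, 64))                    # a toy single-channel frame
mask = np.zeros((64, 64))
mask[20:40, 20:40] = 1.0                              # hypothetical sensitive region
print(masked_dp(frame, mask, epsilon=2.0).shape)      # (64, 64)
```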
- Private Language Models via Truncated Laplacian Mechanism [18.77713904999236]
We propose a novel private embedding method called the high dimensional truncated Laplacian mechanism.
We show that our method has a lower variance compared to the previous private word embedding methods.
Remarkably, even in the high-privacy regime, our approach incurs only a slight decrease in utility compared to the non-private scenario (see the sketch after this entry).
arXiv Detail & Related papers (2024-10-10T15:25:02Z)
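A minimal sketch of the mechanism named in the entry above: per-coordinate noise drawn from a Laplace distribution truncated to a bounded interval, added to a word embedding. The inverse-CDF sampler and the particular scale and bound are assumptions for illustration; the paper's calibration of these quantities to a privacy budget is not reproduced here.

```python
import numpy as np

def truncated_laplace(scale, bound, size, rng=None):
    # Inverse-CDF sampling from a zero-mean Laplace(scale) restricted to
    # [-bound, bound]; scale and bound are illustrative values only.
    rng = rng if rng is not None else np.random.default_rng()
    def laplace_cdf(x):
        x = np.asarray(x, dtype=float)
        return np.where(x < 0, 0.5 * np.exp(x / scale),
                        1.0 - 0.5 * np.exp(-x / scale))
    lo, hi = laplace_cdf(-bound), laplace_cdf(bound)
    u = rng.uniform(lo, hi, size)
    # Invert the Laplace CDF on the truncated range.
    return np.where(u < 0.5,
                    scale * np.log(2.0 * u),
                    -scale * np.log(2.0 * (1.0 - u)))

def privatize_embedding(embedding, scale=0.5, bound=2.0):
    # Perturb every coordinate of a word embedding with truncated Laplace noise.
    return embedding + truncated_laplace(scale, bound, embedding.shape)

emb = np.random.default_rng(0).normal(size=300)   # a toy 300-d word embedding
print(privatize_embedding(emb)[:5])
```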
- Rethinking Improved Privacy-Utility Trade-off with Pre-existing Knowledge for DP Training [31.559864332056648]
We propose a generic differential privacy framework with heterogeneous noise (DP-Hero).
Atop DP-Hero, we instantiate a heterogeneous version of DP-SGD, where the noise injected into gradient updates is heterogeneous and guided by prior-established model parameters.
We conduct comprehensive experiments to verify and explain the effectiveness of the proposed DP-Hero, showing improved training accuracy compared with state-of-the-art works (see the sketch after this entry).
arXiv Detail & Related papers (2024-09-05T08:40:54Z)
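A minimal sketch of the DP-Hero idea summarized above, assuming a linear model: a standard DP-SGD step with per-example gradient clipping, except that the Gaussian noise scale varies per coordinate and is guided by the magnitudes of prior-established parameters. The weighting rule, model, and hyperparameters are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def heterogeneous_dpsgd_step(w, X, y, prior_w, lr=0.1, clip=1.0,
                             base_sigma=1.0, rng=None):
    # One DP-SGD-style step for a linear model with heterogeneous noise:
    # per-example gradients are clipped in L2 norm, then Gaussian noise is
    # added with a per-coordinate scale that shrinks where the prior model's
    # weights are large (an illustrative weighting rule).
    rng = rng if rng is not None else np.random.default_rng()
    residuals = X @ w - y                        # (n,)
    per_example_grads = residuals[:, None] * X   # (n, d) grads of 0.5*(x.w - y)^2
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    per_example_grads = per_example_grads * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    grad_sum = per_example_grads.sum(axis=0)
    importance = np.abs(prior_w) + 1e-3
    sigmas = base_sigma * importance.mean() / importance   # less noise on "important" coordinates
    noisy_grad = grad_sum + rng.normal(0.0, sigmas * clip, size=w.shape)
    return w - lr * noisy_grad / len(X)

rng = np.random.default_rng(0)
X, y = rng.normal(size=(128, 5)), rng.normal(size=128)
prior_w = rng.normal(size=5)                     # stands in for prior-established parameters
w = heterogeneous_dpsgd_step(np.zeros(5), X, y, prior_w, rng=rng)
print(w)
```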
- Probing the Transition to Dataset-Level Privacy in ML Models Using an Output-Specific and Data-Resolved Privacy Profile [23.05994842923702]
We study a privacy metric that quantifies the extent to which a model trained on a dataset using a Differential Privacy mechanism is "covered" by each of the distributions resulting from training on neighboring datasets.
We show that the privacy profile can be used to probe an observed transition to indistinguishability that takes place in the neighboring distributions as $\epsilon$ decreases.
arXiv Detail & Related papers (2023-06-27T20:39:07Z)
- How Do Input Attributes Impact the Privacy Loss in Differential Privacy? [55.492422758737575]
We study the connection between the per-subject norm in DP neural networks and individual privacy loss.
We introduce a novel metric termed the Privacy Loss-Input Susceptibility (PLIS), which allows one to apportion a subject's privacy loss to their input attributes.
arXiv Detail & Related papers (2022-11-18T11:39:03Z)
- Fair NLP Models with Differentially Private Text Encoders [1.7434507809930746]
We propose FEDERATE, an approach that combines ideas from differential privacy and adversarial training to learn private text representations.
We empirically evaluate the trade-off between the privacy of the representations and the fairness and accuracy of the downstream model on four NLP datasets.
arXiv Detail & Related papers (2022-05-12T14:58:38Z)
- Just Fine-tune Twice: Selective Differential Privacy for Large Language Models [69.66654761324702]
We propose a simple yet effective just-fine-tune-twice privacy mechanism to achieve SDP for large Transformer-based language models.
Experiments show that our models achieve strong performance while staying robust to the canary insertion attack.
arXiv Detail & Related papers (2022-04-15T22:36:55Z)
- Federated Deep Learning with Bayesian Privacy [28.99404058773532]
Federated learning (FL) aims to protect data privacy by cooperatively learning a model without sharing private data among users.
Homomorphic encryption (HE) based methods provide secure privacy protections but suffer from extremely high computational and communication overheads.
Deep learning with Differential Privacy (DP) was implemented as a practical learning algorithm at a manageable cost in complexity.
arXiv Detail & Related papers (2021-09-27T12:48:40Z)
- Private Reinforcement Learning with PAC and Regret Guarantees [69.4202374491817]
We design privacy-preserving exploration policies for episodic reinforcement learning (RL).
We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP).
We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds, and enjoys a JDP guarantee.
arXiv Detail & Related papers (2020-09-18T20:18:35Z)
- RDP-GAN: A Rényi-Differential Privacy based Generative Adversarial Network [75.81653258081435]
Generative adversarial network (GAN) has attracted increasing attention recently owing to its impressive ability to generate realistic samples with high privacy protection.
However, when GANs are applied to sensitive or private training examples, such as medical or financial records, they may still divulge individuals' sensitive and private information.
We propose a Rényi-differentially private GAN (RDP-GAN), which achieves differential privacy (DP) in a GAN by carefully adding random noise to the value of the loss function during training.
arXiv Detail & Related papers (2020-07-04T09:51:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.