DP-BART for Privatized Text Rewriting under Local Differential Privacy
- URL: http://arxiv.org/abs/2302.07636v2
- Date: Tue, 6 Jun 2023 14:17:46 GMT
- Title: DP-BART for Privatized Text Rewriting under Local Differential Privacy
- Authors: Timour Igamberdiev and Ivan Habernal
- Abstract summary: We propose a new system 'DP-BART' that largely outperforms existing LDP systems.
Our approach uses a novel clipping method, iterative pruning, and further training of internal representations, which drastically reduces the amount of noise required for DP guarantees.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Privatized text rewriting with local differential privacy (LDP) is a recent
approach that enables sharing of sensitive textual documents while formally
guaranteeing privacy protection to individuals. However, existing systems face
several issues, such as formal mathematical flaws, unrealistic privacy
guarantees, privatization of only individual words, as well as a lack of
transparency and reproducibility. In this paper, we propose a new system
'DP-BART' that largely outperforms existing LDP systems. Our approach uses a
novel clipping method, iterative pruning, and further training of internal
representations, which drastically reduces the amount of noise required for DP
guarantees. We run experiments on five textual datasets of varying sizes,
rewriting them at different privacy guarantees and evaluating the rewritten
texts on downstream text classification tasks. Finally, we thoroughly discuss
the privatized text rewriting approach and its limitations, including the
problem of the strict text adjacency constraint in the LDP paradigm that leads
to the high noise requirement.
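The abstract describes the general LDP recipe that DP-BART builds on: bound each internal representation via clipping, then add noise calibrated to the resulting sensitivity before releasing it. The sketch below illustrates that generic clip-then-noise step with per-coordinate clipping and Laplace noise; the function name, the clipping bound `c`, and the use of L1 sensitivity are illustrative assumptions, not DP-BART's actual mechanism or API.

```python
import numpy as np

def clip_and_noise(z, c, epsilon, rng=None):
    """Illustrative LDP release of a latent vector z.

    Each coordinate is clipped to [-c, c], so any two clipped vectors
    differ by at most 2*c per coordinate (L1 sensitivity 2*c*dim).
    Laplace noise with scale b = sensitivity / epsilon then gives an
    epsilon-LDP release of the vector.
    """
    rng = np.random.default_rng() if rng is None else rng
    z = np.asarray(z, dtype=float)
    z_clipped = np.clip(z, -c, c)
    sensitivity = 2.0 * c * z.size        # worst-case L1 distance after clipping
    scale = sensitivity / epsilon         # Laplace scale b = Delta_1 / epsilon
    return z_clipped + rng.laplace(0.0, scale, size=z.shape)
```

Note how the noise scale grows with both the clipping bound and the dimensionality: shrinking the representation (e.g. by pruning) or tightening the clipping range directly reduces the noise needed, which is the trade-off the abstract highlights.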
Related papers
- On the Impact of Noise in Differentially Private Text Rewriting [3.0177210416625124]
We introduce a new sentence infilling privatization technique, and we use this method to explore the effect of noise in DP text rewriting.
We empirically demonstrate that non-DP privatization techniques excel in utility preservation and can find an acceptable empirical privacy-utility trade-off, yet cannot outperform DP methods in empirical privacy protections.
arXiv Detail & Related papers (2025-01-31T10:45:24Z) - Enhancing Feature-Specific Data Protection via Bayesian Coordinate Differential Privacy [55.357715095623554]
Local Differential Privacy (LDP) offers strong privacy guarantees without requiring users to trust external parties.
We propose a Bayesian framework, Bayesian Coordinate Differential Privacy (BCDP), that enables feature-specific privacy quantification.
arXiv Detail & Related papers (2024-10-24T03:39:55Z) - Activity Recognition on Avatar-Anonymized Datasets with Masked Differential Privacy [64.32494202656801]
Privacy-preserving computer vision is an important emerging problem in machine learning and artificial intelligence.
We present an anonymization pipeline that replaces sensitive human subjects in video datasets with synthetic avatars within context.
We also propose MaskDP to protect non-anonymized but privacy-sensitive background information.
arXiv Detail & Related papers (2024-10-22T15:22:53Z) - Thinking Outside of the Differential Privacy Box: A Case Study in Text Privatization with Language Model Prompting [3.3916160303055567]
We discuss the restrictions that Differential Privacy (DP) integration imposes, as well as bring to light the challenges that such restrictions entail.
Our results demonstrate the need for more discussion on the usability of DP in NLP and its benefits over non-DP approaches.
arXiv Detail & Related papers (2024-10-01T14:46:15Z) - Just Rewrite It Again: A Post-Processing Method for Enhanced Semantic Similarity and Privacy Preservation of Differentially Private Rewritten Text [3.3916160303055567]
We propose a simple post-processing method based on the goal of aligning rewritten texts with their original counterparts.
Our results show that such an approach not only produces outputs that are more semantically reminiscent of the original inputs, but also texts which score on average better in empirical privacy evaluations.
arXiv Detail & Related papers (2024-05-30T08:41:33Z) - InferDPT: Privacy-Preserving Inference for Black-box Large Language Model [66.07752875835506]
InferDPT is the first practical framework for privacy-preserving inference with black-box LLMs.
RANTEXT is a novel differential privacy mechanism integrated into the perturbation module of InferDPT.
arXiv Detail & Related papers (2023-10-18T18:00:11Z) - How Do Input Attributes Impact the Privacy Loss in Differential Privacy? [55.492422758737575]
We study the connection between the per-subject norm in DP neural networks and individual privacy loss.
We introduce a novel metric termed the Privacy Loss-Input Susceptibility (PLIS), which allows one to apportion the subject's privacy loss to their input attributes.
arXiv Detail & Related papers (2022-11-18T11:39:03Z) - DP-Rewrite: Towards Reproducibility and Transparency in Differentially Private Text Rewriting [2.465904360857451]
We introduce DP-Rewrite, an open-source framework for differentially private text rewriting.
Our system incorporates a variety of downstream datasets, models, pre-training procedures, and evaluation metrics.
We provide a set of experiments as a case study on the ADePT DP text rewriting system, detecting a privacy leak in its pre-training approach.
arXiv Detail & Related papers (2022-08-22T15:38:16Z) - Privacy Amplification via Shuffling for Linear Contextual Bandits [51.94904361874446]
We study the contextual linear bandit problem with differential privacy (DP).
We show that a privacy/utility trade-off between joint differential privacy (JDP) and local differential privacy (LDP) can be achieved by leveraging the shuffle model of privacy while preserving local privacy.
arXiv Detail & Related papers (2021-12-11T15:23:28Z) - Beyond The Text: Analysis of Privacy Statements through Syntactic and Semantic Role Labeling [12.74252812104216]
This paper formulates a new task of extracting privacy parameters from a privacy policy, through the lens of Contextual Integrity.
We show that traditional NLP tasks, including the recently proposed Question-Answering based solutions, are insufficient to address the privacy parameter extraction problem.
arXiv Detail & Related papers (2020-10-01T20:48:37Z) - Private Reinforcement Learning with PAC and Regret Guarantees [69.4202374491817]
We design privacy-preserving exploration policies for episodic reinforcement learning (RL).
We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP).
We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds, and enjoys a JDP guarantee.
arXiv Detail & Related papers (2020-09-18T20:18:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the information presented and is not responsible for any consequences arising from its use.