Protecting Global Properties of Datasets with Distribution Privacy Mechanisms
- URL: http://arxiv.org/abs/2207.08367v2
- Date: Mon, 10 Apr 2023 12:04:14 GMT
- Title: Protecting Global Properties of Datasets with Distribution Privacy Mechanisms
- Authors: Michelle Chen and Olga Ohrimenko
- Abstract summary: We show how a distribution privacy framework can be applied to formalize such data confidentiality.
We then empirically evaluate the privacy-utility tradeoffs of these mechanisms and apply them against a practical property inference attack.
- Score: 8.19841678851784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of ensuring confidentiality of dataset properties
aggregated over many records of a dataset. Such properties can encode sensitive
information, such as trade secrets or demographic data, while involving a
notion of data protection different to the privacy of individual records
typically discussed in the literature. In this work, we demonstrate how a
distribution privacy framework can be applied to formalize such data
confidentiality. We extend the Wasserstein Mechanism from Pufferfish privacy
and the Gaussian Mechanism from attribute privacy to this framework, then
analyze their underlying data assumptions and how they can be relaxed. We then
empirically evaluate the privacy-utility tradeoffs of these mechanisms and
apply them against a practical property inference attack which targets global
properties of datasets. The results show that our mechanisms can indeed reduce
the effectiveness of the attack while providing utility substantially greater
than a crude group differential privacy baseline. Our work thus provides
groundwork for theoretical mechanisms for protecting global properties of
datasets along with their evaluation in practice.
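As a rough, hypothetical illustration of the flavor of such mechanisms (not the paper's exact construction), the sketch below releases a single global statistic with Gaussian noise. The function name, the proportion statistic, and the `sensitivity` bound between candidate distributions are all illustrative assumptions.

```python
import numpy as np

def release_global_proportion(records, epsilon=1.0, delta=1e-5, sensitivity=0.05):
    """Release the fraction of records with a sensitive attribute, with
    Gaussian noise intended to hide which of two candidate distributions
    generated the dataset.

    `sensitivity` is an assumed bound on how far the statistic can move
    between the candidate distributions; deriving such bounds is the
    technical core of the paper's mechanisms and is not reproduced here.
    """
    proportion = float(np.mean(records))  # records: 0/1 array for the attribute
    # Textbook (epsilon, delta) Gaussian-mechanism calibration.
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return proportion + np.random.normal(0.0, sigma)

# Example: a dataset whose demographic mix is the secret global property.
rng = np.random.default_rng(0)
records = rng.binomial(1, 0.3, size=10_000)
print(release_global_proportion(records))
```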
Related papers
- Bayes-Nash Generative Privacy Protection Against Membership Inference Attacks [24.330984323956173]
We propose a game model for privacy-preserving publishing of data-sharing mechanism outputs.
We introduce the notions of Bayes-Nash generative privacy (BNGP) and Bayes generative privacy (BGP) risk.
We apply our method to sharing summary statistics, where MIAs can re-identify individuals even from aggregated data.
arXiv Detail & Related papers (2024-10-09T20:29:04Z)
- Enhancing User-Centric Privacy Protection: An Interactive Framework through Diffusion Models and Machine Unlearning [54.30994558765057]
The study pioneers a comprehensive privacy protection framework that safeguards image data privacy concurrently during data sharing and model publication.
We propose an interactive image privacy protection framework that utilizes generative machine learning models to modify image information at the attribute level.
Within this framework, we instantiate two modules: a differential privacy diffusion model for protecting attribute information in images and a feature unlearning algorithm for efficient updates of the trained model on the revised image dataset.
arXiv Detail & Related papers (2024-09-05T07:55:55Z)
- Privacy Amplification for the Gaussian Mechanism via Bounded Support [64.86780616066575]
Data-dependent privacy accounting frameworks such as per-instance differential privacy (pDP) and Fisher information loss (FIL) confer fine-grained privacy guarantees for individuals in a fixed training dataset.
We propose simple modifications of the Gaussian mechanism with bounded support, showing that they amplify privacy guarantees under data-dependent accounting.
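A minimal sketch of the bounded-support idea, assuming a scalar statistic whose feasible range [lo, hi] is public; the function name is hypothetical, and the paper's actual truncated/rectified constructions and their accounting are not reproduced.

```python
import numpy as np

def rectified_gaussian_mechanism(value, sigma, lo, hi):
    """Gaussian mechanism whose output is clamped to the public interval
    [lo, hi]. Bounding the support is the ingredient that, per the
    abstract, amplifies data-dependent privacy guarantees; the actual
    pDP/FIL accounting from the paper is not reproduced here.
    """
    noisy = value + np.random.normal(0.0, sigma)
    return float(np.clip(noisy, lo, hi))

# e.g., releasing an average that is known a priori to lie in [0, 1]
print(rectified_gaussian_mechanism(0.42, sigma=0.1, lo=0.0, hi=1.0))
```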
arXiv Detail & Related papers (2024-03-07T21:22:07Z)
- $\alpha$-Mutual Information: A Tunable Privacy Measure for Privacy Protection in Data Sharing [4.475091558538915]
This paper adopts Arimoto's $\alpha$-Mutual Information as a tunable privacy measure.
We formulate a general distortion-based mechanism that manipulates the original data to offer privacy protection.
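For reference, a standard definition of Arimoto's $\alpha$-mutual information for discrete variables (via Rényi entropy and Arimoto's conditional entropy, for $\alpha > 0$, $\alpha \neq 1$; not necessarily the paper's exact notation) is

```latex
I_\alpha(X;Y) = H_\alpha(X) - H_\alpha^{\mathrm{A}}(X \mid Y), \qquad
H_\alpha(X) = \frac{1}{1-\alpha} \log \sum_x p_X(x)^\alpha, \qquad
H_\alpha^{\mathrm{A}}(X \mid Y) = \frac{\alpha}{1-\alpha} \log \sum_y \Bigl( \sum_x p_{XY}(x,y)^\alpha \Bigr)^{1/\alpha}.
```

As $\alpha \to 1$ this recovers Shannon mutual information, which is what makes $\alpha$ a tunable knob over leakage measures.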
arXiv Detail & Related papers (2023-10-27T16:26:14Z)
- PrivacyMind: Large Language Models Can Be Contextual Privacy Protection Learners [81.571305826793]
We introduce Contextual Privacy Protection Language Models (PrivacyMind).
Our work offers a theoretical analysis for model design and benchmarks various techniques.
In particular, instruction tuning with both positive and negative examples stands out as a promising method.
arXiv Detail & Related papers (2023-10-03T22:37:01Z)
- Breaking the Communication-Privacy-Accuracy Tradeoff with $f$-Differential Privacy [51.11280118806893]
We consider a federated data analytics problem in which a server coordinates the collaborative data analysis of multiple users with privacy concerns and limited communication capability.
We study the local differential privacy guarantees of discrete-valued mechanisms with finite output space through the lens of $f$-differential privacy ($f$-DP).
More specifically, we advance the existing literature by deriving tight $f$-DP guarantees for a variety of discrete-valued mechanisms.
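As background on the formalism (a standard fact from Gaussian differential privacy rather than a result of this paper): an $f$-DP guarantee is a tradeoff function lower-bounding an attacker's type II error at each type I error level $\alpha$, and the canonical example is the Gaussian tradeoff curve

```latex
G_\mu(\alpha) = \Phi\bigl(\Phi^{-1}(1 - \alpha) - \mu\bigr),
```

where $\Phi$ is the standard normal CDF; the tight guarantees mentioned above are curves of this kind derived for discrete-valued mechanisms.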
arXiv Detail & Related papers (2023-02-19T16:58:53Z)
- A Linear Reconstruction Approach for Attribute Inference Attacks against Synthetic Data [1.5293427903448022]
We introduce a new attribute inference attack against synthetic data.
We show that our attack can be highly accurate even on arbitrary records.
We then evaluate the tradeoff between protecting privacy and preserving statistical utility.
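A toy sketch of the linear flavor of such an attack, under the assumption that the adversary knows the targets' public attributes and fits a least-squares predictor on the released synthetic data; the function name and data are made up, and the paper's actual reconstruction is more involved.

```python
import numpy as np

def linear_attribute_inference(synth_X, synth_s, target_X):
    """Toy linear attribute-inference attack: fit a least-squares
    predictor of the sensitive attribute on the released *synthetic*
    data, then apply it to the targets' known public attributes.
    """
    A = np.hstack([synth_X, np.ones((synth_X.shape[0], 1))])  # add intercept
    coef, *_ = np.linalg.lstsq(A, synth_s, rcond=None)
    T = np.hstack([target_X, np.ones((target_X.shape[0], 1))])
    return (T @ coef > 0.5).astype(int)  # guess a binary sensitive bit

# Demo with made-up data.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))                            # public attributes
s = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)   # sensitive bit
print(linear_attribute_inference(X, s, X[:5]))
```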
arXiv Detail & Related papers (2023-01-24T14:56:36Z)
- DP2-Pub: Differentially Private High-Dimensional Data Publication with Invariant Post Randomization [58.155151571362914]
We propose a differentially private high-dimensional data publication mechanism (DP2-Pub) that runs in two phases.
Splitting attributes into several low-dimensional clusters with high intra-cluster cohesion and low inter-cluster coupling helps obtain a reasonable privacy budget.
We also extend our DP2-Pub mechanism to the scenario with a semi-honest server which satisfies local differential privacy.
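An illustrative, deliberately simplified version of the clustering phase, using absolute correlation with a hypothetical `threshold` as a stand-in for whatever dependence measure and budget split DP2-Pub actually uses:

```python
import numpy as np

def cluster_attributes(data, threshold=0.3):
    """Greedy correlation-based attribute clustering, a simplified
    stand-in for DP2-Pub's first phase (the paper's dependence measure
    and the privacy budget spent estimating it are not reproduced).
    """
    corr = np.abs(np.corrcoef(data, rowvar=False))
    unassigned = set(range(corr.shape[0]))
    clusters = []
    while unassigned:
        seed = unassigned.pop()
        cluster = {seed}
        for j in list(unassigned):
            if corr[seed, j] > threshold:  # high intra-cluster cohesion
                cluster.add(j)
                unassigned.remove(j)
        clusters.append(sorted(cluster))
    return clusters

# Columns 0 and 1 are nearly collinear, so they should share a cluster.
rng = np.random.default_rng(2)
a = rng.normal(size=(1000, 1))
data = np.hstack([a, a + 0.1 * rng.normal(size=(1000, 1)),
                  rng.normal(size=(1000, 2))])
print(cluster_attributes(data))
```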
arXiv Detail & Related papers (2022-08-24T17:52:43Z)
- Causally Constrained Data Synthesis for Private Data Release [36.80484740314504]
Releasing synthetic data that reflects certain statistical properties of the original data preserves the privacy of the original records.
Prior works utilize differentially private data release mechanisms to provide formal privacy guarantees.
We propose incorporating causal information into the training process to favorably modify the aforementioned trade-off.
arXiv Detail & Related papers (2021-05-27T13:46:57Z)
- Graph-Homomorphic Perturbations for Private Decentralized Learning [64.26238893241322]
The local exchange of estimates allows inference of private data.
Perturbations chosen independently at every agent result in a significant performance loss.
We propose an alternative scheme, which constructs perturbations according to a particular nullspace condition, allowing them to be invisible to the network centroid.
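A minimal sketch of the nullspace idea under a simplifying assumption (plain averaging as the aggregation step; the paper's graph-homomorphic construction depends on the combination graph, which this sketch ignores):

```python
import numpy as np

def zero_sum_perturbations(num_agents, dim, sigma, seed=None):
    """Draw one Gaussian perturbation per agent, then project the set
    onto the zero-sum subspace so they cancel under plain averaging.
    Each agent's shared estimate is masked, yet the network-wide
    average (centroid) is left untouched.
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(num_agents, dim))
    return noise - noise.mean(axis=0, keepdims=True)  # columns sum to 0

pert = zero_sum_perturbations(num_agents=5, dim=3, sigma=1.0, seed=0)
print(np.allclose(pert.sum(axis=0), 0.0))  # True: invisible to the average
```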
arXiv Detail & Related papers (2020-10-23T10:35:35Z)
- Attribute Privacy: Framework and Mechanisms [26.233612860653025]
We initiate the study of attribute privacy, where a data owner is concerned about revealing sensitive properties of a whole dataset during analysis.
We propose definitions to capture attribute privacy in two relevant cases where global attributes may need to be protected.
We provide two efficient mechanisms and one inefficient mechanism that satisfy attribute privacy for these settings.
arXiv Detail & Related papers (2020-09-08T22:38:57Z)