Distribution-Invariant Differential Privacy
- URL: http://arxiv.org/abs/2111.05791v1
- Date: Mon, 8 Nov 2021 22:26:50 GMT
- Title: Distribution-Invariant Differential Privacy
- Authors: Xuan Bi and Xiaotong Shen
- Abstract summary: We develop a distribution-invariant privatization (DIP) method to reconcile high statistical accuracy and strict differential privacy.
Under the same strictness of privacy protection, DIP achieves superior statistical accuracy in two simulations and on three real-world benchmarks.
- Score: 4.700764053354502
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differential privacy is becoming a gold standard for protecting the privacy
of publicly shared data. It has been widely used in social science, data
science, public health, information technology, and the U.S. decennial census.
Nevertheless, to guarantee differential privacy, existing methods may
unavoidably alter the conclusions of the original data analysis, as privatization
often changes the sample distribution. This phenomenon is known as the
trade-off between privacy protection and statistical accuracy. In this work, we
break this trade-off by developing a distribution-invariant privatization (DIP)
method to reconcile both high statistical accuracy and strict differential
privacy. As a result, any downstream statistical or machine learning task
yields essentially the same conclusion as if one used the original data.
Numerically, under the same strictness of privacy protection, DIP achieves
superior statistical accuracy in two simulations and on three real-world
benchmarks.
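As a rough illustration of the distribution-invariant idea, the sketch below privatizes a sample through its probability-integral transform: map each value to Uniform(0,1), perturb it with a Laplace mechanism, wrap back into [0,1), and invert the CDF. It assumes the true continuous CDF is known (the paper instead works with an estimated distribution), so this is a minimal sketch of the principle, not the authors' DIP algorithm.

```python
import numpy as np
from scipy import stats

def dip_sketch(x, cdf, inv_cdf, eps, rng=None):
    """Minimal distribution-invariant privatization sketch.

    Assumes the true continuous CDF is known; the paper's DIP method
    uses an estimated distribution instead.
    """
    rng = np.random.default_rng(rng)
    u = cdf(x)                       # probability-integral transform: u ~ Uniform(0, 1)
    noise = rng.laplace(scale=1.0 / eps, size=len(x))  # Laplace mechanism, sensitivity 1 on [0, 1]
    u_priv = (u + noise) % 1.0       # wrapping mod 1 is post-processing, so eps-DP is preserved;
                                     # a uniform plus independent noise, taken mod 1, stays uniform
    return inv_cdf(u_priv)           # output again follows the original distribution

# Toy check: privatized standard-normal data is still (approximately) N(0, 1).
x = np.random.default_rng(0).normal(size=100_000)
x_priv = dip_sketch(x, stats.norm.cdf, stats.norm.ppf, eps=1.0, rng=1)
print(round(x_priv.mean(), 3), round(x_priv.std(), 3))
```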
Related papers
- Masked Differential Privacy [64.32494202656801]
We propose an effective approach called masked differential privacy (DP), which allows for controlling sensitive regions where differential privacy is applied.
Our method operates selectively on data and allows for defining non-sensitive spatio-temporal regions without DP application, or combining differential privacy with other privacy techniques within data samples.
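As a toy sketch of the selective idea (the paper targets spatio-temporal sensor data, and the names below are illustrative, not the paper's API), a standard Laplace mechanism can be restricted to a sensitivity mask:

```python
import numpy as np

def masked_laplace(x, sensitive_mask, eps, sensitivity=1.0, rng=None):
    """Toy selective perturbation: add Laplace noise only where the data
    is flagged sensitive, leaving non-sensitive regions untouched."""
    rng = np.random.default_rng(rng)
    noise = rng.laplace(scale=sensitivity / eps, size=x.shape)
    return np.where(sensitive_mask, x + noise, x)
```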
arXiv Detail & Related papers (2024-10-22T15:22:53Z) - Differentially Private Covariate Balancing Causal Inference [8.133739801185271]
Differential privacy is the leading mathematical framework for privacy protection.
Our algorithm produces both point and interval estimators with statistical guarantees, such as consistency and rate optimality, under a given privacy budget.
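As a generic illustration of a private point-and-interval release (not the paper's covariate-balancing algorithm), one can add Laplace noise to a bounded-data mean and widen the interval by the known noise variance; the sample variance is used non-privately here for brevity:

```python
import numpy as np
from scipy import stats

def private_mean_ci(x, lo, hi, eps, level=0.95, rng=None):
    """Hypothetical sketch: eps-DP mean of data bounded in [lo, hi], with a
    confidence interval that accounts for the injected Laplace noise."""
    rng = np.random.default_rng(rng)
    n = len(x)
    b = (hi - lo) / (n * eps)                  # Laplace scale = sensitivity of the mean / eps
    est = np.clip(x, lo, hi).mean() + rng.laplace(scale=b)
    var = np.var(x, ddof=1) / n + 2 * b**2     # sampling variance + Laplace variance (2 b^2);
                                               # the sample variance is non-private in this sketch
    z = stats.norm.ppf(0.5 + level / 2)
    return est, (est - z * np.sqrt(var), est + z * np.sqrt(var))
```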
arXiv Detail & Related papers (2024-10-18T18:02:13Z) - Causal Inference with Differentially Private (Clustered) Outcomes [16.166525280886578]
Estimating causal effects from randomized experiments is only feasible if participants agree to reveal their responses.
We suggest a new differential privacy mechanism, Cluster-DP, which leverages any given cluster structure.
We show that, depending on an intuitive measure of cluster quality, we can improve the variance loss while maintaining our privacy guarantees.
arXiv Detail & Related papers (2023-08-02T05:51:57Z) - Summary Statistic Privacy in Data Sharing [23.50797952699759]
We study a setting where a data holder wishes to share data with a receiver, without revealing certain summary statistics of the data distribution.
We propose summary statistic privacy, a metric for quantifying the privacy risk of such a mechanism.
We show that the proposed quantization mechanisms achieve better privacy-distortion tradeoffs than alternative privacy mechanisms.
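As a toy version of a quantization mechanism (the paper's mechanisms and its privacy metric are more refined), coarsening values to a grid limits what the released sample reveals about distributional statistics such as the mean:

```python
import numpy as np

def quantize_release(x, bin_width):
    """Toy quantization mechanism: snap each value to the center of a
    coarse bin, trading distortion for summary-statistic privacy."""
    return (np.floor(x / bin_width) + 0.5) * bin_width
```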
arXiv Detail & Related papers (2023-03-03T15:29:19Z) - Breaking the Communication-Privacy-Accuracy Tradeoff with
$f$-Differential Privacy [51.11280118806893]
We consider a federated data analytics problem in which a server coordinates the collaborative data analysis of multiple users with privacy concerns and limited communication capability.
We study the local differential privacy guarantees of discrete-valued mechanisms with finite output space through the lens of $f$-differential privacy (DP).
More specifically, we advance the existing literature by deriving tight $f$-DP guarantees for a variety of discrete-valued mechanisms.
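For context, the trade-off function of a plain $(\varepsilon, \delta)$-DP guarantee (Dong, Roth & Su) is the benchmark such tight discrete-mechanism analyses are compared against; a sketch:

```python
import numpy as np

def tradeoff_eps_delta(alpha, eps, delta=0.0):
    """f_{eps,delta}(alpha): the smallest type-II error any test can achieve
    at type-I error alpha when distinguishing neighboring datasets under
    (eps, delta)-DP (Dong, Roth & Su, 2019)."""
    alpha = np.asarray(alpha, dtype=float)
    return np.maximum.reduce([
        np.zeros_like(alpha),
        1.0 - delta - np.exp(eps) * alpha,
        np.exp(-eps) * (1.0 - delta - alpha),
    ])

# Binary randomized response attains this curve tightly for delta = 0.
print(tradeoff_eps_delta([0.0, 0.05, 0.25, 0.5], eps=1.0))
```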
arXiv Detail & Related papers (2023-02-19T16:58:53Z) - DP2-Pub: Differentially Private High-Dimensional Data Publication with
Invariant Post Randomization [58.155151571362914]
We propose a differentially private high-dimensional data publication mechanism (DP2-Pub) that runs in two phases.
Splitting attributes into several low-dimensional clusters with high intra-cluster cohesion and low inter-cluster coupling helps obtain a reasonable privacy budget.
We also extend our DP2-Pub mechanism to the scenario with a semi-honest server which satisfies local differential privacy.
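A classical way to build the "invariant" ingredient (a sketch of invariant post-randomization in general, not necessarily DP2-Pub's exact construction) is to pick a transition matrix whose stationary distribution equals the attribute's marginal, so randomization leaves that marginal unchanged:

```python
import numpy as np

def invariant_pram_matrix(pi, alpha):
    """P = a*I + (1-a) * 1 * pi^T: keep a category with probability a,
    otherwise resample from pi. Each row sums to 1 and pi @ P == pi,
    so post-randomization preserves the marginal distribution."""
    k = len(pi)
    return alpha * np.eye(k) + (1.0 - alpha) * np.outer(np.ones(k), pi)

pi = np.array([0.5, 0.3, 0.2])      # marginal of a categorical attribute
P = invariant_pram_matrix(pi, alpha=0.6)
assert np.allclose(pi @ P, pi)       # the marginal is invariant under P
```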
arXiv Detail & Related papers (2022-08-24T17:52:43Z) - Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
This implies groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees.
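The intuition can be sketched for a single DP-SGD step in the Gaussian-DP view (ignoring subsampling and composition over steps, which the paper's accountant handles): an example's privacy cost scales with its own clipped gradient norm, never more than the worst-case clipping bound.

```python
import numpy as np

def per_example_gdp(grad_norms, clip_norm, noise_std):
    """Sketch of individual accounting for one noisy-gradient step: an
    example with clipped gradient norm s contributes mu = s / noise_std
    in Gaussian DP, capped at the worst case clip_norm / noise_std."""
    s = np.minimum(grad_norms, clip_norm)  # per-example sensitivity after clipping
    return s / noise_std                   # per-example GDP parameter for this step
```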
arXiv Detail & Related papers (2022-06-06T13:49:37Z) - Private measures, random walks, and synthetic data [7.5764890276775665]
Differential privacy is a mathematical concept that provides an information-theoretic security guarantee.
We develop a private measure from a data set that allows us to efficiently construct private synthetic data.
A key ingredient in our construction is a new superregular random walk, whose joint distribution of steps is as regular as that of independent random variables.
arXiv Detail & Related papers (2022-04-20T00:06:52Z) - Post-processing of Differentially Private Data: A Fairness Perspective [53.29035917495491]
This paper shows that post-processing causes disparate impacts on individuals or groups.
It analyzes two critical settings: the release of differentially private datasets and the use of such private datasets for downstream decisions.
It proposes a novel post-processing mechanism that is (approximately) optimal under different fairness metrics.
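A small simulation illustrates the effect (an assumed toy setup, not the paper's experiments): clipping noisy counts at zero, a common post-processing step, biases small groups' counts upward while barely affecting large ones.

```python
import numpy as np

rng = np.random.default_rng(0)
true_counts = np.array([5, 5000])                       # small vs. large group
noisy = true_counts + rng.laplace(scale=10, size=(100_000, 2))
post = np.maximum(noisy, 0)                             # post-processing: clip at zero
print(post.mean(axis=0) - true_counts)                  # small group gains a positive bias
```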
arXiv Detail & Related papers (2022-01-24T02:45:03Z) - Causally Constrained Data Synthesis for Private Data Release [36.80484740314504]
Releasing synthetic data that reflects certain statistical properties of the original data helps preserve its privacy.
Prior works utilize differentially private data release mechanisms to provide formal privacy guarantees.
We propose incorporating causal information into the training process to favorably modify the aforementioned trade-off.
arXiv Detail & Related papers (2021-05-27T13:46:57Z) - Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
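Laplacian smoothing (as in Osher et al.'s DP-LSSGD line of work) denoises a privatized gradient by solving a linear system with a discrete graph Laplacian; a minimal sketch, assuming a circulant 1-D Laplacian over the gradient coordinates:

```python
import numpy as np

def laplacian_smooth(g, gamma=1.0):
    """Sketch: v = (I + gamma * L)^{-1} g with L the circulant 1-D graph
    Laplacian. High-frequency DP noise in g is damped while the mean
    component passes through unchanged (since L @ ones == 0)."""
    d = len(g)
    L = 2 * np.eye(d) - np.roll(np.eye(d), 1, axis=0) - np.roll(np.eye(d), -1, axis=0)
    return np.linalg.solve(np.eye(d) + gamma * L, g)
```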
arXiv Detail & Related papers (2020-05-01T04:28:38Z)