Statistical Privacy
- URL: http://arxiv.org/abs/2501.12893v1
- Date: Wed, 22 Jan 2025 14:13:44 GMT
- Title: Statistical Privacy
- Authors: Dennis Breutigam, Rüdiger Reischuk
- Abstract summary: This paper considers a situation where an adversary knows the distribution by which the database is generated, but no exact data of its entries. We analyze in detail how the entropy of the distribution guarantees privacy for a large class of queries called property queries.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To analyze the privacy guarantee of personal data in a database that is subject to queries, it is necessary to model the prior knowledge of a possible attacker. Differential privacy considers a worst-case scenario where the attacker knows almost everything, which in many applications is unrealistic and requires a large utility loss. This paper considers a situation called statistical privacy where an adversary knows the distribution by which the database is generated, but no exact data of all (or sufficiently many) of its entries. We analyze in detail how the entropy of the distribution guarantees privacy for a large class of queries called property queries. Exact formulas are obtained for the privacy parameters. We analyze how they depend on the probability that an entry fulfills the property under investigation. These formulas turn out to be lengthy, but can be used for tight numerical approximations of the privacy parameters. Such estimations are necessary for applying privacy-enhancing techniques in practice. For this statistical setting we further investigate the effect of adding noise or applying subsampling, and the privacy-utility tradeoff. The dependencies on the parameters are illustrated in detail by a series of plots. Finally, these results are compared to the differential privacy model.
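As a concrete illustration of the mechanisms the abstract refers to, the sketch below answers a property (counting) query with Laplace noise. This is the standard baseline mechanism from the differential-privacy literature, not code from the paper; the predicate and parameter names are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_property_count(db, has_property, epsilon):
    """Answer a property (counting) query with the Laplace mechanism.

    A property query counts how many entries satisfy a predicate. Its
    sensitivity is 1 (changing one entry changes the count by at most 1),
    so Laplace noise of scale 1/epsilon yields epsilon-differential
    privacy -- the worst-case baseline the paper compares against.
    """
    true_count = sum(1 for entry in db if has_property(entry))
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

db = [3, 8, 15, 4, 23, 42]
answer = noisy_property_count(db, lambda x: x > 10, epsilon=1.0)
```

Smaller `epsilon` means more noise and stronger privacy; the paper's point is that when the adversary only knows the generating distribution, less noise may suffice for the same protection.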
Related papers
- Parallel Composition for Statistical Privacy [0.0]
A privacy mechanism is proposed that is based on subsampling and randomly partitioning the database to bound the dependency among queries. These bounds show that in realistic application scenarios taking the entropy of distributions into account yields improvements of privacy and precision guarantees.
arXiv Detail & Related papers (2026-02-10T10:13:44Z)
- Privacy Amplification by Missing Data [4.9024539661445825]
We analyze missing data as a privacy amplification mechanism within the framework of differential privacy. We show, for the first time, that incomplete data can yield privacy amplification for differentially private algorithms.
arXiv Detail & Related papers (2026-02-02T10:28:41Z)
- PrivATE: Differentially Private Average Treatment Effect Estimation for Observational Data [49.35645194884526]
We introduce PrivATE, a practical ATE estimation framework that ensures differential privacy. We design two levels (i.e., label-level and sample-level) of privacy protection in PrivATE to accommodate different privacy requirements. PrivATE effectively balances noise-induced error and matching error, leading to a more accurate estimate of ATE.
arXiv Detail & Related papers (2025-12-16T16:30:07Z)
- High-Dimensional Asymptotics of Differentially Private PCA [4.168157981135696]
In differential privacy, statistics of a sensitive dataset are privatized by introducing random noise. It remains unclear if such high noise levels are truly necessary or a limitation of the proof techniques. This paper explores whether we can obtain sharp privacy characterizations that identify the smallest noise level required to achieve a target privacy level.
arXiv Detail & Related papers (2025-11-10T16:17:16Z)
- Differentially Private Synthetic Data Release for Topics API Outputs [63.79476766779742]
We focus on one Privacy-Preserving Ads API: the Topics API, part of Google Chrome's Privacy Sandbox. We generate a differentially-private dataset that closely matches the re-identification risk properties of the real Topics API data. We hope this will enable external researchers to analyze the API in-depth and replicate prior and future work on a realistic large-scale dataset.
arXiv Detail & Related papers (2025-06-30T13:46:57Z)
- Improving Statistical Privacy by Subsampling [0.0]
A privacy mechanism often used is to take samples of the data for answering a query. This paper proves precise bounds on how much different methods of sampling increase privacy in the statistical setting. For the DP setting, tradeoff functions have been proposed as a finer measure for privacy compared to $(\epsilon,\delta)$-pairs.
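For reference, the standard amplification-by-subsampling bound from the differential-privacy literature (not the statistical-setting bounds derived in the paper above) can be computed directly; a minimal sketch:

```python
import math

def amplified_epsilon(epsilon, q):
    """Standard privacy amplification by Poisson subsampling (DP setting):
    running an epsilon-DP mechanism on a subsample where each record is
    included independently with probability q satisfies
    log(1 + q * (exp(epsilon) - 1))-DP.
    """
    return math.log(1.0 + q * math.expm1(epsilon))

# Subsampling 10% of the data shrinks epsilon = 1.0 to roughly 0.159.
eps_prime = amplified_epsilon(epsilon=1.0, q=0.1)
```

With `q = 1` (no subsampling) the formula returns `epsilon` unchanged, and for small `q` the effective privacy parameter scales roughly linearly in `q`.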
arXiv Detail & Related papers (2025-04-15T17:40:45Z)
- Mean Estimation Under Heterogeneous Privacy Demands [5.755004576310333]
This work considers the problem of mean estimation, where each user can impose their own privacy level.
The algorithm we propose is shown to be minimax optimal and has a near-linear run-time.
Users with less stringent but differing privacy requirements are all given more privacy than they require, in equal amounts.
arXiv Detail & Related papers (2023-10-19T20:29:19Z)
- Privately Answering Queries on Skewed Data via Per Record Differential Privacy [8.376475518184883]
We propose a privacy formalism, per-record zero concentrated differential privacy (PRzCDP). Unlike other formalisms which provide different privacy losses to different records, PRzCDP's privacy loss depends explicitly on the confidential data.
arXiv Detail & Related papers (2023-10-19T15:24:49Z)
- Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis Testing: A Lesson From Fano [83.5933307263932]
We study data reconstruction attacks for discrete data and analyze them under the framework of hypothesis testing.
We show that if the underlying private data takes values from a set of size $M$, then the target privacy parameter $\epsilon$ can be $O(\log M)$ before the adversary gains significant inferential power.
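The intuition behind the $O(\log M)$ threshold can be sketched with a crude hypothesis-testing bound (illustrative only, not the paper's exact result; the function name is hypothetical): for an $\epsilon$-DP output and $M$ equally likely candidate values of the private record, the adversary's success probability is at most $e^{\epsilon}/M$, so $\epsilon$ must grow like $\log M$ before guessing beats the $1/M$ prior substantially.

```python
import math

def reconstruction_advantage_bound(epsilon, M):
    """Upper bound on an adversary's probability of correctly guessing
    which of M equally likely values a private record took, after seeing
    the output of an epsilon-DP mechanism. Follows from the likelihood
    ratios between the M neighboring datasets being bounded by exp(epsilon).
    """
    return min(1.0, math.exp(epsilon) / M)
```

For example, with `epsilon = 0` the bound is the prior `1/M`, while `epsilon = log(M)` already permits success probability 1, matching the abstract's threshold.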
arXiv Detail & Related papers (2022-10-24T23:50:12Z)
- Algorithms with More Granular Differential Privacy Guarantees [65.3684804101664]
We consider partial differential privacy (DP), which allows quantifying the privacy guarantee on a per-attribute basis.
In this work, we study several basic data analysis and learning tasks, and design algorithms whose per-attribute privacy parameter is smaller than the best possible privacy parameter for the entire record of a person.
arXiv Detail & Related papers (2022-09-08T22:43:50Z)
- Smooth Anonymity for Sparse Graphs [69.1048938123063]
Differential privacy has emerged as the gold standard of privacy; it can incur a large loss in utility, however, when it comes to sharing sparse datasets.
In this work, we consider a variation of $k$-anonymity, which we call smooth-$k$-anonymity, and design simple large-scale algorithms that efficiently provide smooth-$k$-anonymity.
arXiv Detail & Related papers (2022-07-13T17:09:25Z)
- Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
This implies groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees.
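The per-example guarantees discussed above arise from the clipping step of DP-SGD: examples whose gradients rarely hit the clipping bound effectively contribute less and enjoy stronger privacy. A simplified NumPy sketch of one DP-SGD step (an illustration of the standard mechanism, not the paper's accounting method):

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_sgd_step(params, per_example_grads, clip_norm, noise_mult, lr):
    """One DP-SGD step: clip each example's gradient to L2 norm clip_norm,
    average the clipped gradients, and add Gaussian noise calibrated to
    the clipping norm before taking a gradient step."""
    clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    mean_grad = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_mult * clip_norm / len(per_example_grads),
                       size=mean_grad.shape)
    return params - lr * (mean_grad + noise)

params = np.zeros(3)
grads = [np.array([3.0, 4.0, 0.0]),   # norm 5 -> clipped to norm 1
         np.array([0.1, 0.0, 0.1])]   # small norm -> passes unclipped
new_params = dp_sgd_step(params, grads, clip_norm=1.0, noise_mult=1.0, lr=0.1)
```

In this sketch the first example is always clipped while the second never is, which is exactly the asymmetry that individual privacy accounting exploits.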
arXiv Detail & Related papers (2022-06-06T13:49:37Z)
- Decision Making with Differential Privacy under a Fairness Lens [65.16089054531395]
The U.S. Census Bureau releases data sets and statistics about groups of individuals that are used as input to a number of critical decision processes.
To conform to privacy and confidentiality requirements, agencies like the Census Bureau are often required to release privacy-preserving versions of the data.
This paper studies the release of differentially private data sets and analyzes their impact on some critical resource allocation tasks under a fairness perspective.
arXiv Detail & Related papers (2021-05-16T21:04:19Z)
- Robust and Differentially Private Mean Estimation [40.323756738056616]
Differential privacy has emerged as a standard requirement in a variety of applications ranging from the U.S. Census to data collected in commercial devices.
An increasing number of such databases consist of data from multiple sources, not all of which can be trusted.
This leaves existing private analyses vulnerable to attacks by an adversary who injects corrupted data.
arXiv Detail & Related papers (2021-02-18T05:02:49Z)
- Towards practical differentially private causal graph discovery [74.7791110594082]
Causal graph discovery refers to the process of discovering causal relation graphs from purely observational data.
We present a differentially private causal graph discovery algorithm, Priv-PC, which improves both utility and running time compared to the state-of-the-art.
arXiv Detail & Related papers (2020-06-15T18:30:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.