Formal Privacy Guarantees with Invariant Statistics
- URL: http://arxiv.org/abs/2410.17468v1
- Date: Tue, 22 Oct 2024 22:50:17 GMT
- Title: Formal Privacy Guarantees with Invariant Statistics
- Authors: Young Hyun Cho, Jordan Awan
- Abstract summary: Motivated by the 2020 US Census products, this paper extends differential privacy (DP) to address the joint release of DP outputs and nonprivate statistics.
Our framework, Semi-DP, redefines adjacency by focusing on datasets that conform to the given invariant.
We provide a privacy analysis of the 2020 US Decennial Census using the Semi-DP framework, revealing that the effective privacy guarantees are weaker than advertised.
- Score: 8.133739801185271
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Motivated by the 2020 US Census products, this paper extends differential privacy (DP) to address the joint release of DP outputs and nonprivate statistics, referred to as invariants. Our framework, Semi-DP, redefines adjacency by focusing on datasets that conform to the given invariant, ensuring indistinguishability between adjacent datasets within invariant-conforming datasets. We further develop customized mechanisms that satisfy Semi-DP, including the Gaussian mechanism and the optimal $K$-norm mechanism for rank-deficient sensitivity spaces. Our framework is applied to contingency table analysis, which is relevant to the 2020 US Census, illustrating how Semi-DP enables the release of private outputs given the one-way margins as the invariant. Additionally, we provide a privacy analysis of the 2020 US Decennial Census using the Semi-DP framework, revealing that the effective privacy guarantees are weaker than advertised.
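As a rough illustration of the rank-deficient setting the abstract mentions, the sketch below perturbs a two-way contingency table with Gaussian noise only along directions that leave the one-way margins (row and column totals) unchanged. It is not the paper's mechanism: the helper names `double_center` and `noisy_table_with_fixed_margins` are hypothetical, and `sigma` is an arbitrary noise scale that is not calibrated to any Semi-DP guarantee.

```python
import random


def double_center(z):
    """Project matrix z onto matrices whose row sums and column sums are all zero."""
    r, c = len(z), len(z[0])
    row_mean = [sum(row) / c for row in z]
    col_mean = [sum(z[i][j] for i in range(r)) / r for j in range(c)]
    grand_mean = sum(row_mean) / r
    return [
        [z[i][j] - row_mean[i] - col_mean[j] + grand_mean for j in range(c)]
        for i in range(r)
    ]


def noisy_table_with_fixed_margins(table, sigma, rng):
    """Add Gaussian noise that keeps every row total and column total unchanged.

    sigma is an assumed, uncalibrated noise scale (illustration only).
    """
    r, c = len(table), len(table[0])
    # Draw i.i.d. Gaussian noise, then remove the components that would move a margin.
    z = [[rng.gauss(0.0, sigma) for _ in range(c)] for _ in range(r)]
    noise = double_center(z)
    return [[table[i][j] + noise[i][j] for j in range(c)] for i in range(r)]


if __name__ == "__main__":
    rng = random.Random(0)
    table = [[5.0, 3.0, 2.0], [1.0, 4.0, 6.0]]
    noisy = noisy_table_with_fixed_margins(table, sigma=1.0, rng=rng)
    print("row totals:", [sum(row) for row in noisy])
    print("column totals:", [sum(col) for col in zip(*noisy)])
```

The noisy cells are real-valued, so any integer or nonnegativity requirement would need a further step that this sketch does not attempt. For an r-by-c table the noise lives in a subspace of dimension (r-1)(c-1), which is the sense in which the perturbation is rank-deficient here.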
Related papers
- Fundamental Limit of Discrete Distribution Estimation under Utility-Optimized Local Differential Privacy [14.980778567896593]
We study the problem of discrete distribution estimation under utility-optimized local differential privacy (ULDP).
For the achievability, we propose a class of utility-optimized block design (uBD) schemes, obtained as nontrivial modifications of the block design mechanism known to be optimal under standard LDP constraints.
These results provide a tight characterization of the estimation accuracy achievable under ULDP and reveal new insights into the structure of optimal mechanisms for privacy-preserving statistical inference.
arXiv Detail & Related papers (2025-09-29T01:41:36Z)
- Machine Learning with Privacy for Protected Attributes [56.44253915927481]
We refine the definition of differential privacy (DP) to create a more general and flexible framework that we call feature differential privacy (FDP).
Our definition is simulation-based and allows for both addition/removal and replacement variants of privacy, and can handle arbitrary separation of protected and non-protected features.
We apply our framework to various machine learning tasks and show that it can significantly improve the utility of DP-trained models when public features are available.
arXiv Detail & Related papers (2025-06-24T17:53:28Z)
- A Refreshment Stirred, Not Shaken (II): Invariant-Preserving Deployments of Differential Privacy for the US Decennial Census [4.540236408836132]
We develop a statistical disclosure control (SDC) method for the U.S. Decennial Census.
We show that the PSA algorithm induces the invariant $\varepsilon$s which can be reconciled with differential privacy (DP).
We show that while our results explicate the guarantees of SDC provided by the PSA, the DAS and the 2020 DAS must be taken in general to actual privacy protection, just as is the case for any deployment.
arXiv Detail & Related papers (2025-01-14T21:38:01Z)
- Provable Privacy with Non-Private Pre-Processing [56.770023668379615]
We propose a general framework to evaluate the additional privacy cost incurred by non-private data-dependent pre-processing algorithms.
Our framework establishes upper bounds on the overall privacy guarantees by utilising two new technical notions.
arXiv Detail & Related papers (2024-03-19T17:54:49Z)
- Privacy Amplification for the Gaussian Mechanism via Bounded Support [64.86780616066575]
Data-dependent privacy accounting frameworks such as per-instance differential privacy (pDP) and Fisher information loss (FIL) confer fine-grained privacy guarantees for individuals in a fixed training dataset.
We propose simple modifications of the Gaussian mechanism with bounded support, showing that they amplify privacy guarantees under data-dependent accounting.
arXiv Detail & Related papers (2024-03-07T21:22:07Z)
- Unified Mechanism-Specific Amplification by Subsampling and Group Privacy Amplification [54.1447806347273]
Amplification by subsampling is one of the main primitives in machine learning with differential privacy.
We propose the first general framework for deriving mechanism-specific guarantees.
We analyze how subsampling affects the privacy of groups of multiple users.
arXiv Detail & Related papers (2024-03-07T19:36:05Z)
- Disclosure Avoidance for the 2020 Census Demographic and Housing Characteristics File [7.664548801662584]
We describe the concepts and methods used by the Disclosure Avoidance System to produce formally private output in support of the 2020 Census statistical data product releases.
We describe the updates to the DAS that were required to release the Demographic and Housing Characteristics (DHC) File.
We also describe the final configuration parameters used for the 2020 production DHC DAS implementation, error metrics for these production statistical data products, and plans for future experimental data products.
arXiv Detail & Related papers (2023-12-18T00:54:04Z)
- Bounded and Unbiased Composite Differential Privacy [25.427802467876248]
The objective of differential privacy (DP) is to protect privacy by producing an output distribution that is indistinguishable between two neighboring databases.
Existing solutions attempt to address this issue by employing post-processing or truncation techniques.
We propose a novel differentially private mechanism which uses a composite probability density function to generate bounded and unbiased outputs.
arXiv Detail & Related papers (2023-11-04T04:43:47Z)
- Breaking the Communication-Privacy-Accuracy Tradeoff with $f$-Differential Privacy [51.11280118806893]
We consider a federated data analytics problem in which a server coordinates the collaborative data analysis of multiple users with privacy concerns and limited communication capability.
We study the local differential privacy guarantees of discrete-valued mechanisms with finite output space through the lens of $f$-differential privacy ($f$-DP).
More specifically, we advance the existing literature by deriving tight $f$-DP guarantees for a variety of discrete-valued mechanisms.
arXiv Detail & Related papers (2023-02-19T16:58:53Z)
- DP2-Pub: Differentially Private High-Dimensional Data Publication with Invariant Post Randomization [58.155151571362914]
We propose a differentially private high-dimensional data publication mechanism (DP2-Pub) that runs in two phases.
Splitting attributes into several low-dimensional clusters with high intra-cluster cohesion and low inter-cluster coupling helps obtain a reasonable privacy budget.
We also extend our DP2-Pub mechanism to the scenario with a semi-honest server which satisfies local differential privacy.
arXiv Detail & Related papers (2022-08-24T17:52:43Z)
- The Poisson binomial mechanism for secure and private federated learning [19.399122892615573]
We introduce a discrete differential privacy mechanism for distributed mean estimation (DME) with applications to federated learning and analytics.
We provide a tight analysis of its privacy guarantees, showing that it achieves the same privacy-accuracy trade-offs as the continuous Gaussian mechanism.
arXiv Detail & Related papers (2022-07-09T05:46:28Z)
- Post-processing of Differentially Private Data: A Fairness Perspective [53.29035917495491]
This paper shows that post-processing causes disparate impacts on individuals or groups.
It analyzes two critical settings: the release of differentially private datasets and the use of such private datasets for downstream decisions.
It proposes a novel post-processing mechanism that is (approximately) optimal under different fairness metrics.
arXiv Detail & Related papers (2022-01-24T02:45:03Z)
- Propose, Test, Release: Differentially private estimation with high probability [9.25177374431812]
We introduce a new general version of the PTR mechanism that allows us to derive high probability error bounds for differentially private estimators.
Our algorithms provide the first statistical guarantees for differentially private estimation of the median and mean without any boundedness assumptions on the data.
arXiv Detail & Related papers (2020-02-19T01:29:05Z)