The Complexities of Differential Privacy for Survey Data
- URL: http://arxiv.org/abs/2408.07006v1
- Date: Tue, 13 Aug 2024 16:15:42 GMT
- Title: The Complexities of Differential Privacy for Survey Data
- Authors: Jörg Drechsler, James Bailie
- Abstract summary: The U.S. Census Bureau announced the adoption of differential privacy (DP) for its 2020 Decennial Census.
Despite its attractive theoretical properties, implementing DP in practice remains challenging, especially when it comes to survey data.
We identify five aspects that need to be considered when adopting DP in the survey context.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The concept of differential privacy (DP) has gained substantial attention in recent years, most notably since the U.S. Census Bureau announced the adoption of the concept for its 2020 Decennial Census. However, despite its attractive theoretical properties, implementing DP in practice remains challenging, especially when it comes to survey data. In this paper we present some results from an ongoing project funded by the U.S. Census Bureau that is exploring the possibilities and limitations of DP for survey data. Specifically, we identify five aspects that need to be considered when adopting DP in the survey context: the multi-staged nature of data production; the limited privacy amplification from complex sampling designs; the implications of survey-weighted estimates; the weighting adjustments for nonresponse and other data deficiencies, and the imputation of missing values. We summarize the project's key findings with respect to each of these aspects and also discuss some of the challenges that still need to be addressed before DP could become the new data protection standard at statistical agencies.
Related papers
- A Decade of Metric Differential Privacy: Advancements and Applications [8.865292595200964]
Metric Differential Privacy (mDP) builds upon the core principles of Differential Privacy (DP) by incorporating various distance metrics.
mDP offers privacy guarantees for a wide range of applications, such as location-based services, text analysis, and image processing.
This paper provides a comprehensive survey of mDP research from 2013 to 2024, tracing its development from the foundations of DP.
(2025-02-13T05:18:24Z)
- Evaluating Differential Privacy on Correlated Datasets Using Pointwise Maximal Leakage [38.4830633082184]
Data-driven advancements pose substantial risks to privacy.
Differential privacy has become a cornerstone in privacy preservation efforts.
Our work aims to foster a deeper understanding of subtle privacy risks and highlight the need for the development of more effective privacy-preserving mechanisms.
(2025-02-08T10:30:45Z)
- Differentially Private Finite Population Estimation via Survey Weight Regularization [0.8192907805418583]
We develop a differentially private method for estimating finite population quantities.
We show that optimal strategies for releasing DP survey-weighted mean income estimates require orders of magnitude less noise than a naive approach.
(2024-11-06T20:04:22Z)
- Differentially Private Data Release on Graphs: Inefficiencies and Unfairness [48.96399034594329]
This paper characterizes the impact of Differential Privacy on bias and unfairness in the context of releasing information about networks.
We consider a network release problem where the network structure is known to all, but the weights on edges must be released privately.
Our work provides theoretical foundations for, and empirical evidence of, the bias and unfairness that privacy introduces in these networked decision problems.
(2024-08-08T08:37:37Z)
- Synthetic Census Data Generation via Multidimensional Multiset Sum [7.900694093691988]
We provide tools to generate synthetic microdata solely from published Census statistics.
We show that our methods work well in practice, and we offer theoretical arguments to explain our performance.
(2024-04-15T19:06:37Z)
- Federated Experiment Design under Distributed Differential Privacy [31.06808163362162]
We focus on the rigorous protection of users' privacy while minimizing the trust toward service providers.
Although a vital component in modern A/B testing, private distributed experimentation has not previously been studied.
We show how these mechanisms can be scaled up to handle the very large number of participants commonly found in practice.
(2023-11-07T22:38:56Z)
- A Unified View of Differentially Private Deep Generative Modeling [60.72161965018005]
Data with privacy concerns comes with stringent regulations that frequently prohibit data access and data sharing.
Overcoming these obstacles is key for technological progress in many real-world application scenarios that involve privacy sensitive data.
Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released.
(2023-09-27T14:38:16Z)
- The Impact of De-Identification on Single-Year-of-Age Counts in the U.S. Census [1.6114012813668932]
In 2020, the U.S. Census Bureau transitioned from data swapping to differential privacy (DP) in its approach to de-identifying decennial census data.
We compare the relative impacts of swapping and DP on census data, focusing on the use case of school planning.
Our findings support the use of DP over swapping for single-year-of-age counts.
For the school planning use cases we investigate, DP provides comparable, if not improved, accuracy over swapping, while offering other benefits such as improved transparency.
(2023-08-24T15:56:05Z)
- How Do Input Attributes Impact the Privacy Loss in Differential Privacy? [55.492422758737575]
We study the connection between the per-subject norm in DP neural networks and individual privacy loss.
We introduce a novel metric termed the Privacy Loss-Input Susceptibility (PLIS) which allows one to apportion the subject's privacy loss to their input attributes.
(2022-11-18T11:39:03Z)
- Comment: The Essential Role of Policy Evaluation for the 2020 Census Disclosure Avoidance System [0.0]
In "Differential Perspectives: Epistemic Disconnects Surrounding the US Census Bureau's Use of Differential Privacy", boyd and Sarathy argue that empirical evaluations of the Census Disclosure Avoidance System failed to recognize that the benchmark data is never a ground truth of population counts.
We argue that policy makers must confront a key trade-off between data utility and privacy protection.
(2022-10-15T21:41:54Z)
- Post-processing of Differentially Private Data: A Fairness Perspective [53.29035917495491]
This paper shows that post-processing causes disparate impacts on individuals or groups.
It analyzes two critical settings: the release of differentially private datasets and the use of such private datasets for downstream decisions.
It proposes a novel post-processing mechanism that is (approximately) optimal under different fairness metrics.
(2022-01-24T02:45:03Z)
- Decision Making with Differential Privacy under a Fairness Lens [65.16089054531395]
The U.S. Census Bureau releases data sets and statistics about groups of individuals that are used as input to a number of critical decision processes.
To conform to privacy and confidentiality requirements, such agencies are often required to release privacy-preserving versions of the data.
This paper studies the release of differentially private data sets and analyzes their impact on some critical resource allocation tasks under a fairness perspective.
(2021-05-16T21:04:19Z)
- Differential Privacy of Hierarchical Census Data: An Optimization Approach [53.29035917495491]
Census Bureaus are interested in releasing aggregate socio-economic data about a large population without revealing sensitive information about any individual.
Recent events have identified some of the privacy challenges faced by these organizations.
This paper presents a novel differential-privacy mechanism for releasing hierarchical counts of individuals.
(2020-06-28T18:19:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.