The Impact of De-Identification on Single-Year-of-Age Counts in the U.S.
Census
- URL: http://arxiv.org/abs/2308.12876v1
- Date: Thu, 24 Aug 2023 15:56:05 GMT
- Title: The Impact of De-Identification on Single-Year-of-Age Counts in the U.S.
Census
- Authors: Sarah Radway and Miranda Christ
- Abstract summary: In 2020, the U.S. Census Bureau transitioned from data swapping to differential privacy (DP) in its approach to de-identifying decennial census data.
We compare the relative impacts of swapping and DP on census data, focusing on the use case of school planning.
Our findings support the use of DP over swapping for single-year-of-age counts.
For the school planning use cases we investigate, DP provides comparable, if not improved, accuracy over swapping, while offering other benefits such as improved transparency.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In 2020, the U.S. Census Bureau transitioned from data swapping to
differential privacy (DP) in its approach to de-identifying decennial census
data. This decision has faced considerable criticism from data users,
particularly due to concerns about the accuracy of DP. We compare the relative
impacts of swapping and DP on census data, focusing on the use case of school
planning, where single-year-of-age population counts (i.e., the number of
four-year-olds in the district) are used to estimate the number of incoming
students and make resulting decisions surrounding faculty, classrooms, and
funding requests. We examine these impacts for school districts of varying
population sizes and age distributions.
Our findings support the use of DP over swapping for single-year-of-age
counts; in particular, concerning behaviors associated with DP (namely, poor
behavior for smaller districts) occur with swapping mechanisms as well. For the
school planning use cases we investigate, DP provides comparable, if not
improved, accuracy over swapping, while offering other benefits such as
improved transparency.
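The core DP mechanism at issue can be illustrated with a minimal sketch: independent integer noise is added to each single-year-of-age count. This is an illustrative toy, not the Census Bureau's actual TopDown implementation (which uses the discrete Gaussian and post-processing); the two-sided-geometric noise, the epsilon value, and the `noisy_age_counts` helper are assumptions for exposition.

```python
import math
import random

def geometric(alpha: float, rng: random.Random) -> int:
    """Sample G >= 0 with P(G = k) = (1 - alpha) * alpha**k."""
    return int(math.floor(math.log(1.0 - rng.random()) / math.log(alpha)))

def two_sided_geometric(epsilon: float, rng: random.Random) -> int:
    """Integer noise with P(k) proportional to exp(-epsilon * |k|),
    built as the difference of two i.i.d. geometric samples."""
    alpha = math.exp(-epsilon)
    return geometric(alpha, rng) - geometric(alpha, rng)

def noisy_age_counts(counts: dict, epsilon: float, rng: random.Random) -> dict:
    """Add independent integer noise to each single-year-of-age count."""
    return {age: n + two_sided_geometric(epsilon, rng)
            for age, n in counts.items()}
```

For example, `noisy_age_counts({3: 41, 4: 38, 5: 45}, 0.5, random.Random(7))` returns perturbed integer counts. Smaller epsilon means noisier counts, which is why small districts are the stress case the abstract highlights.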
Related papers
- Evaluating the Impacts of Swapping on the US Decennial Census [7.020785266789317]
We describe and implement a parameterized swapping algorithm based on Census publications, court documents, and informal interviews with Census employees.
We provide intuition for the types of shifts induced by swapping and compare against those introduced by TopDown.
arXiv Detail & Related papers (2025-02-03T12:51:16Z) - The Complexities of Differential Privacy for Survey Data [0.0]
The U.S. Census Bureau announced the adoption of DP for its 2020 Decennial Census.
Despite its attractive theoretical properties, implementing DP in practice remains challenging, especially when it comes to survey data.
We identify five aspects that need to be considered when adopting DP in the survey context.
arXiv Detail & Related papers (2024-08-13T16:15:42Z) - Differentially Private Data Release on Graphs: Inefficiencies and Unfairness [48.96399034594329]
This paper characterizes the impact of Differential Privacy on bias and unfairness in the context of releasing information about networks.
We consider a network release problem where the network structure is known to all, but the weights on edges must be released privately.
Our work provides theoretical foundations and empirical evidence of the bias and unfairness arising from privacy in these networked decision problems.
arXiv Detail & Related papers (2024-08-08T08:37:37Z) - Selectivity Drives Productivity: Efficient Dataset Pruning for Enhanced
Transfer Learning [66.20311762506702]
Dataset pruning (DP) has emerged as an effective way to improve data efficiency.
We propose two new DP methods, label mapping and feature mapping, for supervised and self-supervised pretraining settings.
We show that source data classes can be pruned by up to 40%-80% without sacrificing downstream performance.
arXiv Detail & Related papers (2023-10-13T00:07:49Z) - Impacts of Differential Privacy on Fostering more Racially and
Ethnically Diverse Elementary Schools [18.35063779220618]
The U.S. Census Bureau has adopted differential privacy, the de facto standard of privacy protection for the 2020 Census release.
This change has the potential to impact policy decisions like political redistricting and other high-stakes practices.
One under-explored yet important application of such data is the redrawing of school attendance boundaries to foster less demographically segregated schools.
arXiv Detail & Related papers (2023-05-12T21:06:15Z) - Mapping Urban Population Growth from Sentinel-2 MSI and Census Data
Using Deep Learning: A Case Study in Kigali, Rwanda [0.19116784879310023]
We evaluate how deep learning change detection techniques can unravel temporal population dynamics over short intervals.
A ResNet encoder, pretrained on a population mapping task with Sentinel-2 MSI data, was incorporated into a Siamese network.
The network was trained at the census level to accurately predict population change.
arXiv Detail & Related papers (2023-03-15T10:39:31Z) - Large Scale Transfer Learning for Differentially Private Image
Classification [51.10365553035979]
Differential Privacy (DP) provides a formal framework for training machine learning models with individual example level privacy.
Private training using DP-SGD protects against leakage by injecting noise into individual example gradients.
While this result is quite appealing, the computational cost of training large-scale models with DP-SGD is substantially higher than non-private training.
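The noise injection that DP-SGD performs can be sketched as follows: each example's gradient is clipped to a fixed L2 norm, the clipped gradients are summed, and Gaussian noise scaled to the clipping norm is added before averaging. This is a minimal sketch of the mechanism, not the paper's implementation; the clipping norm and noise multiplier are illustrative assumptions.

```python
import math
import random

def dp_sgd_step(per_example_grads, clip_norm, noise_multiplier, rng):
    """One DP-SGD aggregation step over a batch of per-example gradients
    (each a list of floats): clip, sum, add Gaussian noise, average."""
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for g in per_example_grads:
        norm = math.sqrt(sum(x * x for x in g))
        # Scale the gradient down so its L2 norm is at most clip_norm.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i in range(dim):
            total[i] += g[i] * scale
    # Noise is calibrated to the clipping norm (the per-example sensitivity).
    sigma = noise_multiplier * clip_norm
    n = len(per_example_grads)
    return [(t + rng.gauss(0.0, sigma)) / n for t in total]
```

Clipping bounds any single example's influence on the sum, which is what makes the added noise sufficient for a formal privacy guarantee; the per-example clipping is also the main source of the extra computational cost the summary mentions.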
arXiv Detail & Related papers (2022-05-06T01:22:20Z) - DP-SGD vs PATE: Which Has Less Disparate Impact on GANs? [0.0]
We compare GANs trained with the two best-known DP frameworks for deep learning, DP-SGD and PATE, in different data imbalance settings.
Our experiments consistently show that for PATE, unlike DP-SGD, the privacy-utility trade-off is not monotonically decreasing.
arXiv Detail & Related papers (2021-11-26T17:25:46Z) - DP-SGD vs PATE: Which Has Less Disparate Impact on Model Accuracy? [1.3238373064156095]
We show that application of differential privacy, specifically the DP-SGD algorithm, has a disparate impact on different sub-groups in the population.
We compare PATE, another mechanism for training deep learning models using differential privacy, with DP-SGD in terms of fairness.
arXiv Detail & Related papers (2021-06-22T20:37:12Z) - Decision Making with Differential Privacy under a Fairness Lens [65.16089054531395]
The U.S. Census Bureau releases data sets and statistics about groups of individuals that are used as input to a number of critical decision processes.
To conform to privacy and confidentiality requirements, these agencies are often required to release privacy-preserving versions of the data.
This paper studies the release of differentially private data sets and analyzes their impact on some critical resource allocation tasks under a fairness perspective.
arXiv Detail & Related papers (2021-05-16T21:04:19Z) - Voting-based Approaches For Differentially Private Federated Learning [87.2255217230752]
This work is inspired by knowledge-transfer approaches to non-federated private learning from Papernot et al.
We design two new DPFL schemes, by voting among the data labels returned from each local model, instead of averaging the gradients.
Our approaches significantly improve the privacy-utility trade-off over the state of the art in DPFL.
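The label-voting idea can be sketched as a noisy plurality vote over the labels returned by each local model, in the spirit of PATE's report-noisy-max. The Laplace-noise formulation and the parameter names below are assumptions for illustration, not the paper's exact scheme.

```python
import math
import random

def laplace(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) noise via inverse-CDF sampling."""
    u = rng.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1.0 - 2.0 * abs(u))

def private_vote(local_labels, num_classes, epsilon, rng):
    """Aggregate labels predicted by local models with a noisy plurality
    vote: add Laplace noise to each class's count, return the argmax."""
    counts = [0] * num_classes
    for label in local_labels:
        counts[label] += 1
    noisy = [c + laplace(2.0 / epsilon, rng) for c in counts]
    return max(range(num_classes), key=lambda k: noisy[k])
```

Because only the winning label is released, each voting round consumes a bounded privacy budget regardless of model size, which is the key contrast with gradient averaging.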
arXiv Detail & Related papers (2020-10-09T23:55:19Z) - Magnify Your Population: Statistical Downscaling to Augment the Spatial
Resolution of Socioeconomic Census Data [48.7576911714538]
We present a new statistical downscaling approach to derive fine-scale estimates of key socioeconomic attributes.
For each selected socioeconomic variable, a Random Forest model is trained on the source Census units and then used to generate fine-scale gridded predictions.
As a case study, we apply this method to Census data in the United States, downscaling the selected socioeconomic variables available at the block group level to a grid of 300 m spatial resolution.
arXiv Detail & Related papers (2020-06-23T16:52:18Z)
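The downscaling pipeline in the entry above can be sketched at toy scale: train a model on coarse source units, predict each fine grid cell from its covariates, then rescale cells within each unit so they sum to the observed unit total. A one-variable least-squares fit stands in here for the paper's Random Forest, and the mass-preserving rescaling step is an assumption about the pipeline, not taken from the abstract.

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def downscale(unit_covariate, unit_value, cell_covariates_by_unit):
    """Train on coarse units, predict each fine cell, then rescale cells
    within each unit so they sum to that unit's observed total."""
    a, b = fit_linear(unit_covariate, unit_value)
    result = []
    for total, cells in zip(unit_value, cell_covariates_by_unit):
        preds = [max(a + b * c, 0.0) for c in cells]
        s = sum(preds) or 1.0  # avoid division by zero for empty units
        result.append([p * total / s for p in preds])
    return result
```

The rescaling keeps the gridded estimates consistent with the source Census totals, so the downscaled surface disaggregates the data without inventing or losing population mass.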
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.