Comment: The Essential Role of Policy Evaluation for the 2020 Census
Disclosure Avoidance System
- URL: http://arxiv.org/abs/2210.08383v1
- Date: Sat, 15 Oct 2022 21:41:54 GMT
- Title: Comment: The Essential Role of Policy Evaluation for the 2020 Census
Disclosure Avoidance System
- Authors: Christopher T. Kenny, Shiro Kuriwaki, Cory McCartan, Evan T. R.
Rosenman, Tyler Simko, Kosuke Imai
- Abstract summary: This commentary responds to boyd and Sarathy, "Differential Perspectives: Epistemic Disconnects Surrounding the US Census Bureau's Use of Differential Privacy."
boyd and Sarathy argue that empirical evaluations of the Census Disclosure Avoidance System failed to recognize that the benchmark data is never a ground truth of population counts.
We argue that policy makers must confront a key trade-off between data utility and privacy protection.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In "Differential Perspectives: Epistemic Disconnects Surrounding the US
Census Bureau's Use of Differential Privacy," boyd and Sarathy argue that
empirical evaluations of the Census Disclosure Avoidance System (DAS),
including our published analysis, failed to recognize how the benchmark data
against which the 2020 DAS was evaluated is never a ground truth of population
counts. In this commentary, we explain why policy evaluation, which was the
main goal of our analysis, is still meaningful without access to a perfect
ground truth. We also point out that our evaluation leveraged features specific
to the decennial Census and redistricting data, such as block-level population
invariance under swapping and voter file racial identification, better
approximating a comparison with the ground truth. Lastly, we show that accurate
statistical predictions of individual race based on the Bayesian Improved
Surname Geocoding, while not a violation of differential privacy, substantially
increase the disclosure risk of private information the Census Bureau sought
to protect. We conclude by arguing that policy makers must confront a key
trade-off between data utility and privacy protection, and an epistemic
disconnect alone is insufficient to explain disagreements between policy
choices.
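The Bayesian Improved Surname Geocoding (BISG) prediction mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard BISG form, in which surname and location are treated as conditionally independent given race, so that P(race | surname, location) is proportional to P(race | surname) × P(location | race). The function name `bisg_posterior` and all probabilities below are hypothetical, chosen only for illustration; they are not taken from the paper or from Census data.

```python
def bisg_posterior(p_race_given_surname, p_geo_given_race):
    """Combine surname and location evidence with Bayes' rule.

    Assumes surname and location are conditionally independent given race.
    Both arguments are dicts keyed by the same race categories.
    """
    unnormalized = {
        race: p_race_given_surname[race] * p_geo_given_race[race]
        for race in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("All categories have zero probability.")
    return {race: weight / total for race, weight in unnormalized.items()}


# Hypothetical inputs (illustrative only, not real data):
# share of each race among people with a given surname ...
p_race_given_surname = {"white": 0.60, "black": 0.30, "other": 0.10}
# ... and share of each race's population living in a given block.
p_geo_given_race = {"white": 0.001, "black": 0.004, "other": 0.002}

posterior = bisg_posterior(p_race_given_surname, p_geo_given_race)
for race, prob in posterior.items():
    print(f"{race}: {prob:.2f}")
```

In this toy case the location evidence shifts the most likely category away from the one the surname alone would suggest, which is the sense in which block-level counts can sharpen individual-level predictions.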
Related papers
- The 2020 United States Decennial Census Is More Private Than You (Might) Think [25.32778927275117]
We show that between 8.50% and 13.76% of the privacy budget for the 2020 U.S. Census remains unused for each of the eight geographical levels.
We mitigate noise variances by 15.08% to 24.82% while maintaining the same privacy budget for each geographical level.
arXiv Detail & Related papers (2024-10-11T23:06:15Z)
- Understanding and Mitigating the Impacts of Differentially Private Census Data on State Level Redistricting [4.589972411795548]
Data users were shaken by the adoption of differential privacy in the 2020 DAS.
We consider two redistricting settings in which a data user might be concerned about the impacts of privacy preserving noise.
We observe that an analyst may come to incorrect conclusions if they do not account for noise.
arXiv Detail & Related papers (2024-09-10T18:11:54Z)
- An In-Depth Examination of Requirements for Disclosure Risk Assessment [6.0631983658449435]
We argue that any proposal for quantifying disclosure risk should be based on pre-specified, objective criteria.
We illustrate this approach, using simple desiderata, to evaluate the absolute disclosure risk framework.
We conclude that satisfying all the desiderata is impossible, but counterfactual comparisons satisfy the most.
arXiv Detail & Related papers (2023-10-13T20:36:29Z)
- Estimating Racial Disparities When Race is Not Observed [3.0931877196387196]
We introduce a new class of models that produce racial disparity estimates by using surnames as an instrumental variable for race.
A validation study based on the North Carolina voter file shows that BISG+BIRDiE reduces error by up to 84% when estimating racial differences in party registration.
We apply the proposed methodology to estimate racial differences in who benefits from the home mortgage interest deduction using individual-level tax data from the U.S. Internal Revenue Service.
arXiv Detail & Related papers (2023-03-05T04:46:16Z)
- No Free Lunch in "Privacy for Free: How does Dataset Condensation Help Privacy" [75.98836424725437]
New methods designed to preserve data privacy require careful scrutiny.
Failure to preserve privacy is hard to detect, and yet can lead to catastrophic results when a system implementing a "privacy-preserving" method is attacked.
arXiv Detail & Related papers (2022-09-29T17:50:23Z)
- Identification of Subgroups With Similar Benefits in Off-Policy Policy Evaluation [60.71312668265873]
We develop a method to balance the need for personalization with confident predictions.
We show that our method can be used to form accurate predictions of heterogeneous treatment effects.
arXiv Detail & Related papers (2021-11-28T23:19:12Z)
- The Impact of the U.S. Census Disclosure Avoidance System on Redistricting and Voting Rights Analysis [0.0]
The US Census Bureau plans to protect the privacy of 2020 Census respondents through its Disclosure Avoidance System (DAS).
We find that the protected data are not of sufficient quality for redistricting purposes.
Our analysis finds that the DAS-protected data are biased against certain areas, depending on voter turnout and partisan and racial composition.
arXiv Detail & Related papers (2021-05-29T03:32:36Z)
- Offline Policy Selection under Uncertainty [113.57441913299868]
We consider offline policy selection as learning preferences over a set of policy prospects given a fixed experience dataset.
Access to the full distribution over one's belief of the policy value enables more flexible selection algorithms under a wider range of downstream evaluation metrics.
We show how BayesDICE may be used to rank policies with respect to any arbitrary downstream policy selection metric.
arXiv Detail & Related papers (2020-12-12T23:09:21Z)
- Magnify Your Population: Statistical Downscaling to Augment the Spatial Resolution of Socioeconomic Census Data [48.7576911714538]
We present a new statistical downscaling approach to derive fine-scale estimates of key socioeconomic attributes.
For each selected socioeconomic variable, a Random Forest model is trained on the source Census units and then used to generate fine-scale gridded predictions.
As a case study, we apply this method to Census data in the United States, downscaling the selected socioeconomic variables available at the block group level, to a grid of 300 spatial resolution.
arXiv Detail & Related papers (2020-06-23T16:52:18Z)
- Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies [80.42316902296832]
We study the estimation of policy value and gradient of a deterministic policy from off-policy data when actions are continuous.
In this setting, standard importance sampling and doubly robust estimators for policy value and gradient fail because the density ratio does not exist.
We propose several new doubly robust estimators based on different kernelization approaches.
arXiv Detail & Related papers (2020-06-06T15:52:05Z)
- Confounding-Robust Policy Evaluation in Infinite-Horizon Reinforcement Learning [70.01650994156797]
Off-policy evaluation of sequential decision policies from observational data is necessary in batch reinforcement learning applications such as education and healthcare.
We develop an approach that estimates bounds on the value of a given policy.
We prove convergence to the sharp bounds as we collect more confounded data.
arXiv Detail & Related papers (2020-02-11T16:18:14Z)