An Examination of the Alleged Privacy Threats of Confidence-Ranked
Reconstruction of Census Microdata
- URL: http://arxiv.org/abs/2311.03171v1
- Date: Mon, 6 Nov 2023 15:04:03 GMT
- Title: An Examination of the Alleged Privacy Threats of Confidence-Ranked
Reconstruction of Census Microdata
- Authors: David Sánchez, Najeeb Jebreel, Josep Domingo-Ferrer, Krishnamurty
Muralidhar, and Alberto Blanco-Justicia
- Abstract summary: We show in this paper that the proposed confidence-ranked reconstruction does not threaten privacy.
We also demonstrate that, due to the way the Census data are compiled, processed and released, it is not possible to reconstruct original and complete records.
- Score: 2.842800539489865
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The alleged threat of reconstruction attacks has led the U.S. Census Bureau
(USCB) to replace in the Decennial Census 2020 the traditional statistical
disclosure limitation based on rank swapping with one based on differential
privacy (DP). This has resulted in substantial accuracy loss of the released
statistics. Worse yet, it has been shown that the reconstruction attacks used
as an argument to move to DP are very far from allowing unequivocal
reidentification of the respondents, because in general there are a lot of
reconstructions compatible with the released statistics. In a very recent
paper, a new reconstruction attack has been proposed, whose goal is to indicate
the confidence that a reconstructed record was in the original respondent data.
The alleged risk of serious disclosure entailed by such confidence-ranked
reconstruction has renewed the interest of the USCB to use DP-based solutions.
To forestall the potential accuracy loss in future data releases resulting from
adoption of these solutions, we show in this paper that the proposed
confidence-ranked reconstruction does not threaten privacy. Specifically, we
report empirical results showing that the proposed ranking cannot guide
reidentification or attribute disclosure attacks, and hence it fails to warrant
the USCB's move towards DP. Further, we also demonstrate that, due to the way
the Census data are compiled, processed and released, it is not possible to
reconstruct original and complete records through any methodology, and the
confidence-ranked reconstruction not only is completely ineffective at
accurately reconstructing Census records but is trivially outperformed by an
adequate interpretation of the released aggregate statistics.
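The abstract's remark that many reconstructions are generally compatible with the released statistics can be illustrated with a toy enumeration. The sketch below is not the attack or the evaluation from the paper. It assumes a hypothetical four-person block described only by sex and age band, with two published one-way count tables, and all names (`RECORD_TYPES`, `marginals`, `compatible_reconstructions`) are made up for illustration.

```python
from collections import Counter
from itertools import combinations_with_replacement, product

# Hypothetical toy domain: every record is a (sex, age band) pair.
SEXES = ("F", "M")
AGE_BANDS = ("0-17", "18-64", "65+")
RECORD_TYPES = tuple(product(SEXES, AGE_BANDS))


def marginals(records):
    """Return the two one-way count tables that would be 'published'."""
    by_sex = Counter(sex for sex, _ in records)
    by_age = Counter(age for _, age in records)
    return dict(by_sex), dict(by_age)


def compatible_reconstructions(published, n):
    """Brute-force every multiset of n records whose marginals match."""
    return [
        candidate
        for candidate in combinations_with_replacement(RECORD_TYPES, n)
        if marginals(candidate) == published
    ]


# Assumed 'confidential' block; only its marginals are released.
true_block = (("F", "0-17"), ("F", "18-64"), ("M", "18-64"), ("M", "65+"))
published = marginals(true_block)
candidates = compatible_reconstructions(published, len(true_block))

print(len(candidates))  # 4 distinct blocks share the same published counts
for candidate in candidates:
    print(candidate)
```

Even in this tiny setting, four different blocks produce identical published counts, so the statistics alone do not single out the true one. Real Census tables are far richer, so this only conveys the flavor of the argument, not its empirical strength.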
Related papers
- Quantifying Privacy Risks of Public Statistics to Residents of Subsidized Housing [28.493827954922885]
We show that respondents in subsidized housing may deliberately not mention unauthorized children and other household members for fear of being evicted.
By combining public statistics from the Decennial Census and the Department of Housing and Urban Development, we demonstrate a simple, inexpensive reconstruction attack.
Our results provide a valuable example for policymakers seeking a trustworthy, accurate census.
arXiv Detail & Related papers (2024-07-05T18:00:02Z)
- Bayes' capacity as a measure for reconstruction attacks in federated learning [10.466570297146953]
We formalise the reconstruction threat model using the information-theoretic framework of quantitative information flow.
We show that the Bayes' capacity, related to the Sibson mutual information of order infinity, represents a tight upper bound on the leakage of the DP-SGD algorithm to an adversary.
arXiv Detail & Related papers (2024-06-19T13:58:42Z)
- Geometry-Aware Instrumental Variable Regression [56.16884466478886]
We propose a transport-based IV estimator that takes into account the geometry of the data manifold through data-derivative information.
We provide a simple plug-and-play implementation of our method that performs on par with related estimators in standard settings.
arXiv Detail & Related papers (2024-05-19T17:49:33Z)
- Visual Privacy Auditing with Diffusion Models [52.866433097406656]
We propose a reconstruction attack based on diffusion models (DMs) that assumes adversary access to real-world image priors.
We show that (1) real-world data priors significantly influence reconstruction success, (2) current reconstruction bounds do not model the risk posed by data priors well, and (3) DMs can serve as effective auditing tools for visualizing privacy leakage.
arXiv Detail & Related papers (2024-03-12T12:18:55Z)
- The 2010 Census Confidentiality Protections Failed, Here's How and Why [6.982581904789855]
We reconstruct five variables (census block, sex, age, race, and ethnicity) in the confidential 2010 Census person records.
Using only published data, an attacker can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed.
We show that alternatives to the 2020 Census Disclosure Avoidance System with similar accuracy (enhanced swapping) also fail to protect confidentiality.
arXiv Detail & Related papers (2023-12-18T15:23:12Z)
- Bounding data reconstruction attacks with the hypothesis testing interpretation of differential privacy [78.32404878825845]
Reconstruction Robustness (ReRo) was recently proposed as an upper bound on the success of data reconstruction attacks against machine learning models.
Previous research has demonstrated that differential privacy (DP) mechanisms also provide ReRo, but so far, only Monte Carlo estimates of a tight ReRo bound have been shown.
arXiv Detail & Related papers (2023-07-08T08:02:47Z)
- Confidence-Ranked Reconstruction of Census Microdata from Published Statistics [45.39928315344449]
A reconstruction attack on a private dataset takes as input some publicly accessible information about the dataset.
We show that our attacks can not only reconstruct full rows from the aggregate query statistics $Q(D) \in \mathbb{R}^m$, but can do so in a way that reliably ranks reconstructed rows by their odds of appearing in the private data.
Our attacks significantly outperform those that are based only on access to a public distribution or population from which the private dataset $D$ was sampled.
arXiv Detail & Related papers (2022-11-06T14:08:43Z)
- No Free Lunch in "Privacy for Free: How does Dataset Condensation Help Privacy" [75.98836424725437]
New methods designed to preserve data privacy require careful scrutiny.
Failure to preserve privacy is hard to detect, and yet can lead to catastrophic results when a system implementing a "privacy-preserving" method is attacked.
arXiv Detail & Related papers (2022-09-29T17:50:23Z)
- Releasing survey microdata with exact cluster locations and additional privacy safeguards [77.34726150561087]
We propose an alternative microdata dissemination strategy that leverages the utility of the original microdata with additional privacy safeguards.
Our strategy reduces the respondents' re-identification risk for any number of disclosed attributes by 60-80% even under re-identification attempts.
arXiv Detail & Related papers (2022-05-24T19:37:11Z)