An Examination of the Alleged Privacy Threats of Confidence-Ranked Reconstruction of Census Microdata
- URL: http://arxiv.org/abs/2311.03171v2
- Date: Tue, 17 Sep 2024 09:49:19 GMT
- Title: An Examination of the Alleged Privacy Threats of Confidence-Ranked Reconstruction of Census Microdata
- Authors: David Sánchez, Najeeb Jebreel, Krishnamurty Muralidhar, Josep Domingo-Ferrer, Alberto Blanco-Justicia
- Abstract summary: We show that the proposed reconstruction is neither effective as a reconstruction method nor conducive to disclosure as claimed by its authors.
We report empirical results showing the proposed ranking cannot guide reidentification or attribute disclosure attacks.
- Score: 3.2156268397508314
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The threat of reconstruction attacks has led the U.S. Census Bureau (USCB) to replace in the Decennial Census 2020 the traditional statistical disclosure limitation based on rank swapping with one based on differential privacy (DP), leading to substantial accuracy loss of released statistics. Yet, it has been argued that, if many different reconstructions are compatible with the released statistics, most of them do not correspond to actual original data, which protects against respondent reidentification. Recently, a new attack has been proposed, which incorporates the confidence that a reconstructed record was in the original data. The alleged risk of disclosure entailed by such confidence-ranked reconstruction has renewed the interest of the USCB to use DP-based solutions. To forestall a potential accuracy loss in future releases, we show that the proposed reconstruction is neither effective as a reconstruction method nor conducive to disclosure as claimed by its authors. Specifically, we report empirical results showing the proposed ranking cannot guide reidentification or attribute disclosure attacks, and hence fails to warrant the utility sacrifice entailed by the use of DP to release census statistical data.
Related papers
- Demystifying Trajectory Recovery From Ash: An Open-Source Evaluation and Enhancement [5.409124675229009]
This study reimplements the trajectory recovery attack from scratch and evaluates it on two open-source datasets.
Results confirm that privacy leakage still exists despite common anonymisation and aggregation methods.
We propose a stronger attack by designing a series of enhancements to the baseline attack.
arXiv Detail & Related papers (2024-09-23T01:06:41Z) - Quantifying Privacy Risks of Public Statistics to Residents of Subsidized Housing [28.493827954922885]
We show that respondents in subsidized housing may deliberately not mention unauthorized children and other household members for fear of being evicted.
By combining public statistics from the Decennial Census and the Department of Housing and Urban Development, we demonstrate a simple, inexpensive reconstruction attack.
Our results provide a valuable example for policymakers seeking a trustworthy, accurate census.
arXiv Detail & Related papers (2024-07-05T18:00:02Z) - Geometry-Aware Instrumental Variable Regression [56.16884466478886]
We propose a transport-based IV estimator that takes into account the geometry of the data manifold through data-derivative information.
We provide a simple plug-and-play implementation of our method that performs on par with related estimators in standard settings.
arXiv Detail & Related papers (2024-05-19T17:49:33Z) - Visual Privacy Auditing with Diffusion Models [52.866433097406656]
We propose a reconstruction attack based on diffusion models (DMs) that assumes adversary access to real-world image priors.
We show that (1) real-world data priors significantly influence reconstruction success, (2) current reconstruction bounds do not model the risk posed by data priors well, and (3) DMs can serve as effective auditing tools for visualizing privacy leakage.
arXiv Detail & Related papers (2024-03-12T12:18:55Z) - The 2010 Census Confidentiality Protections Failed, Here's How and Why [6.982581904789855]
We reconstruct five variables (census block, sex, age, race, and ethnicity) in the confidential 2010 Census person records.
Using only published data, an attacker can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed.
We show that alternatives to the 2020 Census Disclosure Avoidance System with similar accuracy (enhanced swapping) also fail to protect confidentiality.
arXiv Detail & Related papers (2023-12-18T15:23:12Z) - Confidence-Ranked Reconstruction of Census Microdata from Published Statistics [45.39928315344449]
A reconstruction attack on a private dataset takes as input some publicly accessible information about the dataset.
We show that our attacks can not only reconstruct full rows from the aggregate query statistics $Q(D) \in \mathbb{R}^m$, but can do so in a way that reliably ranks reconstructed rows by their odds.
Our attacks significantly outperform those that are based only on access to a public distribution or population from which the private dataset $D$ was sampled.
arXiv Detail & Related papers (2022-11-06T14:08:43Z) - No Free Lunch in "Privacy for Free: How does Dataset Condensation Help Privacy" [75.98836424725437]
New methods designed to preserve data privacy require careful scrutiny.
Failure to preserve privacy is hard to detect, and yet can lead to catastrophic results when a system implementing a "privacy-preserving" method is attacked.
arXiv Detail & Related papers (2022-09-29T17:50:23Z) - Releasing survey microdata with exact cluster locations and additional privacy safeguards [77.34726150561087]
We propose an alternative microdata dissemination strategy that leverages the utility of the original microdata with additional privacy safeguards.
Our strategy reduces the respondents' re-identification risk for any number of disclosed attributes by 60-80% even under re-identification attempts.
arXiv Detail & Related papers (2022-05-24T19:37:11Z) - Assessing the risk of re-identification arising from an attack on anonymised data [0.24466725954625884]
We calculate the risk of re-identification arising from a malicious attack on an anonymised dataset.
We present an analytical means of estimating the probability of re-identification of a single patient in a k-anonymised dataset.
We generalize this solution to obtain the probability of multiple patients being re-identified.
arXiv Detail & Related papers (2022-03-31T09:47:05Z) - Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)