Related papers: An Examination of the Alleged Privacy Threats of Confidence-Ranked Reconstruction of Census Microdata

URL: http://arxiv.org/abs/2311.03171v1
Date: Mon, 6 Nov 2023 15:04:03 GMT
Title: An Examination of the Alleged Privacy Threats of Confidence-Ranked Reconstruction of Census Microdata
Authors: David Sánchez, Najeeb Jebreel, Josep Domingo-Ferrer, Krishnamurty Muralidhar, and Alberto Blanco-Justicia
Abstract summary: We show in this paper that the proposed confidence-ranked reconstruction does not threaten privacy. We also demonstrate that, due to the way the Census data are compiled, processed and released, it is not possible to reconstruct original and complete records.
Score: 2.842800539489865
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The alleged threat of reconstruction attacks has led the U.S. Census Bureau (USCB) to replace in the Decennial Census 2020 the traditional statistical disclosure limitation based on rank swapping with one based on differential privacy (DP). This has resulted in substantial accuracy loss of the released statistics. Worse yet, it has been shown that the reconstruction attacks used as an argument to move to DP are very far from allowing unequivocal reidentification of the respondents, because in general there are a lot of reconstructions compatible with the released statistics. In a very recent paper, a new reconstruction attack has been proposed, whose goal is to indicate the confidence that a reconstructed record was in the original respondent data. The alleged risk of serious disclosure entailed by such confidence-ranked reconstruction has renewed the interest of the USCB to use DP-based solutions. To forestall the potential accuracy loss in future data releases resulting from adoption of these solutions, we show in this paper that the proposed confidence-ranked reconstruction does not threaten privacy. Specifically, we report empirical results showing that the proposed ranking cannot guide reidentification or attribute disclosure attacks, and hence it fails to warrant the USCB's move towards DP. Further, we also demonstrate that, due to the way the Census data are compiled, processed and released, it is not possible to reconstruct original and complete records through any methodology, and the confidence-ranked reconstruction not only is completely ineffective at accurately reconstructing Census records but is trivially outperformed by an adequate interpretation of the released aggregate statistics.
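The abstract's point that many reconstructions are generally compatible with the released statistics can be illustrated with a toy sketch. This is not the authors' experiment and does not use real Census tables: the two attributes, the four-person block, the released counts, and the function name `compatible_reconstructions` are all hypothetical, chosen only to make the ambiguity visible.

```python
from collections import Counter
from itertools import combinations_with_replacement, product

# Hypothetical attribute domains for a toy census block.
SEXES = ("F", "M")
AGES = ("young", "old")


def compatible_reconstructions(sex_counts, age_counts):
    """Enumerate every multiset of (sex, age) records whose one-way
    marginal counts match the released counts exactly."""
    n = sum(sex_counts.values())
    record_types = list(product(SEXES, AGES))
    matches = []
    # Each candidate is an unordered collection of n records.
    for candidate in combinations_with_replacement(record_types, n):
        sexes = Counter(sex for sex, _ in candidate)
        ages = Counter(age for _, age in candidate)
        sex_ok = all(sexes[s] == sex_counts.get(s, 0) for s in SEXES)
        age_ok = all(ages[a] == age_counts.get(a, 0) for a in AGES)
        if sex_ok and age_ok:
            matches.append(candidate)
    return matches


# Assumed released aggregates for a block of four respondents.
released_sex = {"F": 2, "M": 2}
released_age = {"young": 2, "old": 2}

reconstructions = compatible_reconstructions(released_sex, released_age)
print(len(reconstructions))
for records in reconstructions:
    print(records)
```

In this toy case three distinct sets of records reproduce the same released counts, so the aggregates alone do not single out the original data. Real releases contain many more tables, which narrows the set of candidates, but the same kind of check applies.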

Related papers

Benchmarking Fraud Detectors on Private Graph Data [70.4654745317714]
Currently, many types of fraud are managed in part by automated detection algorithms that operate over graphs. We consider the scenario where a data holder wishes to outsource development of fraud detectors to third parties. Third parties submit their fraud detectors to the data holder, who evaluates these algorithms on a private dataset and then publicly communicates the results. We propose a realistic privacy attack on this system that allows an adversary to de-anonymize individuals' data based only on the evaluation results.
arXiv Detail & Related papers (2025-07-30T03:20:15Z)
Unifying Re-Identification, Attribute Inference, and Data Reconstruction Risks in Differential Privacy [18.92793740861912]
We show that bounds on attack success can take the same unified form across re-identification, attribute inference, and data reconstruction risks. Our results are tighter than prior methods using $\varepsilon$-DP, Rényi DP, and concentrated DP.
arXiv Detail & Related papers (2025-07-09T15:59:30Z)
Demystifying Trajectory Recovery From Ash: An Open-Source Evaluation and Enhancement [5.409124675229009]
This study reimplements the trajectory recovery attack from scratch and evaluates it on two open-source datasets. Results confirm that privacy leakage still exists despite common anonymisation and aggregation methods. We propose a stronger attack by designing a series of enhancements to the baseline attack.
arXiv Detail & Related papers (2024-09-23T01:06:41Z)
Quantifying Privacy Risks of Public Statistics to Residents of Subsidized Housing [28.493827954922885]
We show that respondents in subsidized housing may deliberately not mention unauthorized children and other household members for fear of being evicted. By combining public statistics from the Decennial Census and the Department of Housing and Urban Development, we demonstrate a simple, inexpensive reconstruction attack. Our results provide a valuable example for policymakers seeking a trustworthy, accurate census.
arXiv Detail & Related papers (2024-07-05T18:00:02Z)
Geometry-Aware Instrumental Variable Regression [56.16884466478886]
We propose a transport-based IV estimator that takes into account the geometry of the data manifold through data-derivative information. We provide a simple plug-and-play implementation of our method that performs on par with related estimators in standard settings.
arXiv Detail & Related papers (2024-05-19T17:49:33Z)
Visual Privacy Auditing with Diffusion Models [52.866433097406656]
We propose a reconstruction attack based on diffusion models (DMs) that assumes adversary access to real-world image priors. We show that (1) real-world data priors significantly influence reconstruction success, (2) current reconstruction bounds do not model the risk posed by data priors well, and (3) DMs can serve as effective auditing tools for visualizing privacy leakage.
arXiv Detail & Related papers (2024-03-12T12:18:55Z)
The 2010 Census Confidentiality Protections Failed, Here's How and Why [6.982581904789855]
We reconstruct five variables (census block, sex, age, race, and ethnicity) in the confidential 2010 Census person records. Using only published data, an attacker can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. We show that alternatives to the 2020 Census Disclosure Avoidance System with similar accuracy (enhanced swapping) also fail to protect confidentiality.
arXiv Detail & Related papers (2023-12-18T15:23:12Z)
Confidence-Ranked Reconstruction of Census Microdata from Published Statistics [45.39928315344449]
A reconstruction attack on a private dataset $D$ takes as input some publicly accessible information about the dataset. We show that our attacks can not only reconstruct full rows from the aggregate query statistics $Q(D) \in \mathbb{R}^m$, but can do so in a way that reliably ranks reconstructed rows by their odds of appearing in the private data. Our attacks significantly outperform those that are based only on access to a public distribution or population from which the private dataset $D$ was sampled.
arXiv Detail & Related papers (2022-11-06T14:08:43Z)
No Free Lunch in "Privacy for Free: How does Dataset Condensation Help Privacy" [75.98836424725437]
New methods designed to preserve data privacy require careful scrutiny. Failure to preserve privacy is hard to detect, and yet can lead to catastrophic results when a system implementing a "privacy-preserving" method is attacked.
arXiv Detail & Related papers (2022-09-29T17:50:23Z)
Releasing survey microdata with exact cluster locations and additional privacy safeguards [77.34726150561087]
We propose an alternative microdata dissemination strategy that leverages the utility of the original microdata with additional privacy safeguards. Our strategy reduces the respondents' re-identification risk for any number of disclosed attributes by 60-80% even under re-identification attempts.
arXiv Detail & Related papers (2022-05-24T19:37:11Z)
Assessing the risk of re-identification arising from an attack on anonymised data [0.24466725954625884]
We calculate the risk of re-identification arising from a malicious attack on an anonymised dataset. We present an analytical means of estimating the probability of re-identification of a single patient in a k-anonymised dataset. We generalize this solution to obtain the probability of multiple patients being re-identified.
arXiv Detail & Related papers (2022-03-31T09:47:05Z)
Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation. We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.