Practical and Ready-to-Use Methodology to Assess the re-identification Risk in Anonymized Datasets
- URL: http://arxiv.org/abs/2501.10841v1
- Date: Sat, 18 Jan 2025 18:22:27 GMT
- Title: Practical and Ready-to-Use Methodology to Assess the re-identification Risk in Anonymized Datasets
- Authors: Louis-Philippe Sondeck, Maryline Laurent
- Abstract summary: This paper proposes a practical and ready-to-use methodology for re-identification risk assessment.
It is the first to follow well-known risk analysis methods (e.g. EBIOS) that have been used in the cybersecurity field for years.
- Score: 1.4732811715354455
- Abstract: To prove that a dataset is sufficiently anonymized, many privacy policies suggest that a re-identification risk assessment be performed, but do not provide a precise methodology for doing so, leaving the industry alone with the problem. This paper proposes a practical and ready-to-use methodology for re-identification risk assessment, the originality of which is manifold: (1) it is the first to follow well-known risk analysis methods (e.g. EBIOS) that have been used in the cybersecurity field for years, which consider not only the ability to perform an attack, but also the impact such an attack can have on an individual; (2) it is the first to qualify attributes and values of attributes with e.g. degree of exposure, as known real-world attacks mainly target certain types of attributes and not others.
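A minimal sketch of how such a two-axis assessment could be organized in code, assuming a simple ordinal scale: attack likelihood is driven by each attribute's degree of exposure and is combined with the impact an attack would have on the individual. The attribute names, exposure levels, and scoring scale below are illustrative assumptions, not the paper's actual tables.
```python
# Illustrative EBIOS-style risk scoring: risk = likelihood x impact, where the
# likelihood of a re-identification attack depends on how exposed an attribute
# is in known real-world attacks. All levels and attributes are assumptions.

LIKELIHOOD = {"low": 1, "medium": 2, "high": 3}        # ability to perform the attack
IMPACT = {"negligible": 1, "limited": 2, "severe": 3}  # harm to the individual

# Degree of exposure per attribute: how often real-world attacks target it.
ATTRIBUTE_EXPOSURE = {
    "zip_code": "high",        # classic quasi-identifier
    "birth_date": "high",
    "diagnosis": "medium",
    "favorite_color": "low",
}

def risk_level(attribute: str, impact: str) -> int:
    """Combine attack likelihood (driven by attribute exposure) with impact."""
    return LIKELIHOOD[ATTRIBUTE_EXPOSURE[attribute]] * IMPACT[impact]

for attr in ATTRIBUTE_EXPOSURE:
    print(f"{attr}: risk={risk_level(attr, 'severe')}")
```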
Related papers
- Local Features Meet Stochastic Anonymization: Revolutionizing Privacy-Preserving Face Recognition for Black-Box Models [54.88064975480573]
The task of privacy-preserving face recognition (PPFR) currently faces two major unsolved challenges.
By disrupting global features while enhancing local features, we achieve effective recognition even in black-box environments.
Our method achieves an average recognition accuracy of 94.21% on black-box models, outperforming existing methods in both privacy protection and anti-reconstruction capabilities.
arXiv Detail & Related papers (2024-12-11T10:49:15Z)
- A Human-Centered Risk Evaluation of Biometric Systems Using Conjoint Analysis [0.6199770411242359]
This paper presents a novel human-centered risk evaluation framework using conjoint analysis to quantify the impact of risk factors, such as surveillance cameras, on an attacker's motivation.
Our framework calculates risk values incorporating the False Acceptance Rate (FAR) and attack probability, allowing comprehensive comparisons across use cases.
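As a rough illustration of the risk combination described above, one could multiply the FAR by an estimated attack probability; the multiplicative form and the numbers below are assumptions for demonstration (the paper derives attack probabilities from conjoint analysis).
```python
# Hedged sketch: a risk value combining the biometric system's False
# Acceptance Rate (FAR) with an estimated attack probability. The
# multiplicative form and example figures are illustrative assumptions.

def biometric_risk(far: float, attack_probability: float) -> float:
    """Probability that an attack is both attempted and falsely accepted."""
    return far * attack_probability

# Example: FAR of 0.1% and a 5% attack probability (e.g. lowered by visible
# surveillance cameras deterring attackers) yields a risk value of 5e-05.
print(biometric_risk(far=0.001, attack_probability=0.05))
```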
arXiv Detail & Related papers (2024-09-17T14:18:21Z)
- EARBench: Towards Evaluating Physical Risk Awareness for Task Planning of Foundation Model-based Embodied AI Agents [53.717918131568936]
Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction.
Foundation models as the "brain" of EAI agents for high-level task planning have shown promising results.
However, the deployment of these agents in physical environments presents significant safety challenges.
This study introduces EARBench, a novel framework for automated physical risk assessment in EAI scenarios.
arXiv Detail & Related papers (2024-08-08T13:19:37Z)
- Visual Privacy Auditing with Diffusion Models [52.866433097406656]
We propose a reconstruction attack based on diffusion models (DMs) that assumes adversary access to real-world image priors.
We show that (1) real-world data priors significantly influence reconstruction success, (2) current reconstruction bounds do not model the risk posed by data priors well, and (3) DMs can serve as effective auditing tools for visualizing privacy leakage.
arXiv Detail & Related papers (2024-03-12T12:18:55Z)
- Model Stealing Attack against Graph Classification with Authenticity, Uncertainty and Diversity [80.16488817177182]
Graph neural networks (GNNs) are vulnerable to model stealing attacks, in which an adversary duplicates the target model through query access.
We introduce three model stealing attacks to adapt to different actual scenarios.
arXiv Detail & Related papers (2023-12-18T05:42:31Z)
- Measuring Re-identification Risk [72.6715574626418]
We present a new theoretical framework to measure re-identification risk in compact user representations.
Our framework formally bounds the probability that an attacker may be able to obtain the identity of a user from their representation.
We show how our framework is general enough to model important real-world applications such as Chrome's Topics API for interest-based advertising.
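The paper derives formal bounds; as an illustrative stand-in, the Monte Carlo sketch below estimates an empirical re-identification rate when an attacker links noisy released representations back to known user profiles by nearest-neighbour matching. The Gaussian data model and parameters are assumptions, not the paper's framework.
```python
# Empirical re-identification sketch: an attacker matches each released
# (noisy) user representation to the nearest known profile. The data model
# and parameter choices are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_users, dim, noise = 300, 16, 0.5

profiles = rng.normal(size=(n_users, dim))                      # known profiles
released = profiles + noise * rng.normal(size=(n_users, dim))   # noisy releases

# Nearest-neighbour linkage: guess the profile closest to each release.
dists = np.linalg.norm(released[:, None, :] - profiles[None, :, :], axis=-1)
guesses = dists.argmin(axis=1)
print("empirical re-identification rate:", (guesses == np.arange(n_users)).mean())
```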
arXiv Detail & Related papers (2023-04-12T16:27:36Z)
- Data AUDIT: Identifying Attribute Utility- and Detectability-Induced Bias in Task Models [8.420252576694583]
We present a first technique for the rigorous, quantitative screening of medical image datasets.
Our method decomposes the risks associated with dataset attributes in terms of their detectability and utility.
We show that our screening method reliably identifies nearly imperceptible bias-inducing artifacts.
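A toy sketch of the detectability/utility idea on synthetic tabular data, assuming simple proxies: detectability as the accuracy of a probe classifier predicting the attribute, and utility as the task-performance drop when the attribute-leaking feature is withheld. Both proxies are assumptions for illustration, not the paper's estimators.
```python
# Toy decomposition: "detectability" = probe accuracy for the hidden attribute;
# "utility" = task accuracy lost when the attribute-leaking feature is removed.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n = 2000
attribute = rng.integers(0, 2, n)              # hidden attribute (e.g. scanner ID)
features = rng.normal(size=(n, 5))
features[:, 0] += 1.5 * attribute              # the attribute leaks into feature 0
task_label = (features[:, 1] + 0.5 * attribute > 0).astype(int)

detectability = cross_val_score(LogisticRegression(), features, attribute, cv=5).mean()
acc_full = cross_val_score(LogisticRegression(), features, task_label, cv=5).mean()
acc_without = cross_val_score(LogisticRegression(), features[:, 1:], task_label, cv=5).mean()
print(f"detectability={detectability:.2f}, utility gap={acc_full - acc_without:.2f}")
```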
arXiv Detail & Related papers (2023-04-06T16:50:15Z)
- A False Sense of Privacy: Towards a Reliable Evaluation Methodology for the Anonymization of Biometric Data [8.799600976940678]
Biometric data contains distinctive human traits such as facial features or gait patterns.
Anonymization is widely used to protect the privacy of such data.
We assess the state-of-the-art methods used to evaluate the performance of anonymization.
arXiv Detail & Related papers (2023-04-04T08:46:14Z)
- Debiasing Recommendation by Learning Identifiable Latent Confounders [49.16119112336605]
Confounding bias arises due to the presence of unmeasured variables that can affect both a user's exposure and feedback.
Existing methods either (1) make untenable assumptions about these unmeasured variables or (2) directly infer latent confounders from users' exposure.
We propose a novel method, the identifiable deconfounder (iDCF), which leverages a set of proxy variables to resolve this non-identification issue.
arXiv Detail & Related papers (2023-02-10T05:10:26Z)
- Assessing the risk of re-identification arising from an attack on anonymised data [0.24466725954625884]
We calculate the risk of re-identification arising from a malicious attack on an anonymised dataset.
We present an analytical means of estimating the probability of re-identification of a single patient in a k-anonymised dataset.
We generalize this solution to obtain the probability of multiple patients being re-identified.
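For intuition (a simplified bound, not the paper's exact derivation): once an attacker matches a targeted patient's quasi-identifiers, k-anonymity guarantees an equivalence class of at least k records, so a uniform guess succeeds with probability at most 1/k; for m independently targeted patients, the probability of at least one success follows directly. The independence assumption is ours.
```python
# Simplified k-anonymity bound: a targeted patient hides in an equivalence
# class of >= k records, so a uniform guess re-identifies them with
# probability at most 1/k. Independence across m targets is an assumption.

def p_single(k: int) -> float:
    """Upper bound on re-identifying one targeted patient."""
    return 1.0 / k

def p_at_least_one(k: int, m: int) -> float:
    """P(at least one of m independently targeted patients is re-identified)."""
    return 1.0 - (1.0 - p_single(k)) ** m

print(p_single(5))            # 0.2
print(p_at_least_one(5, 10))  # ~0.89
```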
arXiv Detail & Related papers (2022-03-31T09:47:05Z)
- Systematic Evaluation of Privacy Risks of Machine Learning Models [41.017707772150835]
We show that prior work on membership inference attacks may severely underestimate the privacy risks.
We first propose to benchmark membership inference privacy risks by improving existing non-neural network based inference attacks.
We then introduce a new approach for fine-grained privacy analysis by formulating and deriving a new metric called the privacy risk score.
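A minimal sketch of a per-sample privacy risk score as a posterior membership probability via Bayes' rule, in the spirit of the metric described above; the likelihoods and prior below are illustrative assumptions rather than the paper's estimated distributions.
```python
# Sketch: privacy risk score = P(member | observed model behaviour), computed
# with Bayes' rule from assumed likelihoods under member/non-member data.

def privacy_risk_score(p_obs_member: float, p_obs_nonmember: float,
                       prior_member: float = 0.5) -> float:
    """Posterior probability that a sample was in the training set."""
    num = p_obs_member * prior_member
    return num / (num + p_obs_nonmember * (1.0 - prior_member))

# Example: behaviour 4x more likely under members -> risk score 0.8.
print(privacy_risk_score(p_obs_member=0.4, p_obs_nonmember=0.1))
```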
arXiv Detail & Related papers (2020-03-24T00:53:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.