Reconciling AI Performance and Data Reconstruction Resilience for
Medical Imaging
- URL: http://arxiv.org/abs/2312.04590v1
- Date: Tue, 5 Dec 2023 12:21:30 GMT
- Title: Reconciling AI Performance and Data Reconstruction Resilience for
Medical Imaging
- Authors: Alexander Ziller, Tamara T. Mueller, Simon Stieger, Leonhard Feiner,
Johannes Brandt, Rickmer Braren, Daniel Rueckert, Georgios Kaissis
- Abstract summary: Artificial Intelligence (AI) models are vulnerable to information leakage of their training data, which can be highly sensitive.
Differential Privacy (DP) aims to circumvent these susceptibilities by setting a quantifiable privacy budget.
We show that using very large privacy budgets can render reconstruction attacks impossible, while drops in performance are negligible.
- Score: 52.578054703818125
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Artificial Intelligence (AI) models are vulnerable to information leakage of
their training data, which can be highly sensitive, for example in medical
imaging. Privacy Enhancing Technologies (PETs), such as Differential Privacy
(DP), aim to circumvent these susceptibilities. DP is the strongest possible
protection for training models while bounding the risks of inferring the
inclusion of training samples or reconstructing the original data. DP achieves
this by setting a quantifiable privacy budget. Although a lower budget
decreases the risk of information leakage, it typically also reduces the
performance of such models. This imposes a trade-off between robust performance
and stringent privacy. Additionally, the interpretation of a privacy budget
remains abstract and challenging to contextualize. In this study, we contrast
the performance of AI models at various privacy budgets against both
theoretical risk bounds and the empirical success of reconstruction attacks. We
show that using very large privacy budgets can render reconstruction attacks
impossible, while drops in performance are negligible. We thus conclude that
not using DP -- at all -- is negligent when applying AI models to sensitive
data. We deem these results to lay a foundation for further debates on striking
a balance between privacy risks and model performance.
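The mechanism the abstract relies on, training under a quantifiable privacy budget, is typically realized with DP-SGD: clip each example's gradient, then add calibrated Gaussian noise. The sketch below is a minimal NumPy illustration of that recipe on a toy logistic regression, not the authors' pipeline; all hyperparameters (clip norm, noise multiplier, learning rate) are illustrative, and converting the noise multiplier into an (epsilon, delta) budget would require a separate privacy accountant such as the one shipped with libraries like Opacus.

```python
# Minimal DP-SGD sketch (NumPy only): per-example gradient clipping plus
# calibrated Gaussian noise, on a toy logistic-regression model. The
# clip_norm / noise_multiplier values are illustrative placeholders, not the
# paper's settings; turning them into an (epsilon, delta) budget requires a
# privacy accountant (e.g. an RDP accountant), which is omitted here.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 256 examples, 10 features, binary labels.
X = rng.normal(size=(256, 10))
y = (X[:, 0] + 0.1 * rng.normal(size=256) > 0).astype(float)

w = np.zeros(10)
clip_norm = 1.0          # per-example gradient clipping bound C
noise_multiplier = 1.1   # sigma; noise std is sigma * C
lr = 0.1
batch_size = 32

def per_example_grads(w, Xb, yb):
    """Gradient of the logistic loss for each example separately."""
    p = 1.0 / (1.0 + np.exp(-Xb @ w))
    return (p - yb)[:, None] * Xb          # shape: (batch, dim)

for step in range(200):
    idx = rng.choice(len(X), size=batch_size, replace=False)
    g = per_example_grads(w, X[idx], y[idx])

    # Clip each example's gradient to L2 norm <= clip_norm.
    norms = np.linalg.norm(g, axis=1, keepdims=True)
    g = g * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))

    # Sum, add calibrated Gaussian noise, then average over the batch.
    noisy_sum = g.sum(axis=0) + rng.normal(
        scale=noise_multiplier * clip_norm, size=w.shape)
    w -= lr * noisy_sum / batch_size

acc = np.mean((X @ w > 0) == (y > 0.5))
print(f"train accuracy under DP-SGD sketch: {acc:.3f}")
```

Lower noise multipliers correspond to larger privacy budgets, which is the regime the paper examines: how far the budget can grow before reconstruction attacks become feasible while task performance remains essentially unaffected.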
Related papers
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z)
- Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z)
- Visual Privacy Auditing with Diffusion Models [52.866433097406656]
We propose a reconstruction attack based on diffusion models (DMs) that assumes adversary access to real-world image priors.
We show that (1) real-world data priors significantly influence reconstruction success, (2) current reconstruction bounds do not model the risk posed by data priors well, and (3) DMs can serve as effective auditing tools for visualizing privacy leakage.
arXiv Detail & Related papers (2024-03-12T12:18:55Z)
- Privacy Constrained Fairness Estimation for Decision Trees [2.9906966931843093]
Measuring the fairness of any AI model requires the sensitive attributes of the individuals in the dataset.
We propose a novel method, dubbed Privacy-Aware Fairness Estimation of Rules (PAFER).
We show that, using the Laplacian mechanism, the method is able to estimate Statistical Parity (SP) with low error while guaranteeing the privacy of the individuals in the dataset with high certainty.
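As a generic illustration of how the Laplacian mechanism can privatize such a fairness estimate (a hedged sketch, not the PAFER algorithm; the group encoding, budget split, and epsilon value below are assumptions), statistical parity can be computed from four noisy counts:

```python
# Hedged sketch of Laplace-mechanism estimation of statistical parity (SP),
# i.e. |P(pred=1 | group 0) - P(pred=1 | group 1)|. Generic illustration of
# the mechanism only, not the PAFER method from the paper.
import numpy as np

rng = np.random.default_rng(0)

def dp_statistical_parity(preds, groups, epsilon):
    """Estimate SP under epsilon-DP by privatizing the four underlying counts.

    Each count query has sensitivity 1, so adding Laplace noise with scale
    1 / (epsilon / 4) to each of the four counts gives epsilon-DP overall by
    basic composition.
    """
    scale = 4.0 / epsilon
    rates = []
    for g in (0, 1):
        mask = groups == g
        n_pos = np.sum(preds[mask] == 1) + rng.laplace(scale=scale)
        n_tot = np.sum(mask) + rng.laplace(scale=scale)
        rates.append(np.clip(n_pos / max(n_tot, 1.0), 0.0, 1.0))
    return abs(rates[0] - rates[1])

# Synthetic example: 1000 individuals, binary sensitive attribute and predictions.
groups = rng.integers(0, 2, size=1000)
preds = rng.binomial(1, 0.6 - 0.2 * groups)   # group 1 receives positives less often
print("noisy SP estimate:", dp_statistical_parity(preds, groups, epsilon=1.0))
print("true SP:          ", abs(preds[groups == 0].mean() - preds[groups == 1].mean()))
```

Splitting the budget evenly across the four count queries is the simplest composition strategy; the paper's actual budget allocation may differ.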
arXiv Detail & Related papers (2023-12-13T14:54:48Z)
- TeD-SPAD: Temporal Distinctiveness for Self-supervised Privacy-preservation for Video Anomaly Detection [59.04634695294402]
Video anomaly detection (VAD) without human monitoring is a complex computer vision task.
Privacy leakage in VAD allows models to pick up and amplify unnecessary biases related to people's personal information.
We propose TeD-SPAD, a privacy-aware video anomaly detection framework that destroys visual private information in a self-supervised manner.
arXiv Detail & Related papers (2023-08-21T22:42:55Z)
- Discriminative Adversarial Privacy: Balancing Accuracy and Membership Privacy in Neural Networks [7.0895962209555465]
Discriminative Adversarial Privacy (DAP) is a learning technique designed to achieve a balance between model performance, speed, and privacy.
DAP relies on adversarial training with a novel loss function that minimises the prediction error while maximising the error of a membership inference attack (MIA).
In addition, we introduce a novel metric named Accuracy Over Privacy (AOP) to capture the performance-privacy trade-off.
arXiv Detail & Related papers (2023-06-05T17:25:45Z)
- Graphical vs. Deep Generative Models: Measuring the Impact of Differentially Private Mechanisms and Budgets on Utility [18.213030598476198]
We compare graphical and deep generative models, focusing on the key factors contributing to how privacy budgets are spent.
We find that graphical models distribute privacy budgets horizontally and thus cannot handle relatively wide datasets for a fixed training time.
Deep generative models spend their budgets per iteration, so their behavior is less predictable with varying dataset dimensions.
arXiv Detail & Related papers (2023-05-18T14:14:42Z)
- Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis Testing: A Lesson From Fano [83.5933307263932]
We study data reconstruction attacks for discrete data and analyze it under the framework of hypothesis testing.
We show that if the underlying private data takes values from a set of size $M$, then the target privacy parameter $\epsilon$ can be $O(\log M)$ before the adversary gains significant inferential power.
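A back-of-the-envelope way to see the $O(\log M)$ scaling (an illustrative argument assuming a pure $\epsilon$-DP mechanism and a uniform prior over the $M$ candidate values, not the paper's Fano-based analysis): $\epsilon$-DP bounds the likelihood ratio between any two candidate records by $e^{\epsilon}$, so the Bayes-optimal reconstruction posterior satisfies $\Pr[\hat{x} = x \mid y] = p(y \mid x) / \sum_{j=1}^{M} p(y \mid x_j) \le e^{\epsilon} / M$. The adversary therefore beats random guessing ($1/M$) by a large factor only once $e^{\epsilon}$ is comparable to $M$, i.e. when $\epsilon \approx \log M$.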
arXiv Detail & Related papers (2022-10-24T23:50:12Z)
- Improved Generalization Guarantees in Restricted Data Models [16.193776814471768]
Differential privacy is known to protect against threats to validity incurred due to adaptive, or exploratory, data analysis.
We show that, under this assumption, it is possible to "re-use" privacy budget on different portions of the data, significantly improving accuracy without increasing the risk of overfitting.
arXiv Detail & Related papers (2022-07-20T16:04:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.