Chasing Your Long Tails: Differentially Private Prediction in Health
Care Settings
- URL: http://arxiv.org/abs/2010.06667v1
- Date: Tue, 13 Oct 2020 19:56:37 GMT
- Title: Chasing Your Long Tails: Differentially Private Prediction in Health
Care Settings
- Authors: Vinith M. Suriyakumar, Nicolas Papernot, Anna Goldenberg, Marzyeh
Ghassemi
- Abstract summary: Methods for differentially private (DP) learning provide a general-purpose approach to learn models with privacy guarantees.
Modern methods for DP learning ensure privacy through mechanisms that censor information judged as too unique.
We use state-of-the-art methods for DP learning to train privacy-preserving models in clinical prediction tasks.
- Score: 34.26542589537452
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models in health care are often deployed in settings where
it is important to protect patient privacy. In such settings, methods for
differentially private (DP) learning provide a general-purpose approach to
learn models with privacy guarantees. Modern methods for DP learning ensure
privacy through mechanisms that censor information judged as too unique. The
resulting privacy-preserving models, therefore, neglect information from the
tails of a data distribution, resulting in a loss of accuracy that can
disproportionately affect small groups. In this paper, we study the effects of
DP learning in health care. We use state-of-the-art methods for DP learning to
train privacy-preserving models in clinical prediction tasks, including
classification of x-ray images and mortality prediction from time-series data. We use
these models to perform a comprehensive empirical investigation of the
tradeoffs between privacy, utility, robustness to dataset shift, and fairness.
Our results highlight lesser-known limitations of methods for DP learning in
health care, models that exhibit steep tradeoffs between privacy and utility,
and models whose predictions are disproportionately influenced by large
demographic groups in the training data. We discuss the costs and benefits of
differentially private learning in health care.
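The mechanism the abstract describes as censoring overly unique information is, in practice, typically DP-SGD: each training example's gradient is clipped to a fixed norm and Gaussian noise is added before the parameter update, which bounds how much any single (and especially any long-tail) record can influence the model. Below is a minimal, illustrative PyTorch sketch of one such step; the toy model, synthetic batch, and hyperparameters are placeholders and are not taken from the paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(20, 2)               # toy clinical-risk classifier (placeholder)
loss_fn = nn.CrossEntropyLoss()
clip_norm, noise_multiplier, lr = 1.0, 1.1, 0.05   # illustrative hyperparameters

x = torch.randn(8, 20)                 # one synthetic batch of 8 patient records
y = torch.randint(0, 2, (8,))

# Accumulate per-example gradients, clipping each to bound its influence.
summed = [torch.zeros_like(p) for p in model.parameters()]
for xi, yi in zip(x, y):
    model.zero_grad()
    loss_fn(model(xi.unsqueeze(0)), yi.unsqueeze(0)).backward()
    grads = [p.grad.detach().clone() for p in model.parameters()]
    total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    scale = min(1.0, clip_norm / (float(total_norm) + 1e-12))
    for s, g in zip(summed, grads):
        s.add_(g, alpha=scale)         # clipped contribution of this example

# Add Gaussian noise calibrated to the clipping norm, then take one SGD step.
with torch.no_grad():
    for p, s in zip(model.parameters(), summed):
        noise = torch.normal(0.0, noise_multiplier * clip_norm, size=p.shape)
        p.add_(-(lr / len(x)) * (s + noise))
```

In practice, libraries such as Opacus or TensorFlow Privacy implement the per-example clipping efficiently and track the cumulative privacy budget (epsilon, delta) across training steps.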
Related papers
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
Our method achieves over 20% improvement in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z)
- Masked Differential Privacy [64.32494202656801]
We propose an effective approach called masked differential privacy (DP), which allows control over the sensitive regions where differential privacy is applied.
Our method operates selectively on the data, allowing non-sensitive temporal regions to be defined without DP, or combining differential privacy with other privacy techniques within data samples.
arXiv Detail & Related papers (2024-10-22T15:22:53Z)
- LLM-based Privacy Data Augmentation Guided by Knowledge Distillation with a Distribution Tutor for Medical Text Classification [67.92145284679623]
We propose a DP-based tutor that models the noised private distribution and controls sample generation at a low privacy cost.
We theoretically analyze the model's privacy protection and verify it empirically.
arXiv Detail & Related papers (2024-02-26T11:52:55Z)
- Unlocking Accuracy and Fairness in Differentially Private Image Classification [43.53494043189235]
Differential privacy (DP) is considered the gold standard framework for privacy-preserving training.
We show that pre-trained foundation models fine-tuned with DP can achieve similar accuracy to non-private classifiers.
arXiv Detail & Related papers (2023-08-21T17:42:33Z)
- Independent Distribution Regularization for Private Graph Embedding [55.24441467292359]
Graph embeddings are susceptible to attribute inference attacks, which allow attackers to infer private node attributes from the learned graph embeddings.
To address these concerns, privacy-preserving graph embedding methods have emerged.
We propose a novel approach called Private Variational Graph AutoEncoders (PVGAE), which uses an independent distribution penalty as a regularization term.
arXiv Detail & Related papers (2023-08-16T13:32:43Z)
- Vision Through the Veil: Differential Privacy in Federated Learning for Medical Image Classification [15.382184404673389]
The proliferation of deep learning applications in healthcare calls for data aggregation across various institutions.
Privacy-preserving mechanisms are paramount in medical image analysis, where the data is sensitive in nature.
This study addresses the need by integrating differential privacy, a leading privacy-preserving technique, into a federated learning framework for medical image classification.
arXiv Detail & Related papers (2023-06-30T16:48:58Z)
- Training Private Models That Know What They Don't Know [40.19666295972155]
We find that several popular selective prediction approaches are ineffective in a differentially private setting.
We propose a novel evaluation mechanism that isolates selective prediction performance across model utility levels.
arXiv Detail & Related papers (2023-05-28T12:20:07Z)
- Private, fair and accurate: Training large-scale, privacy-preserving AI models in medical imaging [47.99192239793597]
We evaluated the effect of privacy-preserving training of AI models regarding accuracy and fairness compared to non-private training.
Our study shows that -- under the challenging realistic circumstances of a real-life clinical dataset -- the privacy-preserving training of diagnostic deep learning models is possible with excellent diagnostic accuracy and fairness.
arXiv Detail & Related papers (2023-02-03T09:49:13Z)
- Practical Challenges in Differentially-Private Federated Survival Analysis of Medical Data [57.19441629270029]
In this paper, we take advantage of the inherent properties of neural networks to federate the training of survival analysis models.
In the realistic setting of small medical datasets and only a few data centers, the noise added for differential privacy makes it harder for the models to converge.
We propose DPFed-post which adds a post-processing stage to the private federated learning scheme.
arXiv Detail & Related papers (2022-02-08T10:03:24Z)
- Anonymizing Data for Privacy-Preserving Federated Learning [3.3673553810697827]
We propose the first syntactic approach for offering privacy in the context of federated learning.
Our approach aims to maximize utility or model performance, while supporting a defensible level of privacy.
We perform a comprehensive empirical evaluation on two important problems in the healthcare domain, using real-world electronic health data of 1 million patients.
arXiv Detail & Related papers (2020-02-21T02:30:16Z)