Unlocking Accuracy and Fairness in Differentially Private Image
Classification
- URL: http://arxiv.org/abs/2308.10888v1
- Date: Mon, 21 Aug 2023 17:42:33 GMT
- Title: Unlocking Accuracy and Fairness in Differentially Private Image
Classification
- Authors: Leonard Berrada, Soham De, Judy Hanwen Shen, Jamie Hayes, Robert
Stanforth, David Stutz, Pushmeet Kohli, Samuel L. Smith, Borja Balle
- Abstract summary: Differential privacy (DP) is considered the gold standard framework for privacy-preserving training.
We show that pre-trained foundation models fine-tuned with DP can achieve similar accuracy to non-private classifiers.
- Score: 43.53494043189235
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Privacy-preserving machine learning aims to train models on private data
without leaking sensitive information. Differential privacy (DP) is considered
the gold standard framework for privacy-preserving training, as it provides
formal privacy guarantees. However, compared to their non-private counterparts,
models trained with DP often have significantly reduced accuracy. Private
classifiers are also believed to exhibit larger performance disparities across
subpopulations, raising fairness concerns. The poor performance of classifiers
trained with DP has prevented the widespread adoption of privacy-preserving
machine learning in industry. Here we show that pre-trained foundation models
fine-tuned with DP can achieve similar accuracy to non-private classifiers,
even in the presence of significant distribution shifts between pre-training
data and downstream tasks. We achieve private accuracies within a few percent
of the non-private state of the art across four datasets, including two medical
imaging benchmarks. Furthermore, our private medical classifiers do not exhibit
larger performance disparities across demographic groups than non-private
models. This milestone towards making DP training a practical and reliable
technology has the potential to enable machine learning practitioners to train
safely on sensitive datasets while protecting individuals' privacy.
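To make the recipe concrete, below is a minimal, hedged sketch of DP transfer learning in the spirit of the paper: features are extracted with a frozen pre-trained backbone, and a linear classifier is trained on them with DP-SGD via Opacus. The torchvision ResNet-50 backbone, the toy dataset, and all hyperparameters (target epsilon, clipping norm, learning rate) are illustrative assumptions, not the authors' actual setup.

```python
# Hedged sketch of private transfer learning: frozen pre-trained features +
# a linear head trained with DP-SGD (Opacus). Backbone, data, and
# hyperparameters are stand-ins, not the paper's exact recipe.
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset
from torchvision import models
from opacus import PrivacyEngine

device = "cuda" if torch.cuda.is_available() else "cpu"

# 1) Frozen pre-trained backbone used only as a feature extractor.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = nn.Identity()
backbone.eval().to(device)

# Toy stand-in for the sensitive downstream dataset (replace with real data).
images = torch.randn(512, 3, 224, 224)
labels = torch.randint(0, 10, (512,))
with torch.no_grad():
    feats = torch.cat([backbone(b.to(device)).cpu() for b in images.split(64)])
train_loader = DataLoader(TensorDataset(feats, labels), batch_size=128)

# 2) Linear head trained with DP-SGD (per-example clipping + Gaussian noise).
head = nn.Linear(feats.shape[1], 10).to(device)
optimizer = optim.SGD(head.parameters(), lr=1.0)
privacy_engine = PrivacyEngine()
head, optimizer, train_loader = privacy_engine.make_private_with_epsilon(
    module=head,
    optimizer=optimizer,
    data_loader=train_loader,
    target_epsilon=8.0,    # (epsilon, delta) privacy budget, assumed
    target_delta=1e-5,
    epochs=10,
    max_grad_norm=1.0,     # per-example gradient clipping norm
)

criterion = nn.CrossEntropyLoss()
for _ in range(10):
    for xb, yb in train_loader:
        optimizer.zero_grad()
        criterion(head(xb.to(device)), yb.to(device)).backward()
        optimizer.step()
print(f"spent epsilon: {privacy_engine.get_epsilon(delta=1e-5):.2f}")
```

Opacus handles per-example gradient clipping and noise calibration internally; the same pattern extends to fine-tuning more of the network when every trainable module is DP-compatible.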
Related papers
- Beyond the Mean: Differentially Private Prototypes for Private Transfer Learning [16.028575596905554]
We propose Differentially Private Prototype Learning (DPPL) as a new paradigm for private transfer learning.
DPPL generates prototypes that represent each private class in the embedding space and can be publicly released for inference.
We show that privacy-utility trade-offs can be further improved when leveraging the public data beyond pre-training of the encoder.
arXiv Detail & Related papers (2024-06-12T09:41:12Z)
- Private, fair and accurate: Training large-scale, privacy-preserving AI models in medical imaging [47.99192239793597]
We evaluated the effect of privacy-preserving training of AI models regarding accuracy and fairness compared to non-private training.
Our study shows that -- under the challenging realistic circumstances of a real-life clinical dataset -- the privacy-preserving training of diagnostic deep learning models is possible with excellent diagnostic accuracy and fairness.
arXiv Detail & Related papers (2023-02-03T09:49:13Z)
- Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining [75.25943383604266]
We question whether the use of large Web-scraped datasets should be viewed as differential-privacy-preserving.
We caution that publicizing these models pretrained on Web data as "private" could lead to harm and erode the public's trust in differential privacy as a meaningful definition of privacy.
We conclude by discussing potential paths forward for the field of private learning, as public pretraining becomes more popular and powerful.
arXiv Detail & Related papers (2022-12-13T10:41:12Z)
- Large Scale Transfer Learning for Differentially Private Image Classification [51.10365553035979]
Differential Privacy (DP) provides a formal framework for training machine learning models with individual example-level privacy.
Private training with DP-SGD protects against leakage by clipping individual example gradients and injecting noise into them (a minimal sketch of this step appears after this list).
While these guarantees are appealing, the computational cost of training large-scale models with DP-SGD is substantially higher than that of non-private training.
arXiv Detail & Related papers (2022-05-06T01:22:20Z)
- Mixed Differential Privacy in Computer Vision [133.68363478737058]
AdaMix is an adaptive differentially private algorithm for training deep neural network classifiers using both private and public image data.
A few-shot or even zero-shot learning baseline that ignores private data can outperform fine-tuning on a large private dataset.
arXiv Detail & Related papers (2022-03-22T06:15:43Z)
- Personalized PATE: Differential Privacy for Machine Learning with Individual Privacy Guarantees [1.2691047660244335]
We propose three novel methods to support training an ML model with different personalized privacy guarantees within the training data.
Our experiments show that our personalized privacy methods yield higher accuracy models than the non-personalized baseline.
arXiv Detail & Related papers (2022-02-21T20:16:27Z)
- Don't Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence [73.14373832423156]
We propose DP-Sinkhorn, a novel optimal transport-based generative method for learning data distributions from private data with differential privacy.
Unlike existing approaches for training differentially private generative models, we do not rely on adversarial objectives.
arXiv Detail & Related papers (2021-11-01T18:10:21Z)
- Chasing Your Long Tails: Differentially Private Prediction in Health Care Settings [34.26542589537452]
Methods for differentially private (DP) learning provide a general-purpose approach to learn models with privacy guarantees.
Modern methods for DP learning ensure privacy through mechanisms that censor information judged as too unique.
We use state-of-the-art methods for DP learning to train privacy-preserving models in clinical prediction tasks.
arXiv Detail & Related papers (2020-10-13T19:56:37Z)
- Private Knowledge Transfer via Model Distillation with Generative Adversarial Networks [7.0202040971648705]
A conventional deep learning model is prone to privacy attacks that can recover the sensitive information of individuals.
Recently, differential privacy, which offers provable privacy guarantees, has been proposed to train neural networks in a privacy-preserving manner to protect training data.
We present a novel private knowledge transfer strategy, where the private teacher trained on sensitive data is not publicly accessible but teaches a student to be publicly released.
arXiv Detail & Related papers (2020-04-05T12:55:01Z)
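For reference, the DP-SGD mechanism mentioned in the transfer-learning entry above can be sketched in a few lines: each example's gradient is clipped to a fixed norm C, the clipped gradients are summed, Gaussian noise with standard deviation sigma * C is added, and the noisy average drives an ordinary SGD step. The tiny linear model and all constants below are illustrative assumptions.

```python
# Hedged sketch of a single DP-SGD update on a tiny linear model:
# clip each per-example gradient to norm C, sum, add Gaussian noise with
# std sigma * C, average, then take a standard SGD step. Values are illustrative.
import torch

torch.manual_seed(0)
w = torch.zeros(5, requires_grad=True)          # model parameters
xs, ys = torch.randn(8, 5), torch.randn(8)      # a batch of 8 private examples
clip_norm, sigma, lr = 1.0, 1.1, 0.1            # C, noise multiplier, step size

clipped_sum = torch.zeros_like(w)
for x, y in zip(xs, ys):                        # per-example gradients
    loss = (x @ w - y) ** 2
    (g,) = torch.autograd.grad(loss, w)
    g = g * min(1.0, clip_norm / (g.norm().item() + 1e-12))  # clip to norm <= C
    clipped_sum += g

noise = sigma * clip_norm * torch.randn_like(w)  # calibrated Gaussian noise
noisy_mean_grad = (clipped_sum + noise) / len(xs)

with torch.no_grad():
    w -= lr * noisy_mean_grad                    # ordinary SGD step on noisy grad
```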
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.