Unlocking High-Accuracy Differentially Private Image Classification through Scale
- URL: http://arxiv.org/abs/2204.13650v1
- Date: Thu, 28 Apr 2022 17:10:56 GMT
- Title: Unlocking High-Accuracy Differentially Private Image Classification through Scale
- Authors: Soham De, Leonard Berrada, Jamie Hayes, Samuel L. Smith, Borja Balle
- Abstract summary: Differential Privacy (DP) provides a formal privacy guarantee preventing adversaries with access to a machine learning model from extracting information about individual training points.
Previous works have found that DP-SGD often leads to a significant degradation in performance on standard image classification benchmarks.
We demonstrate that DP-SGD on over-parameterized models can perform significantly better than previously thought.
- Score: 45.93988209606857
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Differential Privacy (DP) provides a formal privacy guarantee preventing
adversaries with access to a machine learning model from extracting information
about individual training points. Differentially Private Stochastic Gradient
Descent (DP-SGD), the most popular DP training method, realizes this protection
by injecting noise during training. However, previous works have found that
DP-SGD often leads to a significant degradation in performance on standard
image classification benchmarks. Furthermore, some authors have postulated that
DP-SGD inherently performs poorly on large models, since the norm of the noise
required to preserve privacy is proportional to the model dimension. In
contrast, we demonstrate that DP-SGD on over-parameterized models can perform
significantly better than previously thought. Combining careful hyper-parameter
tuning with simple techniques to ensure signal propagation and improve the
convergence rate, we obtain a new SOTA on CIFAR-10 of 81.4% under (8,
10^{-5})-DP using a 40-layer Wide-ResNet, improving over the previous SOTA of
71.7%. When fine-tuning a pre-trained 200-layer Normalizer-Free ResNet, we
achieve a remarkable 77.1% top-1 accuracy on ImageNet under (1, 8*10^{-7})-DP,
and achieve 81.1% under (8, 8*10^{-7})-DP. This markedly exceeds the previous
SOTA of 47.9% under a larger privacy budget of (10, 10^{-6})-DP. We believe our
results are a significant step towards closing the accuracy gap between private
and non-private image classification.
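The mechanism described above, per-example gradient clipping followed by Gaussian noise injection, is simple enough to sketch directly. Below is a minimal NumPy illustration of one DP-SGD step, not the paper's implementation; the clipping norm, noise multiplier, and toy gradients are illustrative assumptions.

```python
import numpy as np

def dp_sgd_step(w, per_example_grads, clip_norm=1.0, noise_multiplier=1.0, lr=0.1):
    # Clip each example's gradient to L2 norm <= clip_norm so that any one
    # training point's influence on the update is bounded.
    clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
               for g in per_example_grads]
    avg = np.mean(clipped, axis=0)
    # Gaussian mechanism: noise std scales with clip_norm. The noise vector's
    # expected L2 norm grows with sqrt(model dimension), which is the scaling
    # argument the abstract pushes back against.
    noise = np.random.normal(0.0, noise_multiplier * clip_norm / len(clipped),
                             size=w.shape)
    return w - lr * (avg + noise)

# Toy usage: a 2-parameter model and four per-example gradients.
w = np.zeros(2)
grads = [np.array([3.0, 4.0]), np.array([0.5, -0.5]),
         np.array([-2.0, 1.0]), np.array([1.0, 1.0])]
print(dp_sgd_step(w, grads))
```

The privacy guarantee itself comes from accounting over all training steps; the noise multiplier is what connects this update rule to a target (epsilon, delta) budget.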
Related papers
- Training Large ASR Encoders with Differential Privacy [18.624449993983106]
Self-supervised learning (SSL) methods for large speech models have proven to be highly effective for ASR.
With growing interest in the public deployment of large pre-trained models, there is rising concern about unintended memorization and leakage of sensitive data points from the training data.
This paper is the first to apply differentially private (DP) pre-training to a SOTA Conformer-based encoder, and study its performance on a downstream ASR task assuming the fine-tuning data is public.
arXiv Detail & Related papers (2024-09-21T00:01:49Z)
- Pre-training Differentially Private Models with Limited Public Data [54.943023722114134]
Differential privacy (DP) is a prominent method for gauging the degree of privacy protection provided to a model.
Yet DP is not capable of protecting a substantial portion of the data used during the initial pre-training stage.
We develop a novel DP continual pre-training strategy using only 10% of public data.
Our strategy can achieve DP accuracy of 41.5% on ImageNet-21k, as well as non-DP accuracies of 55.7% and 60.0% on the downstream tasks Places365 and iNaturalist-2021.
arXiv Detail & Related papers (2024-02-28T23:26:27Z)
- Sparsity-Preserving Differentially Private Training of Large Embedding Models [67.29926605156788]
DP-SGD is a training algorithm that combines differential privacy with stochastic gradient descent.
Applying DP-SGD naively to embedding models can destroy gradient sparsity, leading to reduced training efficiency.
We present two new algorithms, DP-FEST and DP-AdaFEST, that preserve gradient sparsity during private training of large embedding models.
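As a toy illustration of the sparsity problem (a minimal NumPy sketch, not the DP-FEST or DP-AdaFEST algorithms; the vocabulary size and touched rows are made up): an embedding-table gradient is nonzero only in the rows of tokens appearing in the batch, but the Gaussian mechanism adds noise to every coordinate.

```python
import numpy as np

vocab, dim = 1000, 16
grad = np.zeros((vocab, dim))                # embedding-table gradient
grad[[3, 7, 42]] = np.random.randn(3, dim)   # only 3 rows touched by the batch
print(np.count_nonzero(grad.any(axis=1)))    # 3 nonzero rows: sparse

# Naive DP-SGD noises EVERY coordinate, so the privatized gradient
# is dense and the sparse-update efficiency advantage is lost.
noisy = grad + np.random.normal(0.0, 1.0, size=grad.shape)
print(np.count_nonzero(noisy.any(axis=1)))   # 1000 nonzero rows: dense
```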
arXiv Detail & Related papers (2023-11-14T17:59:51Z)
- Differentially Private Image Classification by Learning Priors from Random Processes [48.0766422536737]
In privacy-preserving machine learning, differentially private stochastic gradient descent (DP-SGD) performs worse than SGD due to per-sample gradient clipping and noise addition.
A recent focus in private learning research is improving the performance of DP-SGD on private data by incorporating priors that are learned on real-world public data.
In this work, we explore how we can improve the privacy-utility tradeoff of DP-SGD by learning priors from images generated by random processes and transferring these priors to private data.
arXiv Detail & Related papers (2023-06-08T04:14:32Z)
- TAN Without a Burn: Scaling Laws of DP-SGD [70.7364032297978]
Differentially private methods for training Deep Neural Networks (DNNs) have progressed recently.
We decouple the privacy analysis from the experimental behavior of noisy training to explore the privacy/utility trade-off with minimal computational requirements.
We apply the proposed method on CIFAR-10 and ImageNet and, in particular, strongly improve the state of the art on ImageNet with a +9 point gain in top-1 accuracy.
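Decoupling works because epsilon is a function of the noise multiplier, step count, and sampling rate alone; no training run is needed to evaluate it. The sketch below is a simplified Renyi-DP accountant for repeated Gaussian mechanisms, not the paper's method: it ignores privacy amplification by subsampling, which a full DP-SGD accountant includes, so it overestimates epsilon.

```python
import math

def gaussian_mechanism_epsilon(noise_multiplier, steps, delta):
    # Renyi DP of the Gaussian mechanism (sensitivity 1) at order alpha is
    # alpha / (2 * sigma^2); RDP composes additively over steps, and converts
    # to (eps, delta)-DP via eps = rdp + log(1/delta) / (alpha - 1).
    best = float("inf")
    for alpha in (1 + x / 10.0 for x in range(1, 1000)):  # alpha in (1, 101)
        rdp = steps * alpha / (2 * noise_multiplier ** 2)
        best = min(best, rdp + math.log(1 / delta) / (alpha - 1))
    return best

# Loose without subsampling amplification, but computable without training.
print(gaussian_mechanism_epsilon(noise_multiplier=1.0, steps=100, delta=1e-5))
```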
arXiv Detail & Related papers (2022-10-07T08:44:35Z)
- Large Scale Transfer Learning for Differentially Private Image Classification [51.10365553035979]
Differential Privacy (DP) provides a formal framework for training machine learning models with individual example level privacy.
Private training using DP-SGD protects against leakage by injecting noise into individual example gradients.
While this protection is quite appealing, the computational cost of training large-scale models with DP-SGD is substantially higher than that of non-private training.
arXiv Detail & Related papers (2022-05-06T01:22:20Z)