Why Is Public Pretraining Necessary for Private Model Training?
- URL: http://arxiv.org/abs/2302.09483v1
- Date: Sun, 19 Feb 2023 05:32:20 GMT
- Title: Why Is Public Pretraining Necessary for Private Model Training?
- Authors: Arun Ganesh, Mahdi Haghifam, Milad Nasr, Sewoong Oh, Thomas Steinke,
Om Thakkar, Abhradeep Thakurta, Lun Wang
- Abstract summary: We show that pretraining on publicly available data leads to distinct gains over nonprivate settings.
We argue that the tradeoff may be a deeper loss model that requires an algorithm to go through two phases.
Guided by intuition, we provide theoretical constructions that provably demonstrate the separation between private with and without public pretraining.
- Score: 50.054565310457306
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the privacy-utility tradeoff of a model trained on benchmark language and
vision tasks, remarkable improvements have been widely reported with the use of
pretraining on publicly available data. This is in part due to the benefits of
transfer learning, which is the standard motivation for pretraining in
non-private settings. However, the stark contrast in the improvement achieved
through pretraining under privacy compared to non-private settings suggests
that there may be a deeper, distinct cause driving these gains. To explain this
phenomenon, we hypothesize that the non-convex loss landscape of model
training requires the optimization algorithm to go through two phases. In
the first, the algorithm needs to select a good "basin" in the loss landscape.
In the second, the algorithm solves an easy optimization within that basin. The
former is a harder problem to solve with private data, while the latter is
harder to solve with public data due to a distribution shift or data scarcity.
Guided by this intuition, we provide theoretical constructions that provably
demonstrate the separation between private training with and without public
pretraining. Further, systematic experiments on CIFAR10 and LibriSpeech provide
supporting evidence for our hypothesis.
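To make the two-phase picture concrete, here is a minimal sketch of the pipeline the abstract describes: non-private pretraining on public data selects a basin, then DP-SGD fine-tuning on private data solves the easier problem inside it. The model, loaders, and hyperparameters are placeholders, and the clipping-and-noising step is the standard DP-SGD recipe (Abadi et al., 2016), not this paper's theoretical construction.

```python
import torch
from torch import nn

def pretrain_public(model, public_loader, epochs=1, lr=1e-3):
    """Phase 1: ordinary (non-private) SGD on public data selects a basin."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in public_loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

def dp_finetune_private(model, private_loader, epochs=1, lr=0.1,
                        clip_norm=1.0, noise_multiplier=1.1):
    """Phase 2: DP-SGD on private data. Each example's gradient is clipped
    to L2 norm clip_norm, then Gaussian noise scaled to the clip is added.
    Privacy accounting (the resulting epsilon) is omitted for brevity."""
    params = [p for p in model.parameters() if p.requires_grad]
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in private_loader:
            summed = [torch.zeros_like(p) for p in params]
            for i in range(x.shape[0]):          # per-example gradients
                model.zero_grad()
                loss_fn(model(x[i:i+1]), y[i:i+1]).backward()
                norm = torch.sqrt(sum(p.grad.pow(2).sum() for p in params))
                scale = min(1.0, clip_norm / (norm.item() + 1e-12))
                for s, p in zip(summed, params):
                    s.add_(p.grad, alpha=scale)  # accumulate clipped grad
            with torch.no_grad():
                for p, s in zip(params, summed):
                    noise = torch.normal(0.0, noise_multiplier * clip_norm,
                                         size=s.shape)
                    p.add_(s + noise, alpha=-lr / x.shape[0])
```

The intuition maps onto this sketch directly: the noise added in phase 2 is tolerable once the iterate is already inside a good basin, but the same noise can derail basin selection if that phase is also run on private data.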
Related papers
- On the Benefits of Public Representations for Private Transfer Learning under Distribution Shift [40.553022057469285]
We show that public pretraining can improve private training accuracy by up to 67% over private training from scratch.
We provide a theoretical explanation for this phenomenon, showing that if the public and private data share a low-dimensional representation, public representations can improve the sample complexity of private training (see the toy demo after this entry).
arXiv Detail & Related papers (2023-12-24T21:46:14Z)
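To see why a shared low-dimensional representation helps, here is a toy linear demo (entirely my own construction for illustration, not the model from the paper above): once public data pins down a k-dimensional subspace, private training only has to fit k coefficients instead of d, so far fewer private samples suffice.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 100, 3                                   # ambient dim, shared rep dim
U, _ = np.linalg.qr(rng.normal(size=(d, k)))    # ground-truth representation

# Public phase: many samples from 5 public tasks that share the representation.
X_pub = rng.normal(size=(2000, d))
Y_pub = X_pub @ U @ rng.normal(size=(k, 5))
B, *_ = np.linalg.lstsq(X_pub, Y_pub, rcond=None)        # recovers U @ W
U_hat = np.linalg.svd(B, full_matrices=False)[0][:, :k]  # span(U_hat) ~ span(U)

# Private phase: only 15 samples, yet only k = 3 coefficients remain to fit.
w_priv = rng.normal(size=k)
X_priv = rng.normal(size=(15, d))
w_hat, *_ = np.linalg.lstsq(X_priv @ U_hat, X_priv @ U @ w_priv, rcond=None)

# Near-zero test error; fitting all 100 dims from 15 samples would fail.
X_test = rng.normal(size=(500, d))
mse = np.mean((X_test @ U_hat @ w_hat - X_test @ U @ w_priv) ** 2)
print(f"test MSE with shared representation: {mse:.2e}")
```

DP noise added to a 3-dimensional fit also degrades it far less than noise added to a 100-dimensional one, which is the sample-complexity gain the entry refers to.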
- Optimal Differentially Private Model Training with Public Data [13.16576244790641]
Differential privacy (DP) ensures that training a machine learning model does not leak private data.
In practice, we may have access to auxiliary public data that is free of privacy concerns.
arXiv Detail & Related papers (2023-06-26T20:40:29Z)
- Can Public Large Language Models Help Private Cross-device Federated Learning? [58.05449579773249]
We study (differentially) private federated learning (FL) of language models.
Public data has been used to improve privacy-utility trade-offs for both large and small language models.
We propose a novel distribution matching algorithm with theoretical grounding to sample public data close to the private data distribution (a toy sketch follows this entry).
arXiv Detail & Related papers (2023-05-20T07:55:58Z)
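As a hedged illustration of distribution matching (a deliberately simple stand-in, not the algorithm from the paper above): privately estimate the mean embedding of the private data with the Gaussian mechanism, then keep the public examples closest to it.

```python
import numpy as np

def dp_mean_embedding(private_emb, clip=1.0, sigma=2.0, seed=0):
    """Gaussian-mechanism estimate of the private data's mean embedding.
    Each embedding is clipped to L2 norm `clip`, so noise of scale
    sigma * clip is calibrated to the sum's sensitivity (accounting
    omitted; the count len(private_emb) is treated as public here)."""
    rng = np.random.default_rng(seed)
    norms = np.linalg.norm(private_emb, axis=1, keepdims=True)
    clipped = private_emb * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    noisy_sum = clipped.sum(axis=0) + rng.normal(0.0, sigma * clip,
                                                 size=clipped.shape[1])
    return noisy_sum / len(private_emb)

def select_public(public_emb, private_mean, keep_frac=0.2):
    """Keep the public examples whose embeddings have the highest cosine
    similarity to the (privately estimated) private mean."""
    sims = public_emb @ private_mean / (
        np.linalg.norm(public_emb, axis=1) * np.linalg.norm(private_mean)
        + 1e-12)
    k = max(1, int(keep_frac * len(public_emb)))
    return np.argsort(-sims)[:k]
```

The actual algorithm matches richer statistics of the two distributions and accounts for every access to the private data; this sketch only conveys the shape of the idea.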
- Tight Auditing of Differentially Private Machine Learning [77.38590306275877]
For private machine learning, existing auditing mechanisms give tight privacy estimates only under implausible worst-case assumptions.
We design an improved auditing scheme that yields tight privacy estimates for natural (not adversarially crafted) datasets; a toy epsilon-estimation sketch follows this entry.
arXiv Detail & Related papers (2023-02-15T21:40:33Z)
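As a hedged illustration of what a privacy audit computes (a generic recipe, not the improved scheme from the paper above): run a membership-inference attack many times, then convert its true/false positive rates into an empirical lower bound on epsilon, since (epsilon, delta)-DP forces TPR <= exp(epsilon) * FPR + delta.

```python
import math

def empirical_epsilon_lower_bound(tpr, fpr, delta=1e-5):
    """Convert attack TPR/FPR into an empirical epsilon lower bound.
    (eps, delta)-DP implies TPR <= exp(eps)*FPR + delta and also
    (1 - FPR) <= exp(eps)*(1 - TPR) + delta, so both ratios bound eps."""
    candidates = []
    if fpr > 0 and tpr - delta > 0:
        candidates.append(math.log((tpr - delta) / fpr))
    if 1 - tpr > 0 and 1 - fpr - delta > 0:
        candidates.append(math.log((1 - fpr - delta) / (1 - tpr)))
    return max(candidates, default=0.0)

# e.g. an attack with TPR 0.8 at FPR 0.1 certifies eps >= log(7.9999) ~ 2.08
print(empirical_epsilon_lower_bound(0.8, 0.1))
```

Rigorous audits replace the raw TPR/FPR with confidence intervals (e.g. Clopper-Pearson) over many attack trials; the point estimate above is the optimistic version.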
- Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining [75.25943383604266]
We question whether the use of large Web-scraped datasets should be viewed as differential-privacy-preserving.
We caution that publicizing these models pretrained on Web data as "private" could lead to harm and erode the public's trust in differential privacy as a meaningful definition of privacy.
We conclude by discussing potential paths forward for the field of private learning, as public pretraining becomes more popular and powerful.
arXiv Detail & Related papers (2022-12-13T10:41:12Z)
- Fine-Tuning with Differential Privacy Necessitates an Additional Hyperparameter Search [38.83524780461911]
We show how carefully selecting the layers being fine-tuned in the pretrained neural network allows us to establish new state-of-the-art tradeoffs between privacy and accuracy.
We achieve 77.9% accuracy for $(\varepsilon, \delta) = (2, 10^{-5})$ on CIFAR-100 with a model pretrained on ImageNet; a minimal layer-freezing sketch follows this entry.
arXiv Detail & Related papers (2022-10-05T11:32:49Z)
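A minimal sketch of the extra hyperparameter that entry refers to (placeholder layer names; the paper's actual search procedure is more involved): treat "which pretrained layers to fine-tune" as a searchable choice by freezing everything else before private training.

```python
from torch import nn

def freeze_except(model: nn.Module, trainable_prefixes):
    """Freeze all parameters except those whose names start with one of the
    chosen prefixes. The subset of trainable layers then becomes an
    additional hyperparameter to search over before DP fine-tuning."""
    for name, p in model.named_parameters():
        p.requires_grad = any(name.startswith(pre)
                              for pre in trainable_prefixes)
    return [p for p in model.parameters() if p.requires_grad]

# e.g. fine-tune only the classifier head and the last block, privately
# (hypothetical prefixes for a ResNet-style model):
# trainable = freeze_except(pretrained_model, ["fc", "layer4"])
# then run DP-SGD (see the sketch under the main abstract) on `trainable`
```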
- Large Scale Transfer Learning for Differentially Private Image Classification [51.10365553035979]
Differential Privacy (DP) provides a formal framework for training machine learning models with individual example level privacy.
Private training using DP-SGD protects against leakage by injecting noise into individual example gradients.
While this result is quite appealing, the computational cost of training large-scale models with DP-SGD is substantially higher than non-private training.
arXiv Detail & Related papers (2022-05-06T01:22:20Z)
- Mixed Differential Privacy in Computer Vision [133.68363478737058]
AdaMix is an adaptive differentially private algorithm for training deep neural network classifiers using both private and public image data.
A few-shot or even zero-shot learning baseline that ignores private data can outperform fine-tuning on a large private dataset.
arXiv Detail & Related papers (2022-03-22T06:15:43Z)
- Constrained Differentially Private Federated Learning for Low-bandwidth Devices [1.1470070927586016]
This paper presents a novel privacy-preserving federated learning scheme.
It provides theoretical privacy guarantees, as it is based on Differential Privacy.
It reduces the upstream and downstream bandwidth by up to 99.9% compared to standard federated learning.
arXiv Detail & Related papers (2021-02-27T22:25:06Z)