When Does Differentially Private Learning Not Suffer in High Dimensions?
- URL: http://arxiv.org/abs/2207.00160v1
- Date: Fri, 1 Jul 2022 02:36:51 GMT
- Title: When Does Differentially Private Learning Not Suffer in High Dimensions?
- Authors: Xuechen Li, Daogao Liu, Tatsunori Hashimoto, Huseyin A. Inan,
Janardhan Kulkarni, Yin Tat Lee, Abhradeep Guha Thakurta
- Abstract summary: Large pretrained models can be privately fine-tuned to achieve performance approaching that of non-private models.
This seemingly contradicts known results on the model-size dependence of differentially private convex learning.
- Score: 43.833397016656704
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large pretrained models can be privately fine-tuned to achieve performance
approaching that of non-private models. A common theme in these results is the
surprising observation that high-dimensional models can achieve favorable
privacy-utility trade-offs. This seemingly contradicts known results on the
model-size dependence of differentially private convex learning and raises the
following research question: When does the performance of differentially
private learning not degrade with increasing model size? We identify that the
magnitude of gradients projected onto subspaces is a key factor that
determines performance. To precisely characterize this for private convex
learning, we introduce a condition on the objective that we term restricted
Lipschitz continuity and derive improved bounds for the excess empirical and
population risks that are dimension-independent under additional conditions. We
empirically show that in private fine-tuning of large language models,
gradients evaluated near a local optimum are mostly controlled by a few
principal components. This behavior is similar to conditions under which we
obtain dimension-independent bounds in convex settings. Our theoretical and
empirical results together provide a possible explanation for recent successes
in large-scale private fine-tuning.
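The abstract's empirical claim — that gradients evaluated near a local optimum are mostly controlled by a few principal components — can be illustrated with a short computation. The sketch below is not taken from the paper; the gradient matrix `per_step_grads`, the choice of `k`, and the synthetic data are assumptions made purely for illustration.

```python
# Minimal sketch (not from the paper): given gradients collected near a local
# optimum, measure how much of their mass lies in the span of the top-k
# principal components. A ratio close to 1 for small k mirrors the abstract's
# observation that gradients are mostly controlled by a few principal
# components. `per_step_grads` and `k` are illustrative placeholders.
import numpy as np

def top_k_gradient_mass(per_step_grads: np.ndarray, k: int) -> float:
    """per_step_grads: (num_steps, num_params) matrix of flattened gradients."""
    # Principal directions of the gradient matrix via a thin SVD.
    _, _, vt = np.linalg.svd(per_step_grads, full_matrices=False)
    top_k_basis = vt[:k]                      # (k, num_params), orthonormal rows
    projected = per_step_grads @ top_k_basis.T @ top_k_basis
    # Fraction of total squared gradient norm captured by the top-k subspace.
    return float(np.sum(projected ** 2) / np.sum(per_step_grads ** 2))

# Example with synthetic gradients that are approximately rank-5 plus noise.
rng = np.random.default_rng(0)
low_rank = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 10_000))
grads = low_rank + 0.01 * rng.normal(size=(200, 10_000))
print(top_k_gradient_mass(grads, k=5))   # close to 1.0
print(top_k_gradient_mass(grads, k=1))   # noticeably smaller
```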
Related papers
- Training Private Models That Know What They Don't Know [40.19666295972155]
We find that several popular selective prediction approaches are ineffective in a differentially private setting.
We propose a novel evaluation mechanism that isolates selective prediction performance across model utility levels.
arXiv Detail & Related papers (2023-05-28T12:20:07Z) - Private Gradient Estimation is Useful for Generative Modeling [25.777591229903596]
We present a new private generative modeling approach where samples are generated via Hamiltonian dynamics with gradients of the private dataset estimated by a well-trained network.
Our model is able to generate data with a resolution of 256x256.
arXiv Detail & Related papers (2023-05-18T02:51:17Z) - Enforcing Privacy in Distributed Learning with Performance Guarantees [57.14673504239551]
We study the privatization of distributed learning and optimization strategies.
We show that the popular additive random perturbation scheme degrades performance because it is not well-tuned to the graph structure.
arXiv Detail & Related papers (2023-01-16T13:03:27Z) - Differential Privacy has Bounded Impact on Fairness in Classification [7.022004731560844]
We study the impact of differential privacy on fairness in classification.
We prove that, given a class of models, popular group fairness measures are pointwise Lipschitz-continuous with respect to the parameters of the model.
arXiv Detail & Related papers (2022-10-28T16:19:26Z) - Large Scale Transfer Learning for Differentially Private Image
Classification [51.10365553035979]
Differential Privacy (DP) provides a formal framework for training machine learning models with individual example level privacy.
Private training using DP-SGD protects against leakage by injecting noise into individual example gradients.
While this result is quite appealing, the computational cost of training large-scale models with DP-SGD is substantially higher than non-private training.
arXiv Detail & Related papers (2022-05-06T01:22:20Z) - Learning to be adversarially robust and differentially private [42.7930886063265]
We study the difficulties in learning that arise from robust and differentially private optimization.
The data-dimensionality-dependent term introduced by private optimization compounds the difficulty of learning a robust model.
The size of the adversarial perturbation and the clipping norm in differential privacy both increase the curvature of the loss landscape, implying poorer performance.
arXiv Detail & Related papers (2022-01-06T22:33:06Z) - Large Language Models Can Be Strong Differentially Private Learners [70.0317718115406]
Differentially Private (DP) learning has seen limited success for building large deep learning models of text.
We show that this performance drop can be mitigated with the use of large pretrained models.
We propose a memory saving technique that allows clipping in DP-SGD to run without instantiating per-example gradients.
arXiv Detail & Related papers (2021-10-12T01:45:27Z) - The Influence of Dropout on Membership Inference in Differentially
Private Models [0.0]
Differentially private models seek to protect the privacy of data the model is trained on.
We conduct membership inference attacks against models with and without differential privacy.
arXiv Detail & Related papers (2021-03-16T12:09:51Z) - Do Not Let Privacy Overbill Utility: Gradient Embedding Perturbation for
Private Learning [74.73901662374921]
Differentially private training degrades utility drastically when the model comprises a large number of trainable parameters.
We propose an algorithm, Gradient Embedding Perturbation (GEP), for training differentially private deep models with decent accuracy.
arXiv Detail & Related papers (2021-02-25T04:29:58Z) - Understanding Gradient Clipping in Private SGD: A Geometric Perspective [68.61254575987013]
Deep learning models are increasingly popular in many machine learning applications where the training data may contain sensitive information.
Many learning systems now incorporate differential privacy by training their models with (differentially) private SGD.
A key step in each private SGD update is gradient clipping, which shrinks the gradient of an individual example whenever its L2 norm exceeds some threshold (a minimal sketch of this clip-and-noise step appears after this list).
arXiv Detail & Related papers (2020-06-27T19:08:12Z)
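Several entries above describe the per-example clip-and-noise step of DP-SGD. The sketch below illustrates that step under stated assumptions: it is plain NumPy rather than any particular library's API, and `dp_sgd_step`, `clip_norm`, and `noise_multiplier` are illustrative names introduced here.

```python
# Minimal sketch (an assumption, not a specific library's implementation):
# each example's gradient is rescaled so its L2 norm is at most `clip_norm`,
# the clipped gradients are summed, Gaussian noise scaled to the clipping
# threshold is added, and the averaged noisy gradient is used for the update.
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm=1.0,
                noise_multiplier=1.0, lr=0.1, rng=None):
    """per_example_grads: (batch_size, num_params) array of flattened gradients."""
    rng = rng or np.random.default_rng()
    batch_size = per_example_grads.shape[0]
    # Clip: shrink each example's gradient whenever its L2 norm exceeds clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale
    # Sum, add Gaussian noise proportional to the clipping threshold, then average.
    noisy_sum = clipped.sum(axis=0) + rng.normal(
        scale=noise_multiplier * clip_norm, size=clipped.shape[1])
    return params - lr * noisy_sum / batch_size
```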