Differentially Private Image Classification from Features
- URL: http://arxiv.org/abs/2211.13403v1
- Date: Thu, 24 Nov 2022 04:04:20 GMT
- Title: Differentially Private Image Classification from Features
- Authors: Harsh Mehta, Walid Krichene, Abhradeep Thakurta, Alexey Kurakin, Ashok Cutkosky
- Abstract summary: Leveraging transfer learning has been shown to be an effective strategy for training large models with Differential Privacy.
Recent works have found that privately training just the last layer of a pre-trained model provides the best utility with DP.
- Score: 53.75086935617644
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Leveraging transfer learning has recently been shown to be an effective
strategy for training large models with Differential Privacy (DP). Moreover,
somewhat surprisingly, recent works have found that privately training just the
last layer of a pre-trained model provides the best utility with DP. While past
studies largely rely on algorithms like DP-SGD for training large models, in
the specific case of privately learning from features, we observe that
computational burden is low enough to allow for more sophisticated optimization
schemes, including second-order methods. To that end, we systematically explore
the effect of design parameters such as loss function and optimization
algorithm. We find that, while commonly used logistic regression performs
better than linear regression in the non-private setting, the situation is
reversed in the private setting: linear regression is much more effective
than logistic regression on both privacy and computational grounds,
especially at stricter epsilon values ($\epsilon < 1$). On the optimization
side, we also explore using Newton's method, and find that second-order
information is quite helpful even with privacy, although the benefit
significantly diminishes with stricter privacy guarantees. While both methods
use second-order information, least squares is effective at lower $\epsilon$
values, while Newton's method is effective at larger ones. To combine the
benefits of both, we propose a novel algorithm called DP-FC, which leverages
feature covariance instead of the Hessian of the logistic regression loss and
performs well across all $\epsilon$ values we tried. With this, we obtain new
SOTA results on ImageNet-1k, CIFAR-100 and CIFAR-10 across all values of
$\epsilon$ typically considered. Most remarkably, on ImageNet-1k, we obtain
top-1 accuracy of 88% under $(8, 8 \times 10^{-7})$-DP and 84.3% under
$(0.1, 8 \times 10^{-7})$-DP.
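For concreteness, below is a minimal NumPy sketch of the covariance-based recipe that DP-FC builds on: privatize the sufficient statistics of least squares once, then solve. The clipping scheme, noise placement, and all names and defaults are illustrative assumptions, not the paper's exact algorithm; real noise scales must be calibrated to a target $(\epsilon, \delta)$ with a DP accountant.

```python
import numpy as np

def dp_least_squares(X, y, noise_scale, clip_norm=1.0, reg=1e-2, rng=None):
    """Private least squares from features via noised sufficient statistics.

    X: (n, d) features, y: (n, k) one-hot labels. noise_scale is assumed
    to be calibrated to the target (eps, delta) by a DP accountant.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Clip each feature row so a single example has bounded influence
    # on both statistics (sensitivity control).
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xc = X * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    d, k = X.shape[1], y.shape[1]
    # Noised feature covariance A ~ X^T X and correlation b ~ X^T y.
    A = Xc.T @ Xc + rng.normal(0.0, noise_scale, size=(d, d))
    A = (A + A.T) / 2.0                      # keep the solve well-posed
    b = Xc.T @ y + rng.normal(0.0, noise_scale, size=(d, k))
    # Ridge term regularizes the noisy covariance before solving.
    return np.linalg.solve(A + reg * np.eye(d), b)   # (d, k) weights

# Hypothetical usage on pre-extracted features:
# W = dp_least_squares(features, one_hot_labels, noise_scale=50.0)
# preds = test_features @ W
```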
Related papers
- LMO-DP: Optimizing the Randomization Mechanism for Differentially Private Fine-Tuning (Large) Language Models [31.718398512438238]
We propose a novel Language Model-based Optimal Differential Privacy (LMO-DP) mechanism.
It takes the first step to enable the tight composition of accurately fine-tuning language models with a sub-optimal DP mechanism.
LMO-DP is also the first solution to accurately fine-tune Llama-2 with strong differential privacy guarantees.
arXiv Detail & Related papers (2024-05-29T05:32:50Z)
- Private Fine-tuning of Large Language Models with Zeroth-order Optimization [51.19403058739522]
Differentially private stochastic gradient descent (DP-SGD) allows models to be trained in a privacy-preserving manner.
We introduce DP-ZO, a private fine-tuning framework for large language models obtained by privatizing zeroth-order optimization methods.
arXiv Detail & Related papers (2024-01-09T03:53:59Z)
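The appeal of zeroth-order fine-tuning for DP is that the batch touches the update only through a scalar loss difference, which is cheap to clip and noise. A toy rendition of that idea follows; the hyperparameters and the exact privatization point are assumptions, not DP-ZO's implementation.

```python
import numpy as np

def dp_zo_step(params, loss_fn, lr=1e-3, mu=1e-2, clip=1.0,
               noise_scale=1.0, rng=None):
    """One zeroth-order update with the scalar loss difference privatized.

    loss_fn(params) -> mean loss over the current batch (user-supplied).
    """
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(params.shape)        # random search direction
    # Two-point estimate of the directional derivative along z.
    g = (loss_fn(params + mu * z) - loss_fn(params - mu * z)) / (2 * mu)
    g = np.clip(g, -clip, clip)                  # bound sensitivity
    g += rng.normal(0.0, noise_scale * clip)     # Gaussian mechanism
    return params - lr * g * z
```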
- Scaling Up Differentially Private LASSO Regularized Logistic Regression via Faster Frank-Wolfe Iterations [51.14495595270775]
We adapt the Frank-Wolfe algorithm for $L_1$ penalized linear regression to be aware of sparse inputs and to use them effectively.
Our results demonstrate that this procedure can reduce runtime by a factor of up to $2{,}200\times$, depending on the value of the privacy parameter $\epsilon$ and the sparsity of the dataset.
arXiv Detail & Related papers (2023-10-30T19:52:43Z)
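For context, the non-private Frank-Wolfe template over an $L_1$ ball is sketched below; each step updates a single coordinate, which is why sparse inputs pay off. The private variant replaces the argmax with a noisy selection step (omitted here), and the radius and step schedule are generic defaults, not the paper's settings.

```python
import numpy as np

def frank_wolfe_l1(X, y, radius=1.0, n_steps=100):
    """Frank-Wolfe for least squares constrained to the L1 ball."""
    n, d = X.shape
    w = np.zeros(d)
    for t in range(n_steps):
        grad = X.T @ (X @ w - y) / n        # least-squares gradient
        i = np.argmax(np.abs(grad))         # linear minimization oracle:
        s = np.zeros(d)                     # best vertex of the L1 ball
        s[i] = -radius * np.sign(grad[i])
        gamma = 2.0 / (t + 2.0)             # standard FW step size
        w = (1 - gamma) * w + gamma * s     # stay inside the ball
    return w
```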
- Faster Differentially Private Convex Optimization via Second-Order Methods [29.610397744953577]
Differentially private (stochastic) gradient descent is the workhorse of private machine learning in both the convex and non-convex worlds.
We design a practical second-order algorithm for the unconstrained logistic regression problem.
arXiv Detail & Related papers (2023-05-22T16:43:36Z)
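A generic sketch of one noisy Newton step for binary logistic regression, in the spirit of second-order DP methods like this one and the Newton variant in the main paper. The noise scales, damping, and the assumption that feature rows are pre-clipped are placeholders, not the cited algorithm.

```python
import numpy as np

def dp_newton_step(w, X, y, noise_scale=0.1, reg=1e-3, rng=None):
    """One noisy Newton step for binary logistic regression (y in {0, 1}).

    Rows of X are assumed pre-clipped to bounded norm so that the
    gradient and Hessian have bounded sensitivity.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    p = 1.0 / (1.0 + np.exp(-X @ w))              # predicted probabilities
    grad = X.T @ (p - y) / n + rng.normal(0.0, noise_scale, size=d)
    hess = (X.T * (p * (1 - p))) @ X / n          # logistic Hessian
    hess += rng.normal(0.0, noise_scale, size=(d, d))
    hess = (hess + hess.T) / 2 + reg * np.eye(d)  # symmetrize and damp
    return w - np.linalg.solve(hess, grad)
```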
- Differentially Private Deep Learning with ModelMix [14.445182641912014]
We propose a generic optimization framework, called ModelMix, which performs random aggregation of intermediate model states.
It strengthens the composite privacy analysis utilizing the entropy of the training trajectory.
We present a formal study on the effect of gradient clipping in Differentially Private Gradient Descent.
arXiv Detail & Related papers (2022-10-07T22:59:00Z)
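Taking "random aggregation of intermediate model states" at face value, the core step might be a random convex mixing of checkpoints, as in the guess below; the paper's actual operator and its trajectory-entropy privacy analysis are more involved.

```python
import numpy as np

def model_mix(state_a, state_b, rng=None):
    """Random convex combination of two intermediate model states
    (a guessed, simplest-possible form of the aggregation step)."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.uniform(0.0, 1.0)
    return lam * state_a + (1.0 - lam) * state_b
```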
- TAN Without a Burn: Scaling Laws of DP-SGD [70.7364032297978]
Differentially Private methods for training Deep Neural Networks (DNNs) have progressed recently.
We decouple privacy analysis and experimental behavior of noisy training to explore the trade-off with minimal computational requirements.
We apply the proposed method on CIFAR-10 and ImageNet and, in particular, strongly improve the state of the art on ImageNet with a +9 point gain in top-1 accuracy.
arXiv Detail & Related papers (2022-10-07T08:44:35Z)
- Normalized/Clipped SGD with Perturbation for Differentially Private Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy, while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z)
- Towards Alternative Techniques for Improving Adversarial Robustness: Analysis of Adversarial Training at a Spectrum of Perturbations [5.18694590238069]
Adversarial training (AT) and its variants have spearheaded progress in improving neural network robustness to adversarial perturbations.
We focus on models trained on a spectrum of $\epsilon$ values.
We identify alternative improvements to AT that otherwise wouldn't have been apparent at a single $\epsilon$.
arXiv Detail & Related papers (2022-06-13T22:01:21Z)
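Training at a spectrum of perturbation budgets simply means running standard adversarial training while sweeping the attack radius. A minimal FGSM-style inner step (inputs assumed scaled to $[0, 1]$; the paper's attacks may differ) looks like:

```python
import numpy as np

def fgsm_perturb(X, grad_x, epsilon):
    """FGSM step used inside adversarial training: move each input by
    epsilon in the sign of the loss gradient w.r.t. the input, then
    clip back to the valid pixel range."""
    return np.clip(X + epsilon * np.sign(grad_x), 0.0, 1.0)

# Sweeping the budget reproduces the 'spectrum' setup, e.g.:
# for epsilon in [1/255, 2/255, 4/255, 8/255]:
#     X_adv = fgsm_perturb(X, grad_x, epsilon)
```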
- Large Scale Transfer Learning for Differentially Private Image Classification [51.10365553035979]
Differential Privacy (DP) provides a formal framework for training machine learning models with individual example-level privacy.
Private training using DP-SGD protects against leakage by injecting noise into individual example gradients.
While this result is quite appealing, the computational cost of training large-scale models with DP-SGD is substantially higher than that of non-private training.
arXiv Detail & Related papers (2022-05-06T01:22:20Z)
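The DP-SGD mechanism referred to here is the standard one: clip each example's gradient, sum, and add Gaussian noise calibrated to the clipping norm. A reference sketch follows; array shapes and hyperparameters are assumptions.

```python
import numpy as np

def dp_sgd_step(W, per_example_grads, lr=0.1, clip_norm=1.0,
                noise_mult=1.0, rng=None):
    """One DP-SGD update from a (batch, *param_shape) array of
    per-example gradients."""
    rng = np.random.default_rng() if rng is None else rng
    n = per_example_grads.shape[0]
    # Per-example L2 norms, then clip each gradient to clip_norm.
    norms = np.linalg.norm(per_example_grads.reshape(n, -1), axis=1)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale.reshape(n, *([1] * W.ndim))
    # Gaussian noise on the clipped sum, scaled to the clipping norm.
    noisy_sum = clipped.sum(axis=0) + rng.normal(
        0.0, noise_mult * clip_norm, size=W.shape)
    return W - lr * noisy_sum / n
```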
- Output Perturbation for Differentially Private Convex Optimization with Improved Population Loss Bounds, Runtimes and Applications to Private Adversarial Training [12.386462516398469]
Finding efficient, easily implementable differentially private (DP) algorithms that offer strong excess risk bounds is an important problem in modern machine learning.
We provide the tightest known $(\epsilon, 0)$-DP population loss bounds and fastest runtimes in the presence of smoothness and strong convexity.
We apply our theory to two learning frameworks: tilted ERM and adversarial learning.
arXiv Detail & Related papers (2021-02-09T08:47:06Z)
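Output perturbation, as the name suggests, solves the (strongly convex) problem exactly and noises only the final solution. The sketch below uses ridge regression and Gaussian noise for simplicity, which yields $(\epsilon, \delta)$-DP; the paper's $(\epsilon, 0)$ guarantee uses differently distributed noise, and the sensitivity constant here is a placeholder.

```python
import numpy as np

def output_perturbation_ridge(X, y, reg=1.0, eps=1.0, rng=None):
    """Solve ridge regression exactly, then perturb the solution.

    The sensitivity formula assumes a Lipschitz loss and strong convexity
    from the regularizer; constants are illustrative, not the paper's.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    # Exact (non-private) regularized ERM solution.
    w = np.linalg.solve(X.T @ X / n + reg * np.eye(d), X.T @ y / n)
    # L2 sensitivity of regularized ERM scales like O(1 / (n * reg)).
    sensitivity = 2.0 / (n * reg)
    return w + rng.normal(0.0, sensitivity / eps, size=d)
```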