Normalized/Clipped SGD with Perturbation for Differentially Private
Non-Convex Optimization
- URL: http://arxiv.org/abs/2206.13033v1
- Date: Mon, 27 Jun 2022 03:45:02 GMT
- Title: Normalized/Clipped SGD with Perturbation for Differentially Private
Non-Convex Optimization
- Authors: Xiaodong Yang and Huishuai Zhang and Wei Chen and Tie-Yan Liu
- Abstract summary: DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
- Score: 94.06564567766475
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: By ensuring differential privacy in the learning algorithms, one can
rigorously mitigate the risk of large models memorizing sensitive training
data. In this paper, we study two algorithms for this purpose, i.e., DP-SGD and
DP-NSGD, which first clip or normalize \textit{per-sample} gradients to bound
the sensitivity and then add noise to obfuscate the exact information. We
analyze the convergence behavior of these two algorithms in the non-convex
optimization setting with two common assumptions and achieve a rate
$\mathcal{O}\left(\sqrt[4]{\frac{d\log(1/\delta)}{N^2\epsilon^2}}\right)$ of
the gradient norm for a $d$-dimensional model, $N$ samples and
$(\epsilon,\delta)$-DP, which improves over previous bounds under much weaker
assumptions. Specifically, we introduce a regularizing factor in DP-NSGD and
show that it is crucial in the convergence proof and subtly controls the bias
and noise trade-off. Our proof deliberately handles the per-sample gradient
clipping and normalization that are specified for the private setting.
Empirically, we demonstrate that these two algorithms achieve similar best
accuracy, while DP-NSGD is comparatively easier to tune than DP-SGD and hence
may help further save the privacy budget when accounting for the tuning effort.
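As a rough illustration (not the authors' implementation), the per-sample clipping of DP-SGD and the regularized normalization of DP-NSGD described above can be sketched in NumPy as follows; the function names, batch averaging, and noise calibration here are illustrative assumptions:

```python
import numpy as np

def dp_sgd_update(params, per_sample_grads, clip_norm, noise_mult, lr, rng):
    """One DP-SGD step: clip each per-sample gradient to L2 norm <= clip_norm,
    average, then add Gaussian noise scaled to the clipping sensitivity."""
    clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
               for g in per_sample_grads]
    n = len(per_sample_grads)
    noise = rng.normal(0.0, noise_mult * clip_norm / n, size=params.shape)
    return params - lr * (np.mean(clipped, axis=0) + noise)

def dp_nsgd_update(params, per_sample_grads, reg, noise_mult, lr, rng):
    """One DP-NSGD step: normalize each per-sample gradient by (||g|| + reg).
    Each normalized gradient has norm < 1, so noise is calibrated to
    sensitivity 1 regardless of the raw gradient magnitudes."""
    normalized = [g / (np.linalg.norm(g) + reg) for g in per_sample_grads]
    n = len(per_sample_grads)
    noise = rng.normal(0.0, noise_mult / n, size=params.shape)
    return params - lr * (np.mean(normalized, axis=0) + noise)
```

The regularizing factor `reg` is where the bias/noise trade-off appears: a larger `reg` keeps more of the relative gradient scale (less bias toward pure normalization) at the cost of shrinking the signal relative to the fixed noise level.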
Related papers
- Differentially Private Best-Arm Identification [14.916947598339988]
Best Arm Identification (BAI) problems are increasingly used in data-sensitive applications.
Motivated by the data privacy concerns invoked by these applications, we study the problem of BAI with fixed confidence in both the local and central models.
arXiv Detail & Related papers (2024-06-10T16:02:48Z)
- Improved Communication-Privacy Trade-offs in $L_2$ Mean Estimation under Streaming Differential Privacy [47.997934291881414]
Existing mean estimation schemes are usually optimized for $L_\infty$ geometry and rely on random rotation or Kashin's representation to adapt to $L_2$ geometry.
We introduce a novel privacy accounting method for the sparsified Gaussian mechanism that incorporates the randomness inherent in sparsification into the DP.
Unlike previous approaches, our accounting algorithm directly operates in $L_2$ geometry, yielding MSEs that converge quickly to those of the Gaussian mechanism.
arXiv Detail & Related papers (2024-05-02T03:48:47Z)
- Clipped SGD Algorithms for Privacy Preserving Performative Prediction: Bias Amplification and Remedies [28.699424769503764]
Clipped stochastic gradient descent (SGD) algorithms are among the most popular algorithms for privacy-preserving optimization.
This paper studies the convergence properties of these algorithms in a performative prediction setting.
arXiv Detail & Related papers (2024-04-17T02:17:05Z)
- Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure Differential Privacy (DP) comes at the cost of model performance degradation.
We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC.
We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on Rényi DP.
arXiv Detail & Related papers (2023-11-24T17:56:44Z)
- TAN Without a Burn: Scaling Laws of DP-SGD [70.7364032297978]
Differentially Private methods for training Deep Neural Networks (DNNs) have progressed recently.
We decouple privacy analysis and experimental behavior of noisy training to explore the trade-off with minimal computational requirements.
We apply the proposed method on CIFAR-10 and ImageNet and, in particular, strongly improve the state-of-the-art on ImageNet with a +9 points gain in top-1 accuracy.
arXiv Detail & Related papers (2022-10-07T08:44:35Z)
- On Private Online Convex Optimization: Optimal Algorithms in $\ell_p$-Geometry and High Dimensional Contextual Bandits [9.798304879986604]
We study the differentially private (DP) convex optimization problem with streaming data that is sampled from a distribution and arrives sequentially.
We also consider the continual release model where parameters related to private information are updated and released upon each new data, often known as the online algorithms.
Our algorithm achieves the optimal excess risk in linear time when $1 < p \le 2$, and state-of-the-art excess risk matching the non-private lower bounds when $2 < p \le \infty$.
arXiv Detail & Related papers (2022-06-16T12:09:47Z)
- Correcting Momentum with Second-order Information [50.992629498861724]
We develop a new algorithm for non-convex stochastic optimization that finds an $\epsilon$-critical point in the optimal number of stochastic gradient and Hessian-vector product computations.
We validate our results on a variety of large-scale deep learning benchmarks and architectures.
arXiv Detail & Related papers (2021-03-04T19:01:20Z)
- Private Stochastic Non-Convex Optimization: Adaptive Algorithms and Tighter Generalization Bounds [72.63031036770425]
We propose differentially private (DP) algorithms for stochastic non-convex optimization.
We demonstrate on two popular deep learning methods the empirical advantages over standard gradient methods.
arXiv Detail & Related papers (2020-06-24T06:01:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.