Normalized/Clipped SGD with Perturbation for Differentially Private
Non-Convex Optimization
- URL: http://arxiv.org/abs/2206.13033v1
- Date: Mon, 27 Jun 2022 03:45:02 GMT
- Title: Normalized/Clipped SGD with Perturbation for Differentially Private
Non-Convex Optimization
- Authors: Xiaodong Yang and Huishuai Zhang and Wei Chen and Tie-Yan Liu
- Abstract summary: DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
- Score: 94.06564567766475
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: By ensuring differential privacy in the learning algorithms, one can
rigorously mitigate the risk of large models memorizing sensitive training
data. In this paper, we study two algorithms for this purpose, i.e., DP-SGD and
DP-NSGD, which first clip or normalize \textit{per-sample} gradients to bound
the sensitivity and then add noise to obfuscate the exact information. We
analyze the convergence behavior of these two algorithms in the non-convex
optimization setting with two common assumptions and achieve a rate
$\mathcal{O}\left(\sqrt[4]{\frac{d\log(1/\delta)}{N^2\epsilon^2}}\right)$ of
the gradient norm for a $d$-dimensional model, $N$ samples and
$(\epsilon,\delta)$-DP, which improves over previous bounds under much weaker
assumptions. Specifically, we introduce a regularizing factor in DP-NSGD and
show that it is crucial in the convergence proof and subtly controls the bias
and noise trade-off. Our proof deliberately handles the per-sample gradient
clipping and normalization that are specified for the private setting.
Empirically, we demonstrate that these two algorithms achieve similar best
accuracy while DP-NSGD is comparatively easier to tune than DP-SGD and hence
may help further save the privacy budget when accounting for the tuning effort.
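To make the clip-or-normalize-then-perturb mechanism concrete, here is a minimal NumPy sketch of a single aggregation step for each of the two algorithms the abstract describes. The function names, signatures, and the fixed noise scales are illustrative assumptions; in practice the noise multiplier is calibrated by a privacy accountant to meet a target $(\epsilon,\delta)$, and the regularizing factor in DP-NSGD corresponds to the term the paper adds to the denominator to control the bias/noise trade-off.

```python
import numpy as np

def dp_sgd_step(per_sample_grads, clip_norm, noise_multiplier, rng):
    """One DP-SGD aggregation: clip each per-sample gradient to norm at
    most clip_norm (bounding sensitivity), average, add Gaussian noise."""
    clipped = []
    for g in per_sample_grads:
        norm = max(np.linalg.norm(g), 1e-12)  # guard against zero gradients
        clipped.append(g * min(1.0, clip_norm / norm))
    mean = np.mean(clipped, axis=0)
    # Noise std scales with the sensitivity clip_norm / batch_size.
    sigma = noise_multiplier * clip_norm / len(per_sample_grads)
    return mean + rng.normal(0.0, sigma, size=mean.shape)

def dp_nsgd_step(per_sample_grads, regularizer, noise_multiplier, rng):
    """One DP-NSGD aggregation: normalize each per-sample gradient with a
    regularizing factor in the denominator (so each term has norm < 1),
    average, then add Gaussian noise calibrated to that unit sensitivity."""
    normalized = [g / (np.linalg.norm(g) + regularizer)
                  for g in per_sample_grads]
    mean = np.mean(normalized, axis=0)
    sigma = noise_multiplier / len(per_sample_grads)
    return mean + rng.normal(0.0, sigma, size=mean.shape)
```

With the noise multiplier set to zero, the two steps reduce to plain clipped and normalized gradient averaging, which makes the bias introduced by each transformation easy to inspect in isolation.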
Related papers
- Better Rates for Private Linear Regression in the Proportional Regime via Aggressive Clipping [19.186034457189162]
A common approach is to set the clipping constant much larger than the expected norm of the per-sample gradients. While this simplifies the analysis, it is in sharp contrast with what empirical evidence suggests for optimizing performance. Our work bridges this gap between theory and practice by crucially operating in a regime where clipping happens frequently.
arXiv Detail & Related papers (2025-05-22T07:34:27Z) - Dyn-D$^2$P: Dynamic Differentially Private Decentralized Learning with Provable Utility Guarantee [36.82471440872803]
We propose a new Dynamic Differentially Private Decentralized learning approach (termed Dyn-D$^2$P). Dyn-D$^2$P dynamically adjusts gradient clipping bounds and noise levels based on gradient convergence. Experiments on benchmark datasets demonstrate the superiority of Dyn-D$^2$P over its counterparts employing fixed noise levels.
arXiv Detail & Related papers (2025-05-10T13:57:57Z) - Smoothed Normalization for Efficient Distributed Private Optimization [54.197255548244705]
Federated learning enables training machine learning models while preserving the privacy of participants.
There is no differentially private distributed method for smooth, non-convex optimization problems.
We introduce a new distributed algorithm $\alpha$-${\sf NormEC}$ with provable convergence guarantees.
arXiv Detail & Related papers (2025-02-19T07:10:32Z) - Noise is All You Need: Private Second-Order Convergence of Noisy SGD [15.31952197599396]
We show that the noise necessary for privacy already implies second-order convergence under the standard smoothness assumptions.
We get second-order convergence essentially for free: DP-SGD, the workhorse of modern private optimization, under minimal assumptions can be used to find a second-order stationary point.
arXiv Detail & Related papers (2024-10-09T13:43:17Z) - Differentially Private Best-Arm Identification [14.916947598339988]
Best Arm Identification (BAI) problems are increasingly used for data-sensitive applications.
Motivated by the data privacy concerns invoked by these applications, we study the problem of BAI with fixed confidence in both the local and central models.
arXiv Detail & Related papers (2024-06-10T16:02:48Z) - Improved Communication-Privacy Trade-offs in $L_2$ Mean Estimation under Streaming Differential Privacy [47.997934291881414]
Existing mean estimation schemes are usually optimized for $L_\infty$ geometry and rely on random rotation or Kashin's representation to adapt to $L_2$ geometry.
We introduce a novel privacy accounting method for the sparsified Gaussian mechanism that incorporates the randomness inherent in sparsification into the DP.
Unlike previous approaches, our accounting algorithm directly operates in $L_2$ geometry, yielding MSEs that converge quickly to those of the Gaussian mechanism.
arXiv Detail & Related papers (2024-05-02T03:48:47Z) - Clipped SGD Algorithms for Privacy Preserving Performative Prediction: Bias Amplification and Remedies [28.699424769503764]
Clipped stochastic gradient descent (SGD) algorithms are among the most popular algorithms for privacy-preserving optimization.
This paper studies the convergence properties of these algorithms in a performative prediction setting.
arXiv Detail & Related papers (2024-04-17T02:17:05Z) - Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Stochastic Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure Differential Privacy (DP) comes at the cost of model performance degradation.
We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC.
We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on Rényi DP.
arXiv Detail & Related papers (2023-11-24T17:56:44Z) - TAN Without a Burn: Scaling Laws of DP-SGD [70.7364032297978]
Differentially Private methods for training Deep Neural Networks (DNNs) have progressed recently.
We decouple privacy analysis and experimental behavior of noisy training to explore the trade-off with minimal computational requirements.
We apply the proposed method on CIFAR-10 and ImageNet and, in particular, strongly improve the state-of-the-art on ImageNet with a +9 points gain in top-1 accuracy.
arXiv Detail & Related papers (2022-10-07T08:44:35Z) - On Private Online Convex Optimization: Optimal Algorithms in
$\ell_p$-Geometry and High Dimensional Contextual Bandits [9.798304879986604]
We study the differentially private (DP) convex optimization problem with streaming data sampled from a distribution and arriving sequentially.
We also consider the continual release model where parameters related to private information are updated and released upon each new data, often known as the online algorithms.
Our algorithm achieves in linear time the optimal excess risk when $1 < p \leq 2$ and the state-of-the-art excess risk, matching the non-private lower bounds, when $2 < p \leq \infty$.
arXiv Detail & Related papers (2022-06-16T12:09:47Z) - Correcting Momentum with Second-order Information [50.992629498861724]
We develop a new algorithm for non-convex stochastic optimization that finds an $\epsilon$-critical point using an optimal number of stochastic gradient and Hessian-vector product computations.
We validate our results on a variety of large-scale deep learning benchmarks and architectures.
arXiv Detail & Related papers (2021-03-04T19:01:20Z) - Private Stochastic Non-Convex Optimization: Adaptive Algorithms and
Tighter Generalization Bounds [72.63031036770425]
We propose differentially private (DP) algorithms for stochastic non-convex optimization.
We demonstrate the empirical advantages of our methods over standard gradient methods on two popular deep learning tasks.
arXiv Detail & Related papers (2020-06-24T06:01:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.