Differentially Private Learning Needs Hidden State (Or Much Faster
Convergence)
- URL: http://arxiv.org/abs/2203.05363v1
- Date: Thu, 10 Mar 2022 13:31:08 GMT
- Title: Differentially Private Learning Needs Hidden State (Or Much Faster
Convergence)
- Authors: Jiayuan Ye, Reza Shokri
- Abstract summary: We show that differentially private learning, with a tight bound, needs hidden-state privacy analysis or fast convergence.
- Score: 9.429448411561541
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differential privacy analysis of randomized learning algorithms typically
relies on composition theorems, where the implicit assumption is that the
internal state of the iterative algorithm is revealed to the adversary.
However, by assuming hidden states for DP algorithms (when only the
last-iterate is observable), recent works prove a converging privacy bound for
noisy gradient descent (on strongly convex smooth loss function) that is
significantly smaller than composition bounds after $O(1/\text{step-size})$
epochs. In this paper, we extend this hidden-state analysis to the noisy
mini-batch stochastic gradient descent algorithms on strongly-convex smooth
loss functions. We prove converging R\'enyi DP bounds under various mini-batch
sampling schemes, such as "shuffle and partition" (which is used in practical
implementations of DP-SGD) and "sampling without replacement". We prove that,
in these settings, our privacy bound is much smaller than the composition bound
for training with a large number of iterations (which is the case for learning
from high-dimensional data). Our converging privacy analysis thus shows that
differentially private learning, with a tight bound, needs hidden-state privacy
analysis or fast convergence. To complement our theoretical results, we run
experiments training classification models on the MNIST, FMNIST and CIFAR-10
datasets, and observe better accuracy under fixed privacy budgets with the
hidden-state analysis.
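The sampling schemes named in the abstract are straightforward to make concrete. Below is a minimal sketch of noisy mini-batch SGD under the "shuffle and partition" scheme, releasing only the last iterate as in the hidden-state setting; the squared loss, clipping bound, and noise scale are illustrative assumptions, not the paper's experimental configuration.

```python
import numpy as np

def noisy_sgd_shuffle_partition(X, y, epochs=5, batch_size=64,
                                lr=0.1, clip=1.0, sigma=1.0, seed=0):
    """Noisy mini-batch SGD with 'shuffle and partition' sampling.

    Each epoch shuffles the data once, then walks through disjoint
    batches. Intermediate iterates stay hidden; only the final model
    is returned, which is the setting of the hidden-state analysis.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        perm = rng.permutation(n)                  # shuffle ...
        for start in range(0, n, batch_size):      # ... and partition
            idx = perm[start:start + batch_size]
            # Per-example gradients of a squared loss (illustrative choice).
            grads = (X[idx] @ w - y[idx])[:, None] * X[idx]
            # Clip each per-example gradient to L2 norm at most `clip`.
            norms = np.linalg.norm(grads, axis=1, keepdims=True)
            grads = grads / np.maximum(1.0, norms / clip)
            # Gaussian noise calibrated to the clipping bound and batch size.
            noise = rng.normal(0.0, sigma * clip / len(idx), size=d)
            w -= lr * (grads.mean(axis=0) + noise)
    return w  # only the last iterate is observable
```

Under composition, the privacy cost of this loop grows with the number of epochs; the hidden-state analysis instead bounds what the final model alone reveals.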
Related papers
- Privacy of the last iterate in cyclically-sampled DP-SGD on nonconvex composite losses [2.532202013576547]
Differentially private stochastic gradient descent (DP-SGD) is a family of iterative machine learning algorithms that generate a sequence of differentially private (DP) model parameters.
Last-iterate privacy accounting is challenging, and existing works require strong assumptions not satisfied by most implementations.
We provide new R\'enyi differential privacy (RDP) upper bounds for the last iterate under realistic assumptions of small step size and Lipschitz smoothness of the loss function.
arXiv Detail & Related papers (2024-07-07T02:35:55Z) - Approximating Two-Layer ReLU Networks for Hidden State Analysis in Differential Privacy [3.8254443661593633]
We show that it is possible to privately train convex problems with privacy-utility trade-offs comparable to those of one-hidden-layer ReLU networks trained with DP-SGD.
Our experiments on benchmark classification tasks show that NoisyCGD can achieve privacy-utility trade-offs comparable to DP-SGD applied to one-hidden-layer ReLU networks.
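For reference, here is a minimal sketch of noisy cyclic gradient descent, a common reading of NoisyCGD in the hidden-state literature: the data is partitioned into fixed batches once, and every epoch cycles through them in the same order. The strongly convex regularized squared loss and all constants are illustrative assumptions.

```python
import numpy as np

def noisy_cgd(X, y, epochs=10, num_batches=8, lr=0.05,
              clip=1.0, sigma=1.0, reg=0.01, seed=0):
    """Noisy cyclic gradient descent: fixed batches, fixed visiting order.

    Unlike fresh shuffling every epoch, the partition is drawn once and
    then reused, the deterministic-cycling setting often paired with
    hidden-state privacy analyses.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    batches = np.array_split(rng.permutation(n), num_batches)  # one-time split
    w = np.zeros(d)
    for _ in range(epochs):
        for idx in batches:                   # same cyclic order each epoch
            grads = (X[idx] @ w - y[idx])[:, None] * X[idx]
            norms = np.linalg.norm(grads, axis=1, keepdims=True)
            grads = grads / np.maximum(1.0, norms / clip)
            noise = rng.normal(0.0, sigma * clip / len(idx), size=d)
            # The L2 term keeps the objective strongly convex.
            w -= lr * (grads.mean(axis=0) + reg * w + noise)
    return w
```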
arXiv Detail & Related papers (2024-07-05T22:43:32Z) - Privacy Loss of Noisy Stochastic Gradient Descent Might Converge Even
for Non-Convex Losses [4.68299658663016]
The Noisy-SGD algorithm is widely used for privately training machine learning models.
Recent findings have shown that if the internal state remains hidden, then the privacy loss might remain bounded.
We extend this analysis to DP-SGD, a popular variant of Noisy-SGD that incorporates gradient clipping to limit the impact of individual samples on the training process.
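The clipping step mentioned here is small but easy to get wrong: each example's gradient is rescaled individually before averaging, so no single sample can shift the update by more than the clipping bound (before noise). A hedged sketch of just that operation; the threshold and shapes are illustrative:

```python
import numpy as np

def clip_per_example(per_example_grads, clip_norm):
    """Rescale each row so its L2 norm is at most clip_norm.

    This is the clipping used in DP-SGD: it bounds every example's
    influence (its sensitivity) before Gaussian noise is added.
    """
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    factors = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    return per_example_grads * factors

# Example: three per-example gradients, clipped to norm <= 1.0.
g = np.array([[3.0, 4.0],    # norm 5.0 -> rescaled to norm 1.0
              [0.3, 0.4],    # norm 0.5 -> unchanged
              [0.0, 2.0]])   # norm 2.0 -> rescaled to norm 1.0
print(clip_per_example(g, clip_norm=1.0))
```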
arXiv Detail & Related papers (2023-05-17T02:25:56Z) - Normalized/Clipped SGD with Perturbation for Differentially Private
Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
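The difference between the two algorithms compared here fits in a few lines: DP-SGD clips per-example gradients to a threshold, while DP-NSGD normalizes them, which removes the clipping threshold as a tuning knob. A sketch under those assumptions (the regularizer r is illustrative):

```python
import numpy as np

def dp_sgd_process(grads, clip_c):
    """DP-SGD: clip each per-example gradient to L2 norm <= clip_c."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    return grads * np.minimum(1.0, clip_c / np.maximum(norms, 1e-12))

def dp_nsgd_process(grads, r=0.01):
    """DP-NSGD: normalize each per-example gradient by (||g|| + r).

    The small regularizer r avoids blow-up near zero gradients; there
    is no clipping threshold to tune, which is one reason normalization
    can be easier to configure than clipping.
    """
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    return grads / (norms + r)
```

Either way the processed per-example gradients have bounded norm, so Gaussian noise of a fixed scale yields a per-step DP guarantee.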
arXiv Detail & Related papers (2022-06-27T03:45:02Z) - Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
This implies groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees.
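One way to make per-example guarantees concrete: under the Gaussian mechanism with noise standard deviation sigma * C, an example whose clipped gradient norm is s_i <= C incurs R\'enyi cost roughly alpha * s_i^2 / (2 sigma^2 C^2) per step at order alpha, instead of the worst case alpha / (2 sigma^2). A hedged sketch of that bookkeeping; this simplified accountant ignores subsampling amplification:

```python
import numpy as np

def individual_rdp_step(grad_norms, clip_c, sigma, alpha=2.0):
    """Per-example Renyi DP cost of one noisy gradient step.

    Gaussian mechanism with noise std sigma * clip_c: an example whose
    clipped gradient has norm s_i pays alpha * s_i**2 / (2 * (sigma*clip_c)**2)
    at order alpha. Subsampling amplification is ignored for simplicity.
    """
    s = np.minimum(grad_norms, clip_c)      # effective per-example sensitivity
    return alpha * s**2 / (2.0 * (sigma * clip_c)**2)

# Accumulate over training: examples with small gradients lose less privacy.
norms_per_step = [np.array([0.1, 1.0, 5.0])] * 100   # 3 examples, 100 steps
budget = sum(individual_rdp_step(n, clip_c=1.0, sigma=2.0)
             for n in norms_per_step)
print(budget)   # cumulative per-example RDP at order alpha = 2
```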
arXiv Detail & Related papers (2022-06-06T13:49:37Z) - Differentially Private Federated Learning via Inexact ADMM with Multiple
Local Updates [0.0]
We develop a DP inexact alternating direction method of multipliers algorithm with multiple local updates for federated learning.
We show that our algorithm provides $\bar{\epsilon}$-DP for every iteration, where $\bar{\epsilon}$ is a privacy budget controlled by the user.
We demonstrate that our algorithm reduces the testing error by at most $31\%$ compared with the existing DP algorithm, while achieving the same level of data privacy.
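As a structural sketch only (not the paper's exact algorithm), consensus ADMM with inexact local solves and Gaussian perturbation looks roughly as follows; the noise scale sigma would have to be calibrated to the sensitivity of the local solution to obtain the stated $\bar{\epsilon}$-DP guarantee, and all constants are placeholders.

```python
import numpy as np

def dp_inexact_admm(local_grads, d, rounds=20, local_steps=5,
                    rho=1.0, lr=0.1, sigma=0.1, seed=0):
    """DP inexact consensus ADMM with multiple local updates (sketch).

    local_grads: list of callables, one per client, returning the
    gradient of that client's local loss at a given point. Each client
    approximately minimizes its augmented Lagrangian with a few gradient
    steps ('inexact', 'multiple local updates'), perturbs the result
    before sharing, and the server averages to update the global z.
    """
    rng = np.random.default_rng(seed)
    m = len(local_grads)
    z = np.zeros(d)
    x = [np.zeros(d) for _ in range(m)]
    u = [np.zeros(d) for _ in range(m)]   # scaled dual variables
    for _ in range(rounds):
        for i in range(m):
            for _ in range(local_steps):  # inexact local solve
                g = local_grads[i](x[i]) + rho * (x[i] - z + u[i])
                x[i] = x[i] - lr * g
            x[i] = x[i] + rng.normal(0.0, sigma, size=d)  # DP perturbation
        z = np.mean([x[i] + u[i] for i in range(m)], axis=0)
        for i in range(m):
            u[i] = u[i] + x[i] - z
    return z
```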
arXiv Detail & Related papers (2022-02-18T19:58:47Z) - Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and
Beyond [63.59034509960994]
We study shuffling-based variants: minibatch and local Random Reshuffling, which draw gradients without replacement.
For smooth functions satisfying the Polyak-Lojasiewicz condition, we obtain convergence bounds which show that these shuffling-based variants converge faster than their with-replacement counterparts.
We propose an algorithmic modification called synchronized shuffling that leads to convergence rates faster than our lower bounds in near-homogeneous settings.
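The with-replacement versus without-replacement distinction at issue here is purely about how batches are drawn; a sketch of both samplers, with no claims about the paper's rates:

```python
import numpy as np

def with_replacement_batches(n, batch_size, steps, rng):
    """i.i.d. sampling: each batch drawn independently, duplicates allowed."""
    for _ in range(steps):
        yield rng.integers(0, n, size=batch_size)

def random_reshuffling_batches(n, batch_size, epochs, rng):
    """Random Reshuffling: a fresh permutation per epoch, then disjoint
    batches, so every example is visited exactly once per epoch."""
    for _ in range(epochs):
        perm = rng.permutation(n)
        for start in range(0, n, batch_size):
            yield perm[start:start + batch_size]
```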
arXiv Detail & Related papers (2021-10-20T02:25:25Z) - Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
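A common way to realize user-level (rather than example-level) guarantees is to bound each user's entire contribution: average within each user, clip that user-level vector, then add noise calibrated to the user clipping bound. A hedged sketch of that aggregation step; the query-answering reduction in the paper is more involved:

```python
import numpy as np

def user_level_private_mean(user_grads, clip_c, sigma, rng):
    """Aggregate gradients with user-level DP (sketch).

    user_grads: list of (n_i, d) arrays, one per user. Each user's mean
    gradient is clipped as a whole, so the user's entire contribution,
    not just one sample, has bounded influence on the output.
    """
    d = user_grads[0].shape[1]
    clipped = []
    for g in user_grads:
        mean_g = g.mean(axis=0)              # the user's whole contribution
        norm = np.linalg.norm(mean_g)
        clipped.append(mean_g * min(1.0, clip_c / max(norm, 1e-12)))
    noise = rng.normal(0.0, sigma * clip_c / len(user_grads), size=d)
    return np.mean(clipped, axis=0) + noise
```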
arXiv Detail & Related papers (2021-02-23T18:25:13Z) - Hiding Among the Clones: A Simple and Nearly Optimal Analysis of Privacy
Amplification by Shuffling [49.43288037509783]
We show that random shuffling amplifies differential privacy guarantees of locally randomized data.
Our result is based on a new approach that is simpler than previous work and extends to approximate differential privacy with nearly the same guarantees.
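The mechanism being analyzed is easy to state: each user applies a local randomizer, and the analyzer sees only the multiset of reports after a uniform shuffle. A sketch with k-ary randomized response as the (illustrative) local randomizer:

```python
import numpy as np

def k_randomized_response(value, k, eps0, rng):
    """k-ary randomized response, an eps0-locally-DP randomizer:
    report the true value w.p. e^eps0 / (e^eps0 + k - 1), otherwise
    one of the other k - 1 values uniformly at random."""
    p_true = np.exp(eps0) / (np.exp(eps0) + k - 1)
    if rng.random() < p_true:
        return value
    others = [v for v in range(k) if v != value]
    return others[rng.integers(0, k - 1)]

def shuffle_and_report(values, k, eps0, seed=0):
    """Shuffle model: randomize locally, then discard the order.

    The analyzer sees only the shuffled reports; the shuffle amplifies
    the local eps0 guarantee into a much smaller central DP parameter.
    """
    rng = np.random.default_rng(seed)
    reports = [k_randomized_response(v, k, eps0, rng) for v in values]
    rng.shuffle(reports)          # order no longer links reports to users
    return reports
```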
arXiv Detail & Related papers (2020-12-23T17:07:26Z) - SGD with shuffling: optimal rates without component convexity and large
epoch requirements [60.65928290219793]
We consider the RandomShuffle (shuffle at the beginning of each epoch) and SingleShuffle (shuffle only once) schemes.
We establish minimax optimal convergence rates of these algorithms up to poly-log factor gaps.
We further sharpen the tight convergence results for RandomShuffle by removing the drawbacks common to all prior work.
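The two schemes differ only in when the permutation is drawn; a minimal illustration of the visiting orders, with the SGD loop itself omitted:

```python
import numpy as np

def epoch_orders(n, epochs, scheme, seed=0):
    """Yield the index order for each epoch.

    RandomShuffle: draw a fresh permutation every epoch.
    SingleShuffle: draw one permutation up front and reuse it.
    """
    rng = np.random.default_rng(seed)
    fixed = rng.permutation(n)        # used only by SingleShuffle
    for _ in range(epochs):
        if scheme == "RandomShuffle":
            yield rng.permutation(n)
        elif scheme == "SingleShuffle":
            yield fixed
        else:
            raise ValueError(scheme)
```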
arXiv Detail & Related papers (2020-06-12T05:00:44Z)