Differentially Private SGD with Non-Smooth Loss
- URL: http://arxiv.org/abs/2101.08925v1
- Date: Fri, 22 Jan 2021 03:19:06 GMT
- Title: Differentially Private SGD with Non-Smooth Loss
- Authors: Puyu Wang, Yunwen Lei, Yiming Ying, Hai Zhang
- Abstract summary: The loss function is relaxed to have an $\alpha$-H\"{o}lder continuous gradient.
We prove that noisy SGD with $\alpha$-H\"{o}lder smooth losses using gradient perturbation can guarantee $(\epsilon,\delta)$-differential privacy.
- Score: 26.212935426509908
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we are concerned with differentially private SGD algorithms in
the setting of stochastic convex optimization (SCO). Most existing work
requires the loss to be Lipschitz continuous and strongly smooth, and the model
parameter to be uniformly bounded. However, these assumptions are restrictive
as many popular losses violate these conditions, including the hinge loss for
SVM, the absolute loss in robust regression, and even the least squares loss in
an unbounded domain. We significantly relax these restrictive assumptions and
establish privacy and generalization (utility) guarantees for private SGD
algorithms using output and gradient perturbations associated with non-smooth
convex losses. Specifically, the loss function is relaxed to have
$\alpha$-H\"{o}lder continuous gradient (referred to as $\alpha$-H\"{o}lder
smoothness) which instantiates the Lipschitz continuity ($\alpha=0$) and strong
smoothness ($\alpha=1$). We prove that noisy SGD with $\alpha$-H\"older smooth
losses using gradient perturbation can guarantee
$(\epsilon,\delta)$-differential privacy (DP) and attain optimal excess
population risk
$O\Big(\frac{\sqrt{d\log(1/\delta)}}{n\epsilon}+\frac{1}{\sqrt{n}}\Big)$, up to
logarithmic terms, with gradient complexity (i.e. the total number of
iterations) $T = O\big(n^{\frac{2-\alpha}{1+\alpha}} + n\big)$. This shows an important
trade-off between $\alpha$-H\"older smoothness of the loss and the
computational complexity $T$ for private SGD with statistically optimal
performance. In particular, our results indicate that $\alpha$-H\"older
smoothness with $\alpha\ge {1/2}$ is sufficient to guarantee
$(\epsilon,\delta)$-DP of noisy SGD algorithms while achieving optimal excess
risk with linear gradient complexity $T = O(n).$
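For concreteness, the $\alpha$-H\"{o}lder smoothness condition described in the abstract is commonly written as follows (a standard formulation sketched here for orientation; the constant $L$ and the norm are assumptions of this note rather than notation quoted from the paper): for all model parameters $w, w'$ and any example $z$,
$$\|\nabla f(w;z) - \nabla f(w';z)\| \le L\,\|w - w'\|^{\alpha}, \qquad \alpha \in [0,1],$$
so that $\alpha = 0$ recovers the Lipschitz continuity of the loss and $\alpha = 1$ recovers strong smoothness (a Lipschitz continuous gradient), as stated above.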
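The gradient-perturbation mechanism analyzed in the abstract can be illustrated with a minimal sketch of noisy SGD on the (non-smooth) hinge loss. Everything below is an assumption-laden illustration, not the authors' algorithm: the clipping threshold, noise multiplier `sigma`, step size, and helper names are hypothetical, and no privacy accounting is performed, so `sigma` would in practice have to be calibrated to a target $(\epsilon,\delta)$ given the number of iterations $T$.

```python
import numpy as np

def hinge_subgradient(w, x, y):
    """Subgradient of the non-smooth hinge loss max(0, 1 - y * <w, x>) at w."""
    if y * np.dot(x, w) < 1.0:
        return -y * x
    return np.zeros_like(w)

def noisy_sgd(X, y, epochs=3, lr=0.05, clip=1.0, sigma=4.0, seed=0):
    """Gradient-perturbed SGD sketch: each per-example subgradient is clipped
    to L2 norm `clip`, then Gaussian noise with standard deviation
    `sigma * clip` is added before the update. The noise multiplier `sigma`
    is a placeholder; it must be calibrated separately for a target
    (epsilon, delta) -- no privacy accounting is done here."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            g = hinge_subgradient(w, X[i], y[i])
            norm = np.linalg.norm(g)
            if norm > clip:
                g = g * (clip / norm)  # bound the per-example sensitivity
            g = g + rng.normal(0.0, sigma * clip, size=d)  # Gaussian perturbation
            w = w - lr * g
    return w

# Toy usage: a linearly separable problem with labels in {-1, +1}.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = np.sign(X @ np.array([1.0, -2.0, 0.5, 0.0, 1.5]))
w_private = noisy_sgd(X, y)
```

The clipping step stands in for the bounded-(sub)gradient property that Lipschitz continuity ($\alpha = 0$) provides; it is what keeps the per-example sensitivity, and hence the Gaussian noise scale, finite.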
Related papers
- Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic
Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound of the iteration cost function or an upper bound of the expected length for the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z) - On Convergence of Incremental Gradient for Non-Convex Smooth Functions [63.51187646914962]
In machine learning and network optimization, algorithms like shuffle SGD are popular because they minimize the number of cache misses and have good cache locality.
This paper delves into the convergence properties of SGD algorithms with arbitrary data ordering.
arXiv Detail & Related papers (2023-05-30T17:47:27Z) - ReSQueing Parallel and Private Stochastic Convex Optimization [59.53297063174519]
We introduce a new tool for stochastic convex optimization (SCO): a Reweighted Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density.
We develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings.
arXiv Detail & Related papers (2023-01-01T18:51:29Z) - Sharper Utility Bounds for Differentially Private Models [20.246768861331276]
First $\mathcal{O}\big(\frac{\sqrt{p}}{n\epsilon}\big)$ high-probability excess population risk bound for differentially private algorithms.
The newly proposed algorithm m-NGP improves the performance of the differentially private model over real datasets.
arXiv Detail & Related papers (2022-04-22T07:03:13Z) - Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$
Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private optimizer is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + \big(\log(d)/(\varepsilon n)\big)^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z) - Non-Euclidean Differentially Private Stochastic Convex Optimization [15.302167005107135]
We show that noisy stochastic gradient descent (SGD) algorithms attain the optimal excess risk in low-dimensional regimes.
Our work draws upon concepts from the geometry of normed spaces, such as the notions of regularity, uniform convexity, and uniform smoothness.
arXiv Detail & Related papers (2021-03-01T19:48:44Z) - Gradient-Based Empirical Risk Minimization using Local Polynomial
Regression [39.29885444997579]
A major goal of this literature has been to compare different algorithms, such as gradient descent (GD) or stochastic gradient descent (SGD).
We demonstrate that when the loss function is smooth in the data, we can learn the oracle at every iteration and beat the oracle complexities of both GD and SGD.
arXiv Detail & Related papers (2020-11-04T20:10:31Z) - Hybrid Stochastic-Deterministic Minibatch Proximal Gradient:
Less-Than-Single-Pass Optimization with Nearly Optimal Generalization [83.80460802169999]
We show that HSDMPG can attain an $\mathcal{O}\big(1/\sqrt{n}\big)$ error, which is at the order of the intrinsic excess error of a learning model.
For generic loss functions, we prove that HSDMPG can attain the same $\mathcal{O}\big(1/\sqrt{n}\big)$ order of excess error.
arXiv Detail & Related papers (2020-09-18T02:18:44Z) - Fast Dimension Independent Private AdaGrad on Publicly Estimated
Subspaces [24.52697154861819]
We show that noisy AdaGrad achieves a regret comparable to traditional AdaGrad plus a well-controlled term due to noise.
We operate with general convex functions in both constrained and unconstrained minimization.
arXiv Detail & Related papers (2020-08-14T20:46:38Z) - Stability of Stochastic Gradient Descent on Nonsmooth Convex Losses [52.039438701530905]
We provide sharp upper and lower bounds for several forms of stochastic gradient descent (SGD) on arbitrary Lipschitz nonsmooth convex losses.
Our bounds allow us to derive a new algorithm for differentially private nonsmooth convex optimization with optimal excess population risk.
arXiv Detail & Related papers (2020-06-12T02:45:21Z)