Improved Rates for Differentially Private Stochastic Convex Optimization
with Heavy-Tailed Data
- URL: http://arxiv.org/abs/2106.01336v2
- Date: Thu, 3 Jun 2021 04:40:12 GMT
- Title: Improved Rates for Differentially Private Stochastic Convex Optimization
with Heavy-Tailed Data
- Authors: Gautam Kamath, Xingtu Liu, Huanyu Zhang
- Abstract summary: We study convex optimization with heavy-tailed data under the constraint of differential privacy.
We prove nearly-matching lower bounds under the constraint of pure differential privacy, giving strong evidence that our bounds are tight.
- Score: 13.465471169118974
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study stochastic convex optimization with heavy-tailed data under the
constraint of differential privacy. Most prior work on this problem is
restricted to the case where the loss function is Lipschitz. Instead, as
introduced by Wang, Xiao, Devadas, and Xu, we study general convex loss
functions with the assumption that the distribution of gradients has bounded
$k$-th moments. We provide improved upper bounds on the excess population risk
under approximate differential privacy of
$\tilde{O}\left(\sqrt{\frac{d}{n}}+\left(\frac{d}{\epsilon
n}\right)^{\frac{k-1}{k}}\right)$ and
$\tilde{O}\left(\frac{d}{n}+\left(\frac{d}{\epsilon
n}\right)^{\frac{2k-2}{k}}\right)$ for convex and strongly convex loss
functions, respectively. We also prove nearly-matching lower bounds under the
constraint of pure differential privacy, giving strong evidence that our bounds
are tight.
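For a quick sense of how these rates behave, the short Python sketch below (not from the paper; constants and logarithmic factors are ignored, and the concrete values of $n$, $d$, $\epsilon$ are arbitrary illustrative choices) evaluates the two stated upper bounds for a few moment orders $k$:

```python
import math

def convex_bound(n, d, eps, k):
    """Stated convex rate, up to log factors: sqrt(d/n) + (d/(eps*n))^((k-1)/k)."""
    return math.sqrt(d / n) + (d / (eps * n)) ** ((k - 1) / k)

def strongly_convex_bound(n, d, eps, k):
    """Stated strongly convex rate, up to log factors: d/n + (d/(eps*n))^((2k-2)/k)."""
    return d / n + (d / (eps * n)) ** ((2 * k - 2) / k)

# Arbitrary illustrative values; only the qualitative trend in k matters.
n, d, eps = 100_000, 50, 1.0
for k in (2, 4, 10):
    print(f"k={k:2d}  convex ~ {convex_bound(n, d, eps, k):.4f}  "
          f"strongly convex ~ {strongly_convex_bound(n, d, eps, k):.6f}")
```

Since $d/(\epsilon n) < 1$ in this regime, the privacy-dependent term shrinks as more gradient moments are assumed bounded (larger $k$).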
Related papers
- Private Stochastic Convex Optimization with Heavy Tails: Near-Optimality from Simple Reductions [19.008521454738425]
We study the problem of differentially private stochastic convex optimization (DP-SCO) with heavy-tailed gradients, where we assume a $k^{\text{th}}$-moment bound on the Lipschitz constants of sample functions rather than a uniform bound.
We propose a new reduction-based approach that enables us to obtain the first optimal rates (up to logarithmic factors) in the heavy-tailed setting, achieving error $G \cdot \frac{1}{\sqrt{n}} + G_k \cdot \left(\frac{\sqrt{d}}{n\epsilon}\right)^{1-\frac{1}{k}}$.
arXiv Detail & Related papers (2024-06-04T21:26:29Z) - High-probability Convergence Bounds for Nonlinear Stochastic Gradient Descent Under Heavy-tailed Noise [59.25598762373543]
We establish high-probability convergence guarantees for learning on streaming data in the presence of heavy-tailed noise.
We demonstrate analytically and empirically how these results inform the preferred choice of setting for a given problem.
arXiv Detail & Related papers (2023-10-28T18:53:41Z) - Better and Simpler Lower Bounds for Differentially Private Statistical
Estimation [7.693388437377614]
We prove that for any $\alpha \le O(1)$, estimating the covariance of a Gaussian up to spectral error $\alpha$ requires $\tilde{\Omega}\left(\frac{d^{3/2}}{\alpha \varepsilon} + \frac{d}{\alpha^2}\right)$ samples.
Next, we prove that estimating the mean of a heavy-tailed distribution with bounded $k$th moments requires $\tilde{\Omega}\left(\frac{d}{\alpha^{k/(k-1)} \varepsilon} + \cdots\right)$ samples.
arXiv Detail & Related papers (2023-10-10T04:02:43Z) - DIFF2: Differential Private Optimization via Gradient Differences for
Nonconvex Distributed Learning [58.79085525115987]
In the previous work, the best known utility bound is $\widetilde{O}(d^{2/3}/(n\varepsilon_{\mathrm{DP}})^{4/3})$.
We propose a new differentially private framework called DIFF2 (DIFFerential private via DIFFs) that constructs a differentially private gradient estimator from gradient differences.
DIFF2 with a gradient descent subroutine achieves a utility of $\widetilde{O}(d^{2/3}/(n\varepsilon_{\mathrm{DP}})^{4/3})$.
arXiv Detail & Related papers (2023-02-08T05:19:01Z) - Private Stochastic Optimization With Large Worst-Case Lipschitz Parameter [14.040676498310198]
We study differentially private (DP) stochastic optimization (SO) with loss functions whose worst-case Lipschitz parameter over all data may be infinite.
For smooth loss functions, we provide linear-time algorithms with state-of-the-art excess risk.
We also provide the first algorithm to handle non-smooth convex loss functions.
arXiv Detail & Related papers (2022-09-15T16:03:23Z) - Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
We consider the smooth convex-concave bilinearly-coupled saddle-point problem, $\min_{\mathbf{x}}\max_{\mathbf{y}} F(\mathbf{x}) + H(\mathbf{x},\mathbf{y}) - G(\mathbf{y})$, where one has access to first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$.
We present an accelerated gradient-extragradient (AG-EG) descent-ascent algorithm that combines extragradient steps with Nesterov acceleration.
arXiv Detail & Related papers (2022-06-17T06:10:20Z) - High Probability Bounds for a Class of Nonconvex Algorithms with AdaGrad
Stepsize [55.0090961425708]
We propose a new, simplified high-probability analysis of AdaGrad for smooth, non-convex problems.
We present our analysis in a modular way and obtain a complementary $\mathcal{O}(1/T)$ convergence rate in the deterministic setting.
To the best of our knowledge, this is the first high-probability result for AdaGrad with a truly adaptive scheme, i.e., one completely oblivious to the knowledge of smoothness.
arXiv Detail & Related papers (2022-04-06T13:50:33Z) - High-probability Bounds for Non-Convex Stochastic Optimization with
Heavy Tails [55.561406656549686]
We consider non-convex stochastic optimization using first-order algorithms for which the gradient estimates may have heavy tails.
We show that a combination of gradient clipping, momentum, and normalized gradient descent yields convergence to critical points in high probability with the best-known iteration complexity for smooth losses.
arXiv Detail & Related papers (2021-06-28T00:17:01Z) - Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$
Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private algorithm is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z) - Differentially Private SGD with Non-Smooth Loss [26.212935426509908]
The loss function is relaxed to have an $\alpha$-Hölder continuous gradient.
We prove that noisy SGD with $\alpha$-Hölder smooth losses using gradient perturbation can guarantee $(\epsilon,\delta)$-differential privacy (a generic sketch of such a gradient-perturbation update is given after this list).
arXiv Detail & Related papers (2021-01-22T03:19:06Z) - On Differentially Private Stochastic Convex Optimization with
Heavy-tailed Data [18.351352536791453]
We consider the problem of designing Differentially Private (DP) algorithms for Stochastic Convex Optimization (SCO) on heavy-tailed data.
The irregularity of such data violates some key assumptions used in almost all existing DP-SCO and DP-ERM methods.
We show that our algorithms can effectively deal with the challenges caused by data irregularity.
arXiv Detail & Related papers (2020-10-21T15:45:27Z)
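As referenced in the "Differentially Private SGD with Non-Smooth Loss" entry above, gradient perturbation is the mechanism behind noisy SGD. The Python sketch below is a generic, minimal illustration rather than any specific paper's algorithm: the clipping threshold `C`, noise multiplier `sigma`, and learning rate are placeholder hyperparameters, and the calibration of `sigma` to a target $(\epsilon,\delta)$ is deliberately omitted.

```python
import numpy as np

def noisy_sgd_step(w, per_sample_grads, lr=0.1, C=1.0, sigma=1.0, rng=None):
    """One generic gradient-perturbation update: clip each per-sample gradient
    to norm at most C, average, add Gaussian noise scaled by the clipped
    sensitivity C / batch_size, and take a gradient step."""
    rng = np.random.default_rng() if rng is None else rng
    clipped = [g * min(1.0, C / (np.linalg.norm(g) + 1e-12)) for g in per_sample_grads]
    mean_grad = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, sigma * C / len(per_sample_grads), size=w.shape)
    return w - lr * (mean_grad + noise)

# Example usage with toy (possibly heavy-tailed) gradients.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = np.zeros(5)
    grads = [rng.standard_normal(5) * 10 for _ in range(32)]
    w = noisy_sgd_step(w, grads, rng=rng)
    print(w)
```

Clipping bounds each sample's influence on the averaged gradient, which is also why clipping or truncation appears repeatedly in the heavy-tailed DP-SCO papers listed above.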
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.