Private Stochastic Non-Convex Optimization: Adaptive Algorithms and
Tighter Generalization Bounds
- URL: http://arxiv.org/abs/2006.13501v2
- Date: Mon, 10 Aug 2020 20:25:35 GMT
- Title: Private Stochastic Non-Convex Optimization: Adaptive Algorithms and
Tighter Generalization Bounds
- Authors: Yingxue Zhou, Xiangyi Chen, Mingyi Hong, Zhiwei Steven Wu, Arindam
Banerjee
- Abstract summary: We propose differentially private (DP) algorithms for stochastic non-convex optimization.
We evaluate the algorithms on two popular deep learning tasks and demonstrate the empirical advantages of DP adaptive gradient methods over standard DP SGD.
- Score: 72.63031036770425
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study differentially private (DP) algorithms for stochastic non-convex
optimization. In this problem, the goal is to minimize the population loss over
a $p$-dimensional space given $n$ i.i.d. samples drawn from a distribution. We
improve upon the population gradient bound of ${\sqrt{p}}/{\sqrt{n}}$ from
prior work and obtain a sharper rate of $\sqrt[4]{p}/\sqrt{n}$. We obtain this
rate by providing the first analyses on a collection of private gradient-based
methods, including adaptive algorithms DP RMSProp and DP Adam. Our proof
technique leverages the connection between differential privacy and adaptive
data analysis to bound gradient estimation error at every iterate, which
circumvents the worse generalization bound from the standard uniform
convergence argument. Finally, we evaluate the proposed algorithms on two
popular deep learning tasks and demonstrate the empirical advantages of DP
adaptive gradient methods over standard DP SGD.
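For illustration, here is a minimal Python sketch of one DP-Adam iteration in the usual per-example-clipping-plus-Gaussian-noise style; the function name dp_adam_step, the hyperparameter names, and the noise calibration are assumptions of this sketch, not the paper's exact algorithm.
```python
# Hypothetical sketch of one DP-Adam update: per-example gradient clipping,
# Gaussian noise calibrated to the clipping norm, then a standard Adam step.
import numpy as np

def dp_adam_step(w, per_example_grads, m, v, t,
                 lr=1e-3, clip_norm=1.0, noise_multiplier=1.0,
                 beta1=0.9, beta2=0.999, eps=1e-8,
                 rng=np.random.default_rng(0)):
    """One DP-Adam iteration on a 1-D parameter vector w."""
    B = per_example_grads.shape[0]  # batch size
    # Clip each example's gradient to L2 norm <= clip_norm (bounds sensitivity).
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    # Average, then add Gaussian noise scaled to the sensitivity clip_norm / B.
    noisy_grad = clipped.mean(axis=0) + rng.normal(
        0.0, noise_multiplier * clip_norm / B, size=w.shape)
    # Standard Adam moment updates applied to the privatized gradient.
    m = beta1 * m + (1 - beta1) * noisy_grad
    v = beta2 * v + (1 - beta2) * noisy_grad ** 2
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Usage: a single step on random data (p = 5 parameters, batch of 8 examples).
w, m, v = np.zeros(5), np.zeros(5), np.zeros(5)
grads = np.random.default_rng(1).normal(size=(8, 5))
w, m, v = dp_adam_step(w, grads, m, v, t=1)
```
Dropping the first-moment accumulator and keeping only the second-moment scaling would give a DP RMSProp-style variant of the same sketch.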
Related papers
- Differential Private Stochastic Optimization with Heavy-tailed Data: Towards Optimal Rates [15.27596975662702]
We explore algorithms achieving optimal rates of DP optimization with heavy-tailed gradients.
Our results match the minimax lower bound of Kamath et al. (2022), indicating that the theoretical limit of convex optimization under DP is achievable.
arXiv Detail & Related papers (2024-08-19T11:07:05Z)
- Differentially Private Optimization with Sparse Gradients [60.853074897282625]
We study differentially private (DP) optimization problems under sparsity of individual gradients.
Building on this, we obtain pure- and approximate-DP algorithms with almost optimal rates for convex optimization with sparse gradients.
arXiv Detail & Related papers (2024-04-16T20:01:10Z)
- Stochastic Dimension-reduced Second-order Methods for Policy Optimization [11.19708535159457]
We propose several new second-order algorithms for policy optimization that only require gradient and Hessian-vector product in each iteration.
Specifically, we propose a dimension-reduced second-order method (DR-SOPO) which repeatedly solves a projected two-dimensional trust region subproblem.
We show that DR-SOPO obtains an $\mathcal{O}(\epsilon^{-3.5})$ complexity for reaching an approximate first-order stationary condition.
In addition, we present an enhanced algorithm (DVR-SOPO) which further improves the complexity to $\mathcal{O}(\dots)$
arXiv Detail & Related papers (2023-01-28T12:09:58Z)
- Privately Estimating a Gaussian: Efficient, Robust and Optimal [6.901744415870126]
We give efficient algorithms for privately estimating a Gaussian distribution in both pure and approximate differential privacy (DP) models.
In the pure DP setting, we give an efficient algorithm that estimates an unknown $d$-dimensional Gaussian distribution up to an arbitrarily small total variation error.
For the special case of mean estimation, our algorithm achieves the optimal sample complexity of $\widetilde{O}(d)$, improving on a $\widetilde{O}(d^{1.5})$ bound from prior work.
arXiv Detail & Related papers (2022-12-15T18:27:39Z)
- Normalized/Clipped SGD with Perturbation for Differentially Private Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy, while DP-NSGD is comparatively easier to tune than DP-SGD; a brief clipping-versus-normalization sketch appears after this list.
arXiv Detail & Related papers (2022-06-27T03:45:02Z)
- On Private Online Convex Optimization: Optimal Algorithms in $\ell_p$-Geometry and High Dimensional Contextual Bandits [9.798304879986604]
We study the differentially private (DP) convex optimization problem with streaming data that is sampled from a distribution and arrives sequentially.
We also consider the continual release model, where parameters related to private information are updated and released upon the arrival of each new data point, a setting often referred to as online algorithms.
Our algorithm achieves in linear time the optimal excess risk when $1 \leq p \leq 2$ and the state-of-the-art excess risk, matching the non-private lower bounds, when $2 < p \leq \infty$.
arXiv Detail & Related papers (2022-06-16T12:09:47Z)
- Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
- Convergence of adaptive algorithms for weakly convex constrained optimization [59.36386973876765]
We prove the $\widetilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with a mini-batch size of $1$, constant first and second order moment parameters, and possibly unbounded optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
- Private Stochastic Convex Optimization: Optimal Rates in Linear Time [74.47681868973598]
We study the problem of minimizing the population loss given i.i.d. samples from a distribution over convex loss functions.
A recent work of Bassily et al. has established the optimal bound on the excess population loss achievable given $n$ samples.
We describe two new techniques for deriving convex optimization algorithms, both achieving the optimal bound on excess loss and using $O(\min\{n, n^2/d\})$ gradient computations.
arXiv Detail & Related papers (2020-05-10T19:52:03Z)
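As referenced in the Normalized/Clipped SGD entry above, the following minimal sketch contrasts per-example clipping (DP-SGD style) with per-example normalization (DP-NSGD style) before noise is added; the helper names and the regularizer value are illustrative assumptions, not the cited paper's pseudocode.
```python
# Hypothetical comparison of two per-example gradient transforms:
# DP-SGD-style clipping vs. DP-NSGD-style normalization.
import numpy as np

def clip_per_example(grads, clip_norm=1.0):
    """Rescale each per-example gradient so its L2 norm is at most clip_norm."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    return grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))

def normalize_per_example(grads, regularizer=1e-2):
    """Divide each per-example gradient by (its norm + regularizer), so every
    example contributes a vector of norm close to 1 regardless of its scale."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    return grads / (norms + regularizer)

# Either transform bounds per-example sensitivity, so Gaussian noise of a fixed
# scale can then be added to the batch average, as in the DP-Adam sketch above.
grads = np.random.default_rng(0).normal(size=(8, 5))
print(np.linalg.norm(clip_per_example(grads), axis=1).max())      # <= 1.0
print(np.linalg.norm(normalize_per_example(grads), axis=1).max()) # close to 1.0
```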