Private Adaptive Gradient Methods for Convex Optimization
- URL: http://arxiv.org/abs/2106.13756v1
- Date: Fri, 25 Jun 2021 16:46:45 GMT
- Title: Private Adaptive Gradient Methods for Convex Optimization
- Authors: Hilal Asi, John Duchi, Alireza Fallah, Omid Javidbakht, Kunal Talwar
- Abstract summary: We propose and analyze differentially private variants of a Stochastic Gradient Descent (SGD) algorithm with adaptive stepsizes, as well as of the AdaGrad algorithm.
We provide upper bounds on the regret of both algorithms and show that the bounds are (worst-case) optimal.
- Score: 32.3523019355048
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study adaptive methods for differentially private convex optimization,
proposing and analyzing differentially private variants of a Stochastic
Gradient Descent (SGD) algorithm with adaptive stepsizes, as well as the
AdaGrad algorithm. We provide upper bounds on the regret of both algorithms and
show that the bounds are (worst-case) optimal. As a consequence of our
development, we show that our private versions of AdaGrad outperform adaptive
SGD, which in turn outperforms traditional SGD in scenarios with non-isotropic
gradients where (non-private) AdaGrad provably outperforms SGD. The major
challenge is that the isotropic noise typically added for privacy dominates the
signal in gradient geometry for high-dimensional problems; approaches to this
that effectively optimize over lower-dimensional subspaces simply ignore the
actual problems that varying gradient geometries introduce. In contrast, we
study non-isotropic clipping and noise addition, developing a principled
theoretical approach; the consequent procedures also enjoy significantly
stronger empirical performance than prior approaches.
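To make the mechanism in the abstract concrete, here is a minimal, hypothetical sketch of a differentially private AdaGrad-style step with per-coordinate (non-isotropic) clipping and correspondingly scaled Gaussian noise. It is not the paper's exact algorithm or noise calibration: the function name `private_adagrad_step` and the parameters `clip` and `sigma` are illustrative placeholders, and a real privacy guarantee would require calibrating `sigma` via proper (epsilon, delta) accounting.

```python
import numpy as np

def private_adagrad_step(x, per_example_grads, accum, clip, sigma, lr=0.1, eps=1e-8):
    """One illustrative private adaptive step on a batch of per-example gradients."""
    # Per-coordinate (non-isotropic) clipping: project each example's gradient
    # into the box [-clip, clip], so coordinates with small typical magnitude
    # are not drowned out by noise calibrated to the largest coordinate.
    clipped = np.clip(per_example_grads, -clip, clip)
    avg = clipped.mean(axis=0)

    # Per-coordinate Gaussian noise scaled by the clipping bound (placeholder
    # calibration only; no formal DP accounting is done here).
    n = per_example_grads.shape[0]
    noisy_grad = avg + (sigma * clip / n) * np.random.standard_normal(x.shape)

    # AdaGrad-style diagonal preconditioning applied to the privatized gradient.
    accum = accum + noisy_grad ** 2
    x = x - lr * noisy_grad / (np.sqrt(accum) + eps)
    return x, accum

# Toy usage with strongly non-isotropic per-example gradients.
d, n = 5, 32
scale = np.array([5.0, 1.0, 1.0, 0.1, 0.1])
x, accum = np.zeros(d), np.zeros(d)
grads = np.random.randn(n, d) * scale
x, accum = private_adagrad_step(x, grads, accum, clip=scale, sigma=1.0)
```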
Related papers
- Convergence Analysis of Adaptive Gradient Methods under Refined Smoothness and Noise Assumptions [18.47705532817026]
We show that AdaGrad outperforms SGD by a factor of $d$ under certain conditions.
Motivated by this, we introduce assumptions on the smoothness structure of the objective and the gradient variance.
arXiv Detail & Related papers (2024-06-07T02:55:57Z) - Differentially Private Optimization with Sparse Gradients [60.853074897282625]
We study differentially private (DP) optimization problems under sparsity of individual gradients.
Building on this, we obtain pure- and approximate-DP algorithms with almost optimal rates for convex optimization with sparse gradients.
arXiv Detail & Related papers (2024-04-16T20:01:10Z) - Adaptive Proximal Gradient Method for Convex Optimization [18.681222155879656]
We explore two fundamental first-order algorithms in convex optimization, namely gradient descent (GD) and the proximal gradient method (ProxGD).
Our focus is on making these algorithms entirely adaptive by leveraging local curvature information of smooth functions.
We propose adaptive versions of GD and ProxGD that are based on observed gradient differences and, thus, have no added computational costs; a rough sketch of such a gradient-difference stepsize rule is given after this list.
arXiv Detail & Related papers (2023-08-04T11:37:08Z) - Differentially Private Stochastic Gradient Descent with Low-Noise [49.981789906200035]
Modern machine learning algorithms aim to extract fine-grained information from data to provide accurate predictions, which often conflicts with the goal of privacy protection.
This paper addresses the practical and theoretical importance of developing privacy-preserving machine learning algorithms that ensure good performance while preserving privacy.
arXiv Detail & Related papers (2022-09-09T08:54:13Z) - On the SDEs and Scaling Rules for Adaptive Gradient Algorithms [45.007261870784475]
Approximating Stochastic Gradient Descent (SGD) as a Stochastic Differential Equation (SDE) has allowed researchers to enjoy the benefits of studying a continuous optimization trajectory.
This paper derives the SDE approximations for RMSprop and Adam, giving theoretical guarantees of correctness as well as experimental validation of their applicability.
arXiv Detail & Related papers (2022-05-20T16:39:03Z) - Adaptive Differentially Private Empirical Risk Minimization [95.04948014513226]
We propose an adaptive (stochastic) gradient perturbation method for differentially private empirical risk minimization.
We prove that the ADP method considerably improves the utility guarantee compared to the standard differentially private method in which vanilla random noise is added.
arXiv Detail & Related papers (2021-10-14T15:02:20Z) - Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem.
We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent.
Our algorithm is applied to problems in which one variable is subject to a sparsity constraint.
arXiv Detail & Related papers (2020-06-16T13:41:54Z) - Convergence of adaptive algorithms for weakly convex constrained optimization [59.36386973876765]
We prove the $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with a mini-batch size of $1$, constant first and second order moment parameters, and possibly unbounded optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z) - Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze a variant of the Optimistic Adagrad (OAdagrad) algorithm for nonconvex-nonconcave minimax problems.
Our experiments show that the advantage of adaptive over non-adaptive gradient algorithms in GAN training can also be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
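As referenced in the Adaptive Proximal Gradient entry above, the following is a rough sketch of one way a stepsize can be adapted purely from observed gradient differences. The rule shown (estimating local inverse curvature as $\|x_k - x_{k-1}\| / (2\|\nabla f(x_k) - \nabla f(x_{k-1})\|)$, with a cap on how fast the stepsize may grow) is a common pattern assumed here for illustration, not necessarily that paper's exact method; the name `adaptive_gd` and the growth-cap constants are placeholders.

```python
import numpy as np

def adaptive_gd(grad_f, x0, steps=200, step0=1e-3):
    """Gradient descent whose stepsize is estimated from observed gradient differences."""
    x_prev, g_prev, step_prev, theta = x0, grad_f(x0), step0, 1e9
    x = x_prev - step0 * g_prev                 # bootstrap step with a tiny stepsize
    for _ in range(steps):
        g = grad_f(x)
        dx, dg = x - x_prev, g - g_prev
        # Local inverse-curvature estimate from the gradient difference; if the
        # gradients barely changed, fall back to the growth-capped stepsize.
        curv_step = np.linalg.norm(dx) / (2.0 * np.linalg.norm(dg)) if np.linalg.norm(dg) > 0 else np.inf
        step = min(np.sqrt(1.0 + theta) * step_prev, curv_step)
        theta, step_prev = step / step_prev, step
        x_prev, g_prev = x, g
        x = x - step * g
    return x

# Toy usage: minimize a poorly conditioned quadratic f(x) = 0.5 * x^T A x.
A = np.diag([1.0, 100.0])
x_min = adaptive_gd(lambda x: A @ x, np.array([5.0, 5.0]))
```

No stepsize or smoothness constant is supplied by the user; the method only reads off gradient differences it has already computed, which is why such schemes add essentially no computational cost.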