Differentially Private SGDA for Minimax Problems
- URL: http://arxiv.org/abs/2201.09046v1
- Date: Sat, 22 Jan 2022 13:05:39 GMT
- Title: Differentially Private SGDA for Minimax Problems
- Authors: Zhenhuan Yang, Shu Hu, Yunwen Lei, Kush R. Varshney, Siwei Lyu, Yiming
Ying
- Abstract summary: We prove that gradient descent ascent (SGDA) can achieve optimal utility in terms of weak primal-dual population risk.
This is the first-ever-known result for non-smoothly-strongly-concave setting.
- Score: 83.57322009102973
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stochastic gradient descent ascent (SGDA) and its variants have been the
workhorse for solving minimax problems. However, in contrast to the
well-studied stochastic gradient descent (SGD) with differential privacy (DP)
constraints, there is little work on understanding the generalization (utility)
of SGDA with DP constraints. In this paper, we use the algorithmic stability
approach to establish the generalization (utility) of DP-SGDA in different
settings. In particular, for the convex-concave setting, we prove that the
DP-SGDA can achieve an optimal utility rate in terms of the weak primal-dual
population risk in both smooth and non-smooth cases. To our best knowledge,
this is the first-ever-known result for DP-SGDA in the non-smooth case. We
further provide its utility analysis in the nonconvex-strongly-concave setting
which is the first-ever-known result in terms of the primal population risk.
The convergence and generalization results for this nonconvex setting are new
even in the non-private setting. Finally, numerical experiments are conducted
to demonstrate the effectiveness of DP-SGDA for both convex and nonconvex
cases.
Related papers
- Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure Differential Privacy (DP) comes at the cost of model performance degradation.
We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC.
We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on R'enyi DP.
arXiv Detail & Related papers (2023-11-24T17:56:44Z) - Normalized/Clipped SGD with Perturbation for Differentially Private
Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z) - Implicit Regularization or Implicit Conditioning? Exact Risk
Trajectories of SGD in High Dimensions [26.782342518986503]
gradient descent (SGD) is a pillar of modern machine learning, serving as the go-to optimization algorithm for a diverse array of problems.
We show how to adapt the HSGD formalism to include streaming SGD, which allows us to produce an exact prediction for the excess risk of multi-pass SGD relative to that of streaming SGD.
arXiv Detail & Related papers (2022-06-15T02:32:26Z) - Dimension Independent Generalization of DP-SGD for Overparameterized
Smooth Convex Optimization [24.644583626705742]
This paper considers the generalization performance of differentially private convex learning.
We demonstrate that the convergence analysis of Langevin algorithms can be used to obtain new generalization bounds with differential privacy guarantees for DP-SGD.
arXiv Detail & Related papers (2022-06-03T22:03:05Z) - Risk Bounds of Multi-Pass SGD for Least Squares in the Interpolation
Regime [127.21287240963859]
gradient descent (SGD) has achieved great success due to its superior performance in both optimization and generalization.
This paper aims to sharply characterize the generalization of multi-pass SGD.
We show that although SGD needs more than GD to achieve the same level of excess risk, it saves the number of gradient evaluations.
arXiv Detail & Related papers (2022-03-07T06:34:53Z) - Benign Underfitting of Stochastic Gradient Descent [72.38051710389732]
We study to what extent may gradient descent (SGD) be understood as a "conventional" learning rule that achieves generalization performance by obtaining a good fit training data.
We analyze the closely related with-replacement SGD, for which an analogous phenomenon does not occur and prove that its population risk does in fact converge at the optimal rate.
arXiv Detail & Related papers (2022-02-27T13:25:01Z) - Improving Differentially Private SGD via Randomly Sparsified Gradients [31.295035726077366]
Differentially private gradient observation (DP-SGD) has been widely adopted in deep learning to provide rigorously defined privacy bound compression.
We propose an and utilize RS to strengthen communication cost and strengthen privacy bound compression.
arXiv Detail & Related papers (2021-12-01T21:43:34Z) - Stability of Stochastic Gradient Descent on Nonsmooth Convex Losses [52.039438701530905]
We provide sharp upper and lower bounds for several forms of gradient descent (SGD) on arbitrary Lipschitz nonsmooth convex losses.
Our bounds allow us to derive a new algorithm for differentially private nonsmooth convex optimization with optimal excess population risk.
arXiv Detail & Related papers (2020-06-12T02:45:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.