Improved Rates of Differentially Private Nonconvex-Strongly-Concave Minimax Optimization
- URL: http://arxiv.org/abs/2503.18317v1
- Date: Mon, 24 Mar 2025 03:51:27 GMT
- Title: Improved Rates of Differentially Private Nonconvex-Strongly-Concave Minimax Optimization
- Authors: Ruijia Zhang, Mingxi Lei, Meng Ding, Zihang Xiang, Jinhui Xu, Di Wang,
- Abstract summary: We study the problem of (finite sum) minimax optimization in the Differential Privacy (DP) model. Analyzing a DP version of Stochastic Gradient Descent Ascent (SGDA), we show that it is possible to get a DP estimator whose $l_2$-norm of the gradient of the empirical risk function is upper bounded by $\tilde{O}(\frac{d^{1/4}}{(n\epsilon)^{1/2}})$, where $d$ is the model dimension and $n$ is the sample size.
- Score: 10.913566070767596
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study the problem of (finite sum) minimax optimization in the Differential Privacy (DP) model. Unlike most of the previous studies on the (strongly) convex-concave settings or loss functions satisfying the Polyak-Lojasiewicz condition, here we mainly focus on the nonconvex-strongly-concave one, which encapsulates many models in deep learning such as deep AUC maximization. Specifically, we first analyze a DP version of Stochastic Gradient Descent Ascent (SGDA) and show that it is possible to get a DP estimator whose $l_2$-norm of the gradient of the empirical risk function is upper bounded by $\tilde{O}(\frac{d^{1/4}}{({n\epsilon})^{1/2}})$, where $d$ is the model dimension and $n$ is the sample size. We then propose a new method with less gradient noise variance and improve the upper bound to $\tilde{O}(\frac{d^{1/3}}{(n\epsilon)^{2/3}})$, which matches the best-known result for DP Empirical Risk Minimization with non-convex loss. We also discuss several lower bounds of private minimax optimization. Finally, experiments on AUC maximization, generative adversarial networks, and temporal difference learning with real-world data support our theoretical analysis.
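To make the first analysis above concrete, here is a minimal sketch of one differentially private stochastic gradient descent ascent loop under the Gaussian mechanism. This is not the paper's exact algorithm: the per-sample clipping threshold `clip`, noise multiplier `sigma`, step sizes, and the user-supplied gradient oracles `grad_x`/`grad_y` are illustrative assumptions, and the calibration of `sigma` to a target $(\epsilon, \delta)$ budget (e.g., via a moments accountant) is omitted.

```python
import numpy as np

def dp_sgda(grad_x, grad_y, x, y, data, epochs=10, batch=32,
            eta_x=0.05, eta_y=0.05, clip=1.0, sigma=1.0, rng=None):
    """Sketch of DP-SGDA: per-sample gradient clipping plus Gaussian noise
    on both the descent (x) and ascent (y) updates. x, y are float arrays;
    grad_x(x, y, sample) and grad_y(x, y, sample) return per-sample gradients."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(data)
    for _ in range(epochs):
        idx = rng.choice(n, size=min(batch, n), replace=False)
        gx, gy = np.zeros_like(x, dtype=float), np.zeros_like(y, dtype=float)
        for i in idx:
            gxi, gyi = grad_x(x, y, data[i]), grad_y(x, y, data[i])
            # Clip each per-sample gradient to l2-norm at most `clip`.
            gxi = gxi * min(1.0, clip / (np.linalg.norm(gxi) + 1e-12))
            gyi = gyi * min(1.0, clip / (np.linalg.norm(gyi) + 1e-12))
            gx += gxi
            gy += gyi
        # Gaussian mechanism: noise scale proportional to clip * sigma.
        gx = (gx + sigma * clip * rng.standard_normal(x.shape)) / len(idx)
        gy = (gy + sigma * clip * rng.standard_normal(y.shape)) / len(idx)
        x = x - eta_x * gx      # descent on the min variable
        y = y + eta_y * gy      # ascent on the max variable
    return x, y
```

The rates quoted in the abstract come from choosing the noise level, batch size, and iteration count as functions of $n$, $d$, and the privacy budget; this sketch only shows the privatized update itself.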
Related papers
- Differential Private Stochastic Optimization with Heavy-tailed Data: Towards Optimal Rates [15.27596975662702]
We explore algorithms achieving optimal rates of DP optimization with heavy-tailed gradients.
Our results match the minimax lower bound in Kamath et al. (2022), indicating that the theoretical limit of convex optimization under DP is achievable.
arXiv Detail & Related papers (2024-08-19T11:07:05Z) - Near-Optimal differentially private low-rank trace regression with guaranteed private initialization [0.0]
We study differentially private (DP) estimation of a rank-$r$ matrix $M \in \mathbb{R}^{d_1 \times d_2}$ under the trace regression model.
We also propose a differentially private algorithm for estimating $M$ based on Riemannian optimization (DP-RGrad).
It is shown that the estimator given by DP-RGrad attains the optimal convergence rate in a weaker notion of differential privacy.
arXiv Detail & Related papers (2024-03-24T03:57:21Z) - DIFF2: Differential Private Optimization via Gradient Differences for
Nonconvex Distributed Learning [58.79085525115987]
In the previous work, the best known utility bound is $\widetilde{O}(d^{2/3}/(n\varepsilon_{\mathrm{DP}})^{4/3})$.
We propose a new differentially private framework called DIFF2 (DIFFerential private via gradient DIFFerences) that constructs a differentially private gradient estimator from gradient differences.
DIFF2 with a gradient descent subroutine achieves a utility of $\widetilde{O}(d^{2/3}/(n\varepsilon_{\mathrm{DP}})^{4/3})$.
arXiv Detail & Related papers (2023-02-08T05:19:01Z) - Decentralized Stochastic Gradient Descent Ascent for Finite-Sum Minimax Problems [26.676582181833584]
Minimax problems have attracted significant attention in recent years due to their widespread application in numerous machine learning models.
We develop a novel decentralized stochastic gradient descent ascent method for finite-sum minimax problems.
Our work is the first to achieve such theoretical complexities for this kind of minimax problem.
arXiv Detail & Related papers (2022-12-06T03:25:44Z) - Estimating the minimizer and the minimum value of a regression function
under passive design [72.85024381807466]
We propose a new method for estimating the minimizer $\boldsymbol{x}^*$ and the minimum value $f^*$ of a smooth and strongly convex regression function $f$.
We derive non-asymptotic upper bounds for the quadratic risk and optimization error of $\boldsymbol{z}_n$, and for the risk of estimating $f^*$.
arXiv Detail & Related papers (2022-11-29T18:38:40Z) - Restricted Strong Convexity of Deep Learning Models with Smooth
Activations [31.003601717265006]
We study the problem of optimization of deep learning models with smooth activation functions.
We introduce a new analysis of optimization based on Restricted Strong Convexity (RSC).
Ours is the first result on establishing geometric convergence of GD based on RSC for deep learning models.
arXiv Detail & Related papers (2022-09-29T21:24:26Z) - Normalized/Clipped SGD with Perturbation for Differentially Private
Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z) - KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal [70.15267479220691]
We consider and analyze the sample complexity of reinforcement learning with a generative model.
Our analysis shows that it is nearly minimax-optimal for finding an $\varepsilon$-optimal policy when $\varepsilon$ is sufficiently small.
arXiv Detail & Related papers (2022-05-27T19:39:24Z) - Private Stochastic Non-Convex Optimization: Adaptive Algorithms and
Tighter Generalization Bounds [72.63031036770425]
We propose differentially private (DP) algorithms for stochastic non-convex optimization.
We demonstrate the empirical advantages over standard gradient methods on two popular deep learning tasks.
arXiv Detail & Related papers (2020-06-24T06:01:24Z) - Gradient Free Minimax Optimization: Variance Reduction and Faster
Convergence [120.9336529957224]
In this paper, we study gradient-free (zeroth-order) minimax optimization in the nonconvex-strongly-concave setting, where only function values are available.
We show that a novel zeroth-order variance-reduced descent ascent algorithm achieves the best known query complexity (a minimal sketch of the two-point gradient estimator such methods build on appears after this list).
arXiv Detail & Related papers (2020-06-16T17:55:46Z) - Better Theory for SGD in the Nonconvex World [2.6397379133308214]
Large-scale nonconvex optimization problems are ubiquitous in modern machine learning.
We perform experiments studying the effect of a wide array of minibatch sizes on Stochastic Gradient Descent (SGD) applied to synthetic problems.
arXiv Detail & Related papers (2020-02-09T09:56:06Z)
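The gradient-free minimax entry above relies on gradient estimates built purely from function evaluations. Below is a minimal sketch of the standard two-point zeroth-order estimator together with one gradient-free descent ascent step on a toy objective; the smoothing radius `mu`, the step size, and the quadratic toy function are illustrative assumptions, not the cited paper's algorithm.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-3, rng=None):
    """Two-point zeroth-order estimate of grad f(x):
    g = d * (f(x + mu*u) - f(x)) / mu * u,  with u uniform on the unit sphere."""
    rng = np.random.default_rng() if rng is None else rng
    d = x.size
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    return d * (f(x + mu * u) - f(x)) / mu * u

# Toy usage: one descent-ascent step on f(x, y) = x.y - ||y||^2 / 2
f = lambda x, y: float(x @ y - 0.5 * y @ y)
x, y = np.ones(5), np.zeros(5)
gx = zo_gradient(lambda v: f(v, y), x)    # estimate grad_x f at (x, y)
gy = zo_gradient(lambda v: f(x, v), y)    # estimate grad_y f at (x, y)
x, y = x - 0.1 * gx, y + 0.1 * gy         # descent on x, ascent on y
```

Variance-reduced zeroth-order methods of the kind referenced above reuse and recombine such estimates across iterations to cut the query complexity; that bookkeeping is omitted here.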