Differentially Private Coordinate Descent for Composite Empirical Risk
Minimization
- URL: http://arxiv.org/abs/2110.11688v1
- Date: Fri, 22 Oct 2021 10:22:48 GMT
- Title: Differentially Private Coordinate Descent for Composite Empirical Risk
Minimization
- Authors: Paul Mangold, Aur\'elien Bellet, Joseph Salmon, Marc Tommasi
- Abstract summary: Machine learning models can leak information about the data used to train them.
Differentially Private (DP) variants of optimization algorithms like Gradient Descent (DP-SGD) have been designed to mitigate this.
We propose a new method for composite Differentially Private Empirical Risk Minimization (DP-ERM): Differentially Private Coordinate Descent (DP-CD)
- Score: 13.742100810492014
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models can leak information about the data used to train
them. Differentially Private (DP) variants of optimization algorithms like
Stochastic Gradient Descent (DP-SGD) have been designed to mitigate this,
inducing a trade-off between privacy and utility. In this paper, we propose a
new method for composite Differentially Private Empirical Risk Minimization
(DP-ERM): Differentially Private proximal Coordinate Descent (DP-CD). We
analyze its utility through a novel theoretical analysis of inexact coordinate
descent, and highlight some regimes where DP-CD outperforms DP-SGD, thanks to
the possibility of using larger step sizes. We also prove new lower bounds for
composite DP-ERM under coordinate-wise regularity assumptions, that are, in
some settings, nearly matched by our algorithm. In practical implementations,
the coordinate-wise nature of DP-CD updates demands special care in choosing
the clipping thresholds used to bound individual contributions to the
gradients. A natural parameterization of these thresholds emerges from our
theory, limiting the addition of unnecessarily large noise without requiring
coordinate-wise hyperparameter tuning or extra computational cost.
Related papers
- Improved Communication-Privacy Trade-offs in $L_2$ Mean Estimation under Streaming Differential Privacy [47.997934291881414]
Existing mean estimation schemes are usually optimized for $L_infty$ geometry and rely on random rotation or Kashin's representation to adapt to $L$ geometry.
We introduce a novel privacy accounting method for the sparsified Gaussian mechanism that incorporates the randomness inherent in sparsification into the DP.
Unlike previous approaches, our accounting algorithm directly operates in $L$ geometry, yielding MSEs that fast converge to those of the Gaussian mechanism.
arXiv Detail & Related papers (2024-05-02T03:48:47Z) - Differentially Private Optimization with Sparse Gradients [60.853074897282625]
We study differentially private (DP) optimization problems under sparsity of individual gradients.
Building on this, we obtain pure- and approximate-DP algorithms with almost optimal rates for convex optimization with sparse gradients.
arXiv Detail & Related papers (2024-04-16T20:01:10Z) - Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure Differential Privacy (DP) comes at the cost of model performance degradation.
We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC.
We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on R'enyi DP.
arXiv Detail & Related papers (2023-11-24T17:56:44Z) - Practical Differentially Private Hyperparameter Tuning with Subsampling [8.022555128083026]
We propose a new class of differentially private (DP) machine learning (ML) algorithms, where the number of random search samples is randomized itself.
We focus on lowering both the DP bounds and the computational cost of these methods by using only a random subset of the sensitive data.
We provide a R'enyi differential privacy analysis for the proposed method and experimentally show that it consistently leads to better privacy-utility trade-off.
arXiv Detail & Related papers (2023-01-27T21:01:58Z) - Differentially Private Learning with Per-Sample Adaptive Clipping [8.401653565794353]
We propose a Differentially Private Per-Sample Adaptive Clipping (DP-PSAC) algorithm based on a non-monotonic adaptive weight function.
We show that DP-PSAC outperforms or matches the state-of-the-art methods on multiple main-stream vision and language tasks.
arXiv Detail & Related papers (2022-12-01T07:26:49Z) - High-Dimensional Private Empirical Risk Minimization by Greedy
Coordinate Descent [11.49109939095326]
We study differentially private empirical risk minimization (DP-ERM)
We show theoretically that DP-GCD can achieve a logarithmic dependence on the dimension for a wide range of problems.
arXiv Detail & Related papers (2022-07-04T16:27:00Z) - Normalized/Clipped SGD with Perturbation for Differentially Private
Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z) - Automatic Clipping: Differentially Private Deep Learning Made Easier and
Stronger [39.93710312222771]
Per-example clipping is a key algorithmic step that enables practical differential private (DP) training for deep learning models.
We propose an easy-to-use replacement, called automatic clipping, that eliminates the need to tune R for any DPs.
arXiv Detail & Related papers (2022-06-14T19:49:44Z) - Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of dynamic programming (RDP) randomized for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation so can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z) - Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM) where we parameterize the joint distribution in terms of the derivatives of univariable log-conditionals (scores)
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.