Better Rates for Private Linear Regression in the Proportional Regime via Aggressive Clipping
- URL: http://arxiv.org/abs/2505.16329v1
- Date: Thu, 22 May 2025 07:34:27 GMT
- Title: Better Rates for Private Linear Regression in the Proportional Regime via Aggressive Clipping
- Authors: Simone Bombari, Inbar Seroussi, Marco Mondelli,
- Abstract summary: A common approach is to set the clipping constant much larger than the expected norm of the per-sample gradients.<n>While simplifying the analysis, this is however in sharp contrast with what empirical evidence suggests to optimize performance.<n>Our work bridges this gap between theory and practice by crucially operating in a regime where clipping happens frequently.
- Score: 19.186034457189162
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Differentially private (DP) linear regression has received significant attention in the recent theoretical literature, with several works aimed at obtaining improved error rates. A common approach is to set the clipping constant much larger than the expected norm of the per-sample gradients. While simplifying the analysis, this is however in sharp contrast with what empirical evidence suggests to optimize performance. Our work bridges this gap between theory and practice: we provide sharper rates for DP stochastic gradient descent (DP-SGD) by crucially operating in a regime where clipping happens frequently. Specifically, we consider the setting where the data is multivariate Gaussian, the number of training samples $n$ is proportional to the input dimension $d$, and the algorithm guarantees constant-order zero concentrated DP. Our method relies on establishing a deterministic equivalent for the trajectory of DP-SGD in terms of a family of ordinary differential equations (ODEs). As a consequence, the risk of DP-SGD is bounded between two ODEs, with upper and lower bounds matching for isotropic data. By studying these ODEs when $n / d$ is large enough, we demonstrate the optimality of aggressive clipping, and we uncover the benefits of decaying learning rate and private noise scheduling.
Related papers
- Learning Curves of Stochastic Gradient Descent in Kernel Regression [7.063108005500741]
We analyze the single-pass Gradient Descent (SGD) in kernel regression under source condition.<n>Surprisingly, we show that SGD achieves min-max optimal rates up to constants among all the scales.<n>The main reason for SGD to overcome the curse of saturation is the exponentially decaying step size schedule.
arXiv Detail & Related papers (2025-05-28T07:16:11Z) - Smoothed Normalization for Efficient Distributed Private Optimization [54.197255548244705]
Federated learning enables machine learning models with privacy of participants.<n>There is no differentially private distributed method for training, non-feedback problems.<n>We introduce a new distributed algorithm $alpha$-$sf NormEC$ with provable convergence guarantees.
arXiv Detail & Related papers (2025-02-19T07:10:32Z) - The Cost of Shuffling in Private Gradient Based Optimization [40.31928071333575]
We show that data shuffling results in worse empirical excess risk for textitDP-ShuffleG compared to DP-SGD.<n>We propose textitInterleaved-ShuffleG, a hybrid approach that integrates public data samples in private optimization.
arXiv Detail & Related papers (2025-02-05T22:30:00Z) - Differentially Private Optimization with Sparse Gradients [60.853074897282625]
We study differentially private (DP) optimization problems under sparsity of individual gradients.
Building on this, we obtain pure- and approximate-DP algorithms with almost optimal rates for convex optimization with sparse gradients.
arXiv Detail & Related papers (2024-04-16T20:01:10Z) - Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure Differential Privacy (DP) comes at the cost of model performance degradation.
We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC.
We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on R'enyi DP.
arXiv Detail & Related papers (2023-11-24T17:56:44Z) - Normalized/Clipped SGD with Perturbation for Differentially Private
Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z) - Efficient Private SCO for Heavy-Tailed Data via Averaged Clipping [40.69950711262191]
We consider differentially private convex optimization for heavy-tailed data with the guarantee of being differentially private (DP)
We establish new convergence results and improved complexity bounds for the proposed algorithm called AClipped-dpSGD for constrained and unconstrained convex problems.
arXiv Detail & Related papers (2022-06-27T01:39:15Z) - Differentially Private Sampling from Rashomon Sets, and the Universality
of Langevin Diffusion for Convex Optimization [15.404265455635587]
We provide an algorithm for sampling from the exponential mechanism, whose privacy analysis does not depend on convexity and which can be stopped at anytime without compromising privacy.
We obtain optimal excess empirical and population risk guarantees for (strongly) convex losses under both pure and approximate differential privacy (DP)
The framework allows us to design a DP uniform sampler from the Rashomon set.
arXiv Detail & Related papers (2022-04-04T15:33:21Z) - Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of dynamic programming (RDP) randomized for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation so can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z) - Differentially Private Coordinate Descent for Composite Empirical Risk
Minimization [13.742100810492014]
Machine learning models can leak information about the data used to train them.
Differentially Private (DP) variants of optimization algorithms like Gradient Descent (DP-SGD) have been designed to mitigate this.
We propose a new method for composite Differentially Private Empirical Risk Minimization (DP-ERM): Differentially Private Coordinate Descent (DP-CD)
arXiv Detail & Related papers (2021-10-22T10:22:48Z) - Private Stochastic Non-Convex Optimization: Adaptive Algorithms and
Tighter Generalization Bounds [72.63031036770425]
We propose differentially private (DP) algorithms for bound non-dimensional optimization.
We demonstrate two popular deep learning methods on the empirical advantages over standard gradient methods.
arXiv Detail & Related papers (2020-06-24T06:01:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.