Differentially Private Learning with Per-Sample Adaptive Clipping
- URL: http://arxiv.org/abs/2212.00328v3
- Date: Tue, 2 May 2023 04:35:26 GMT
- Title: Differentially Private Learning with Per-Sample Adaptive Clipping
- Authors: Tianyu Xia, Shuheng Shen, Su Yao, Xinyi Fu, Ke Xu, Xiaolong Xu, Xing Fu
- Abstract summary: We propose a Differentially Private Per-Sample Adaptive Clipping (DP-PSAC) algorithm based on a non-monotonic adaptive weight function.
We show that DP-PSAC outperforms or matches the state-of-the-art methods on multiple main-stream vision and language tasks.
- Score: 8.401653565794353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Privacy in AI has drawn sustained attention from researchers and the
general public in recent years. As one way to implement privacy-preserving AI,
differentially private learning is a framework that enables AI models to be
trained with differential privacy (DP). To achieve DP in the learning process,
existing algorithms typically limit the magnitude of gradients with a constant
clipping threshold, which must be carefully tuned because of its significant
impact on model performance. To address this issue, the recent works NSGD and
Auto-S propose using normalization instead of clipping to avoid
hyperparameter tuning. However, normalization-based approaches like NSGD and
Auto-S rely on a monotonic weight function, which imposes excessive weight on
small gradient samples and introduces extra deviation to the update. In this
paper, we propose a Differentially Private Per-Sample Adaptive Clipping
(DP-PSAC) algorithm based on a non-monotonic adaptive weight function, which
guarantees privacy without the typical hyperparameter tuning process of using a
constant clipping while significantly reducing the deviation between the update
and true batch-averaged gradient. We provide a rigorous theoretical convergence
analysis and show that, at the same order of convergence rate, the proposed
algorithm achieves a lower non-vanishing bound than NSGD/Auto-S, and that this
advantage is maintained over training iterations. In addition, through extensive
experimental evaluation, we show that DP-PSAC outperforms or matches the
state-of-the-art methods on multiple mainstream vision and language tasks.
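To make the comparison concrete, the sketch below contrasts the three per-sample weight functions discussed above in plain numpy: constant clipping, the monotonic NSGD/Auto-S normalization weight, and a non-monotonic adaptive weight of the form reported for DP-PSAC. The exact DP-PSAC expression and the default r are assumptions taken from the paper's description, not a reference implementation.

```python
import numpy as np

def w_clip(norm, C):
    # Constant clipping: w = min(1, C / ||g||); C must be tuned.
    return min(1.0, C / max(norm, 1e-12))

def w_auto_s(norm, C, gamma=0.01):
    # NSGD / Auto-S normalization: w = C / (||g|| + gamma).
    # Monotonically decreasing, so the smallest gradients get the
    # largest weights (up to C / gamma), which biases the update.
    return C / (norm + gamma)

def w_psac(norm, C, r=0.01):
    # Non-monotonic adaptive weight in the form reported for DP-PSAC
    # (r assumed small): w = C / (||g|| + r / (||g|| + r)).
    # The weight rises and then falls as ||g|| grows, instead of
    # assigning its largest weight to near-zero gradients.
    return C / (norm + r / (norm + r))

def private_update(per_sample_grads, C, sigma, weight_fn, rng):
    # All three weights keep ||w * g|| <= C, so the same Gaussian
    # noise calibration applies (standard DP-SGD recipe).
    B, d = per_sample_grads.shape
    weighted = np.stack([weight_fn(np.linalg.norm(g), C) * g
                         for g in per_sample_grads])
    return weighted.mean(axis=0) + rng.normal(0.0, sigma * C / B, size=d)

rng = np.random.default_rng(0)
grads = rng.normal(size=(8, 5))  # toy batch of per-sample gradients
for fn in (w_clip, w_auto_s, w_psac):
    print(fn.__name__, private_update(grads, 1.0, 0.5, fn, rng))
```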
Related papers
- Smoothed Normalization for Efficient Distributed Private Optimization [54.197255548244705]
Federated learning enables training machine learning models while preserving the privacy of participants.
However, there is no differentially private distributed method for smooth, non-convex problems.
We introduce a new distributed algorithm $\alpha$-NormEC with provable convergence guarantees.
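As a rough illustration only: assuming "smoothed normalization" means scaling a gradient by 1/(beta + ||g||), so its norm stays strictly below 1 without a hard clip, a distributed private averaging step could look like the toy sketch below. The error-compensation part of $\alpha$-NormEC is omitted, and all names and constants here are hypothetical.

```python
import numpy as np

def smoothed_normalize(g, beta=1.0):
    # Scale by 1 / (beta + ||g||): the output norm is ||g|| / (beta + ||g||),
    # always below 1, so per-worker sensitivity is bounded with no hard clip.
    return g / (beta + np.linalg.norm(g))

def server_step(worker_grads, sigma, rng, beta=1.0):
    # Workers send smoothed-normalized gradients; the server averages
    # them and adds Gaussian noise sized to the norm bound of 1.
    n = len(worker_grads)
    agg = sum(smoothed_normalize(g, beta) for g in worker_grads) / n
    return agg + rng.normal(0.0, sigma / n, size=agg.shape)

rng = np.random.default_rng(1)
grads = [rng.normal(size=4) for _ in range(5)]  # toy per-worker gradients
print(server_step(grads, sigma=0.5, rng=rng))
```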
arXiv Detail & Related papers (2025-02-19T07:10:32Z) - Linear-Time User-Level DP-SCO via Robust Statistics [55.350093142673316]
User-level differentially private stochastic convex optimization (DP-SCO) has garnered significant attention due to the importance of safeguarding user privacy in machine learning applications.
Current methods, such as those based on differentially private gradient descent (DP-SGD), often struggle with high noise accumulation and suboptimal utility.
We introduce a novel linear-time algorithm that leverages robust statistics, specifically the median and trimmed mean, to overcome these challenges.
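The robust statistics named here have standard definitions; the sketch below shows a coordinate-wise trimmed mean and why it caps the influence of any single user. The paper's linear-time, user-level-private construction is more involved than this illustration.

```python
import numpy as np

def trimmed_mean(user_grads, trim_frac=0.1):
    # Coordinate-wise trimmed mean over per-user gradient estimates:
    # sort each coordinate and drop the top and bottom trim_frac
    # fraction, bounding the pull of any single outlier user.
    # (No noise is added here; this is only the robust aggregator.)
    x = np.sort(np.asarray(user_grads), axis=0)
    k = int(len(user_grads) * trim_frac)
    return x[k:len(user_grads) - k].mean(axis=0)

rng = np.random.default_rng(2)
grads = rng.normal(size=(20, 3))
grads[0] *= 100.0  # one adversarial / heavy-tailed user
print("trimmed:", trimmed_mean(grads))
print("plain:  ", grads.mean(axis=0))
```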
arXiv Detail & Related papers (2025-02-13T02:05:45Z) - On the Convergence of DP-SGD with Adaptive Clipping [56.24689348875711]
Stochastic gradient descent with gradient clipping is a powerful technique for enabling differentially private optimization.
This paper provides the first comprehensive convergence analysis of SGD with quantile clipping (QC-SGD).
We show that QC-SGD suffers from a bias problem similar to constant-threshold clipped SGD, but that this bias can be mitigated through a carefully designed quantile and step-size schedule.
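A minimal sketch of the quantile-clipping idea, assuming the threshold is simply the q-th quantile of the per-sample gradient norms. A real QC-SGD implementation must estimate that quantile privately, which is omitted here.

```python
import numpy as np

def qc_sgd_update(per_sample_grads, q=0.9, sigma=1.0, rng=None):
    # Quantile clipping: set the clipping threshold C to the q-th
    # quantile of per-sample gradient norms, then clip and privatize.
    # NOTE: computing the quantile on the raw norms, as done here,
    # is NOT private by itself; it is for illustration only.
    rng = rng or np.random.default_rng()
    norms = np.linalg.norm(per_sample_grads, axis=1)
    C = np.quantile(norms, q)
    scale = np.minimum(1.0, C / np.maximum(norms, 1e-12))
    clipped = per_sample_grads * scale[:, None]
    B, d = per_sample_grads.shape
    return clipped.mean(axis=0) + rng.normal(0.0, sigma * C / B, size=d)

grads = np.random.default_rng(3).normal(size=(16, 4))
print(qc_sgd_update(grads))
```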
arXiv Detail & Related papers (2024-12-27T20:29:47Z) - Enhancing DP-SGD through Non-monotonous Adaptive Scaling Gradient Weight [15.139854970044075]
We introduce Differentially Private Per-sample Adaptive Scaling Clipping (DP-PSASC).
This approach replaces traditional clipping with non-monotonous adaptive gradient scaling.
Our theoretical and empirical analyses confirm that DP-PSASC preserves gradient privacy and delivers superior performance across diverse datasets.
arXiv Detail & Related papers (2024-11-05T12:47:30Z) - DiSK: Differentially Private Optimizer with Simplified Kalman Filter for Noise Reduction [57.83978915843095]
This paper introduces DiSK, a novel framework designed to significantly enhance the performance of differentially private optimizers.
To ensure practicality for large-scale training, we simplify the Kalman filtering process, minimizing its memory and computational demands.
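For intuition: a fixed-gain (steady-state) Kalman filter on a slowly varying gradient signal reduces to an exponential moving average, which conveys the memory and compute simplification described above. This stand-in is illustrative and is not DiSK's actual filter.

```python
import numpy as np

class FixedGainFilter:
    # Simplified Kalman update with a constant gain K:
    #   state += K * (observation - state)
    # i.e. an exponential moving average; it stores only one extra
    # buffer of model size, keeping memory overhead minimal.
    def __init__(self, gain=0.3):
        self.gain = gain
        self.state = None

    def update(self, noisy_grad):
        if self.state is None:
            self.state = noisy_grad.copy()
        else:
            self.state += self.gain * (noisy_grad - self.state)
        return self.state

rng = np.random.default_rng(4)
f = FixedGainFilter()
true_grad = np.ones(3)
for _ in range(5):
    # the filtered estimate drifts toward the true gradient
    print(f.update(true_grad + rng.normal(0.0, 1.0, size=3)))
```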
arXiv Detail & Related papers (2024-10-04T19:30:39Z) - Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Stochastic Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure differential privacy (DP) comes at the cost of model performance degradation.
We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC.
We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on Rényi DP.
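The generic error-feedback template re-injects whatever clipping discarded into the next step, so the clipping bias cannot accumulate unchecked. The sketch below shows that general pattern; it is not the paper's exact algorithm.

```python
import numpy as np

def ef_step(grad, error, C, sigma, rng):
    # Error feedback around clipping: correct the gradient with the
    # stored residual, clip, privatize, and keep the new residual
    # (the part clipping removed) for the next iteration.
    corrected = grad + error
    norm = np.linalg.norm(corrected)
    clipped = corrected * min(1.0, C / max(norm, 1e-12))
    new_error = corrected - clipped
    update = clipped + rng.normal(0.0, sigma * C, size=grad.shape)
    return update, new_error

rng = np.random.default_rng(5)
error = np.zeros(4)
for _ in range(3):
    update, error = ef_step(3.0 * rng.normal(size=4), error,
                            C=1.0, sigma=0.1, rng=rng)
    print(update, "residual norm:", np.linalg.norm(error))
```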
arXiv Detail & Related papers (2023-11-24T17:56:44Z) - The importance of feature preprocessing for differentially private
linear optimization [38.125699428109826]
One of the most popular algorithms for training differentially private models is differentially private stochastic gradient descent (DPSGD).
We show that even for the simple case of linear classification, unlike non-private optimization, (private) feature preprocessing is vital for differentially private optimization.
We propose an algorithm called DPSGD-F, which combines DPSGD with feature preprocessing, and prove that for classification tasks it incurs an optimality gap proportional to the diameter of the features.
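A toy sketch of why feature preprocessing matters: privately centering the features shrinks their diameter, which is exactly the quantity the stated optimality gap scales with. The Gaussian-mechanism centering below is illustrative, not the paper's DPSGD-F procedure.

```python
import numpy as np

def dp_center_features(X, bound, sigma, rng):
    # Estimate the feature mean with Gaussian noise scaled to its
    # sensitivity (each row's norm is assumed bounded by `bound`),
    # then subtract it so the data sits near the origin.
    n, d = X.shape
    noisy_mean = X.mean(axis=0) + rng.normal(0.0, sigma * bound / n, size=d)
    return X - noisy_mean

rng = np.random.default_rng(6)
X = rng.normal(loc=5.0, size=(100, 3))  # features far from the origin
Xc = dp_center_features(X, bound=10.0, sigma=1.0, rng=rng)
print("diameter proxy before:", np.linalg.norm(X, axis=1).max())
print("diameter proxy after: ", np.linalg.norm(Xc, axis=1).max())
```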
arXiv Detail & Related papers (2023-07-19T20:20:52Z) - Normalized/Clipped SGD with Perturbation for Differentially Private
Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z) - Automatic Clipping: Differentially Private Deep Learning Made Easier and
Stronger [39.93710312222771]
Per-example clipping is a key algorithmic step that enables practical differentially private (DP) training for deep learning models.
We propose an easy-to-use replacement, called automatic clipping, that eliminates the need to tune the clipping threshold R for any DP optimizers.
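A minimal sketch of an automatic-clipping step in this style: each per-sample gradient is scaled by 1/(||g|| + gamma), fixing the sensitivity at 1, so no clipping threshold R has to be tuned. The gamma and sigma values are illustrative.

```python
import numpy as np

def auto_clip_step(params, per_sample_grads, lr, gamma=0.01,
                   sigma=1.0, rng=None):
    # Scale each per-sample gradient by 1 / (||g|| + gamma): every
    # scaled gradient has norm below 1, so Gaussian noise can be
    # calibrated once and for all, with no threshold R to tune.
    rng = rng or np.random.default_rng()
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    scaled = per_sample_grads / (norms + gamma)
    B, d = per_sample_grads.shape
    noisy = scaled.mean(axis=0) + rng.normal(0.0, sigma / B, size=d)
    return params - lr * noisy

params = np.zeros(4)
grads = np.random.default_rng(7).normal(size=(8, 4))
print(auto_clip_step(params, grads, lr=0.1))
```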
arXiv Detail & Related papers (2022-06-14T19:49:44Z)