GReDP: A More Robust Approach for Differential Private Training with Gradient-Preserving Noise Reduction
- URL: http://arxiv.org/abs/2409.11663v2
- Date: Sat, 21 Sep 2024 04:06:00 GMT
- Title: GReDP: A More Robust Approach for Differential Private Training with Gradient-Preserving Noise Reduction
- Authors: Haodi Wang, Tangyu Jiang, Yu Guo, Chengjun Cai, Cong Wang, Xiaohua Jia,
- Abstract summary: We present a more robust approach for Differential Privacy training called GReDP.
We compute the model gradients in the frequency domain and adopt a new approach to reduce the noise level.
Unlike the previous work, our GReDP only requires half of the noise scale compared to DPSGD.
- Score: 19.683286866372832
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning models have been extensively adopted in various regions due to their ability to represent hierarchical features, which highly rely on the training set and procedures. Thus, protecting the training process and deep learning algorithms is paramount in privacy preservation. Although Differential Privacy (DP) as a powerful cryptographic primitive has achieved satisfying results in deep learning training, the existing schemes still fall short in preserving model utility, i.e., they either invoke a high noise scale or inevitably harm the original gradients. To address the above issues, in this paper, we present a more robust approach for DP training called GReDP. Specifically, we compute the model gradients in the frequency domain and adopt a new approach to reduce the noise level. Unlike the previous work, our GReDP only requires half of the noise scale compared to DPSGD [1] while keeping all the gradient information intact. We present a detailed analysis of our method both theoretically and empirically. The experimental results show that our GReDP works consistently better than the baselines on all models and training settings.
Related papers
- Rethinking Improved Privacy-Utility Trade-off with Pre-existing Knowledge for DP Training [31.559864332056648]
We propose a generic differential privacy framework with heterogeneous noise (DP-Hero)
Atop DP-Hero, we instantiate a heterogeneous version of DP-SGD, where the noise injected into gradient updates is heterogeneous and guided by prior-established model parameters.
We conduct comprehensive experiments to verify and explain the effectiveness of the proposed DP-Hero, showing improved training accuracy compared with state-of-the-art works.
arXiv Detail & Related papers (2024-09-05T08:40:54Z) - DOPPLER: Differentially Private Optimizers with Low-pass Filter for Privacy Noise Reduction [47.65999101635902]
Differentially private (DP) training prevents the leakage of sensitive information in the collected training data from trained machine learning models.
We develop a new component, called DOPPLER, which works by effectively amplifying the gradient while DP noise within this frequency domain.
Our experiments show that the proposed DPs with a lowpass filter outperform their counterparts without the filter by 3%-10% in test accuracy.
arXiv Detail & Related papers (2024-08-24T04:27:07Z) - CDSA: Conservative Denoising Score-based Algorithm for Offline Reinforcement Learning [25.071018803326254]
Distribution shift is a major obstacle in offline reinforcement learning.
Previous conservative offline RL algorithms struggle to generalize to unseen actions.
We propose to use the gradient fields of the dataset density generated from a pre-trained offline RL algorithm to adjust the original actions.
arXiv Detail & Related papers (2024-06-11T17:59:29Z) - Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z) - Sparsity-Preserving Differentially Private Training of Large Embedding
Models [67.29926605156788]
DP-SGD is a training algorithm that combines differential privacy with gradient descent.
Applying DP-SGD naively to embedding models can destroy gradient sparsity, leading to reduced training efficiency.
We present two new algorithms, DP-FEST and DP-AdaFEST, that preserve gradient sparsity during private training of large embedding models.
arXiv Detail & Related papers (2023-11-14T17:59:51Z) - DP-SGD with weight clipping [1.0878040851638]
We present a novel approach that mitigates the bias arising from traditional gradient clipping.
By leveraging a public upper bound of the Lipschitz value of the current model and its current location within the search domain, we can achieve refined noise level adjustments.
arXiv Detail & Related papers (2023-10-27T09:17:15Z) - Adap DP-FL: Differentially Private Federated Learning with Adaptive
Noise [30.005017338416327]
Federated learning seeks to address the issue of isolated data islands by making clients disclose only their local training models.
Recently, differential privacy has been applied to federated learning to protect data privacy, but the noise added may degrade the learning performance much.
We propose a differentially private scheme for federated learning with adaptive noise (Adap DP-FL)
arXiv Detail & Related papers (2022-11-29T03:20:40Z) - Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of dynamic programming (RDP) randomized for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation so can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z) - Large Language Models Can Be Strong Differentially Private Learners [70.0317718115406]
Differentially Private (DP) learning has seen limited success for building large deep learning models of text.
We show that this performance drop can be mitigated with the use of large pretrained models.
We propose a memory saving technique that allows clipping in DP-SGD to run without instantiating per-example gradients.
arXiv Detail & Related papers (2021-10-12T01:45:27Z) - Improving Deep Learning with Differential Privacy using Gradient
Encoding and Denoising [36.935465903971014]
In this paper, we aim at training deep learning models with differential privacy guarantees.
Our key technique is to encode gradients to map them to a smaller vector space.
We show that our mechanism outperforms the state-of-the-art DPSGD.
arXiv Detail & Related papers (2020-07-22T16:33:14Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.