DP-$λ$CGD: Efficient Noise Correlation for Differentially Private Model Training
- URL: http://arxiv.org/abs/2601.22334v1
- Date: Thu, 29 Jan 2026 21:21:34 GMT
- Title: DP-$λ$CGD: Efficient Noise Correlation for Differentially Private Model Training
- Authors: Nikita P. Kalinin, Ryan McKenna, Rasmus Pagh, Christoph H. Lampert
- Abstract summary: We propose a new noise correlation strategy that correlates noise only with the immediately preceding iteration and cancels a controlled portion of it. Our method relies on noise regeneration using a pseudorandom noise generator, eliminating the need to store past noise. We show that the computational overhead is minimal and empirically demonstrate improved accuracy over DP-SGD.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differentially private stochastic gradient descent (DP-SGD) is the gold standard for training machine learning models with formal differential privacy guarantees. Several recent extensions improve its accuracy by introducing correlated noise across training iterations. Matrix factorization mechanisms are a prominent example, but they correlate noise across many iterations and require storing previously added noise vectors, leading to substantial memory overhead in some settings. In this work, we propose a new noise correlation strategy that correlates noise only with the immediately preceding iteration and cancels a controlled portion of it. Our method relies on noise regeneration using a pseudorandom noise generator, eliminating the need to store past noise. As a result, it requires no additional memory beyond standard DP-SGD. We show that the computational overhead is minimal and empirically demonstrate improved accuracy over DP-SGD.
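The paper's exact mechanism is not given in the abstract, but its core idea, inject fresh noise at each step and cancel a controlled fraction of the previous step's noise, regenerating that noise from a seeded pseudorandom generator rather than storing it, can be sketched as follows. The function name and the parameters `sigma` and `lam` are illustrative placeholders, not the paper's notation:

```python
import numpy as np

def lambda_correlated_step(grad, step, seed, sigma=1.0, lam=0.5):
    """Release one noisy gradient with single-lag noise correlation.

    Fresh noise z_t is drawn from a PRNG keyed by (seed, step); a
    lam-fraction of the previous step's noise is cancelled by
    regenerating z_{t-1} from (seed, step - 1), so no past noise
    vectors ever need to be stored -- memory matches plain DP-SGD.
    """
    z_t = np.random.default_rng([seed, step]).normal(0.0, sigma, grad.shape)
    if step == 0:
        return grad + z_t
    # Regenerate last step's noise deterministically instead of storing it.
    z_prev = np.random.default_rng([seed, step - 1]).normal(0.0, sigma, grad.shape)
    return grad + z_t - lam * z_prev
```

Because the PRNG is keyed by `(seed, step)`, regenerating `z_prev` at step `t` reproduces exactly the `z_t` drawn at step `t - 1`, which is what makes the cancellation work without extra memory.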
Related papers
- Mitigating the Noise Shift for Denoising Generative Models via Noise Awareness Guidance [54.88271057438763]
Noise Awareness Guidance (NAG) is a correction method that explicitly steers sampling trajectories to remain consistent with the pre-defined noise schedule. NAG consistently mitigates noise shift and substantially improves the generation quality of mainstream diffusion models.
arXiv Detail & Related papers (2025-10-14T13:31:34Z) - Cocoon: A System Architecture for Differentially Private Training with Correlated Noises [18.01275303626406]
DP-SGD adds noise at each training iteration, which degrades the accuracy of the trained model. A new family of approaches adds carefully designed correlated noise, so that the noise cancels out across iterations. We propose Cocoon, a hardware-software co-designed framework for efficient training with correlated noise.
arXiv Detail & Related papers (2025-10-08T17:56:30Z) - Correlating Cross-Iteration Noise for DP-SGD using Model Curvature [15.566302602746843]
There is currently a large accuracy gap between DP-SGD and normal SGD training. One such line of work, known as DP-MF, correlates the privacy noise across different iterations of gradient descent. We propose a technique called NoiseCurve that uses model curvature, estimated from public unlabeled data, to improve the quality of this noise correlation.
arXiv Detail & Related papers (2025-10-06T22:13:02Z) - Mixture of Noise for Pre-Trained Model-Based Class-Incremental Learning [59.635264288605946]
Class Incremental Learning (CIL) aims to continuously learn new categories while retaining the knowledge of old ones. Existing approaches that apply lightweight fine-tuning to backbones still induce drift. We propose Mixture of Noise (Min) to mitigate the degradation of backbone generalization caused by adapting to new tasks.
arXiv Detail & Related papers (2025-09-20T16:07:20Z) - Implicit Bias in Noisy-SGD: With Applications to Differentially Private Training [9.618473763561418]
Training Deep Neural Networks (DNNs) with small batches using Stochastic Gradient Descent (SGD) yields superior test performance compared to larger batches.
DP-SGD, used to ensure differential privacy (DP) in DNNs' training, adds Gaussian noise to the clipped gradients.
Surprisingly, large-batch training still results in a significant decrease in performance, which poses an important challenge because strong DP guarantees necessitate the use of massive batches.
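The DP-SGD mechanism referenced by this entry (and by the main paper) is standard: clip each per-example gradient, average, and add Gaussian noise. A minimal sketch, with placeholder hyperparameters:

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr=0.1, clip=1.0, sigma=1.0, rng=None):
    """One DP-SGD update: clip each example's gradient to L2 norm `clip`,
    average the clipped gradients, then add Gaussian noise whose standard
    deviation scales with the clipping norm. Hyperparameter values here
    are illustrative, not a calibrated privacy setting.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    n = len(per_example_grads)
    # Per-example clipping bounds each example's influence (sensitivity).
    clipped = [g * min(1.0, clip / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    noisy_grad = np.mean(clipped, axis=0) + rng.normal(0.0, sigma * clip / n, params.shape)
    return params - lr * noisy_grad
```

The noise standard deviation shrinks as the batch grows (`sigma * clip / n`), which is why strong DP guarantees push toward massive batches, the tension this entry highlights.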
arXiv Detail & Related papers (2024-02-13T10:19:33Z) - Negative Pre-aware for Noisy Cross-modal Matching [46.5591267410225]
Cross-modal noise-robust learning is a challenging task since noisy correspondence is hard to recognize and rectify.
We present a novel Negative Pre-aware Cross-modal matching solution for large visual-language model fine-tuning on noisy downstream tasks.
arXiv Detail & Related papers (2023-12-10T05:52:36Z) - Amplitude-Varying Perturbation for Balancing Privacy and Utility in Federated Learning [86.08285033925597]
This paper presents a new DP perturbation mechanism with a time-varying noise amplitude to protect the privacy of federated learning.
We derive an online refinement of the series to prevent FL from converging prematurely due to excessive perturbation noise.
The contribution of the new DP mechanism to the convergence and accuracy of privacy-preserving FL is corroborated, compared to the state-of-the-art Gaussian noise mechanism with a persistent noise amplitude.
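The idea of a time-varying (rather than persistent) noise amplitude can be sketched as below. The paper derives its own amplitude series; the exponential decay and its parameters here are purely placeholder assumptions:

```python
import numpy as np

def decaying_sigma(round_t, sigma0=1.0, decay=0.1):
    """Hypothetical time-varying noise amplitude: larger in early rounds,
    smaller later. A stand-in for the paper's derived series."""
    return sigma0 * np.exp(-decay * round_t)

def perturb_update(update, round_t, rng):
    """Perturb a client's model update with round-dependent Gaussian noise,
    unlike a persistent-amplitude Gaussian mechanism."""
    return update + rng.normal(0.0, decaying_sigma(round_t), update.shape)
```

Shrinking the amplitude over rounds is one way to avoid the premature convergence that a persistently large noise level can cause, while still spending more of the privacy budget where it matters.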
arXiv Detail & Related papers (2023-03-07T22:52:40Z) - Latent Class-Conditional Noise Model [54.56899309997246]
We introduce a Latent Class-Conditional Noise model (LCCN) to parameterize the noise transition under a Bayesian framework.
We then deduce a dynamic label regression method for LCCN, whose Gibbs sampler allows us to efficiently infer the latent true labels.
Our approach safeguards the stable update of the noise transition, avoiding the arbitrary tuning from a mini-batch of samples required by previous methods.
arXiv Detail & Related papers (2023-02-19T15:24:37Z) - Normalized/Clipped SGD with Perturbation for Differentially Private Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z) - Adaptive Noisy Data Augmentation for Regularized Estimation and Inference in Generalized Linear Models [15.817569026827451]
We propose the AdaPtive Noise Augmentation (PANDA) procedure to regularize the estimation and inference of generalized linear models (GLMs).
We demonstrate the superior or similar performance of PANDA against the existing approaches of the same type of regularizers in simulated and real-life data.
arXiv Detail & Related papers (2022-04-18T22:02:37Z) - Shape Matters: Understanding the Implicit Bias of the Noise Covariance [76.54300276636982]
Noise in gradient descent provides a crucial implicit regularization effect for training overparameterized models.
We show that parameter-dependent noise -- induced by mini-batches or label perturbation -- is far more effective than Gaussian noise.
Our analysis reveals that parameter-dependent noise introduces a bias towards local minima with smaller noise variance, whereas spherical Gaussian noise does not.
arXiv Detail & Related papers (2020-06-15T18:31:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.