Gradient Descent with Linearly Correlated Noise: Theory and Applications
to Differential Privacy
- URL: http://arxiv.org/abs/2302.01463v3
- Date: Mon, 15 Jan 2024 17:27:40 GMT
- Authors: Anastasia Koloskova, Ryan McKenna, Zachary Charles, Keith Rush,
Brendan McMahan
- Abstract summary: We study gradient descent under linearly correlated noise.
We use our results to develop new, effective matrix factorizations for differentially private optimization.
- Score: 17.81999485513265
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study gradient descent under linearly correlated noise. Our work is
motivated by recent practical methods for optimization with differential
privacy (DP), such as DP-FTRL, which achieve strong performance in settings
where privacy amplification techniques are infeasible (such as in federated
learning). These methods inject privacy noise through a matrix factorization
mechanism, making the noise linearly correlated over iterations. We propose a
simplified setting that distills key facets of these methods and isolates the
impact of linearly correlated noise. We analyze the behavior of gradient
descent in this setting, for both convex and non-convex functions. Our analysis
is demonstrably tighter than prior work and recovers multiple important special
cases exactly (including anticorrelated perturbed gradient descent). We use our
results to develop new, effective matrix factorizations for differentially
private optimization, and highlight the benefits of these factorizations
theoretically and empirically.
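The setting the abstract describes can be illustrated with a minimal sketch: gradient descent where the noise injected at step t is a linear combination of i.i.d. Gaussian draws, determined by a factor matrix C. This is an illustrative simplification, not the paper's exact formulation; the function and variable names are invented for the example.

```python
import numpy as np

def gd_with_correlated_noise(grad, w0, lr, C, sigma, seed=0):
    """Run T steps of gradient descent where the step-t noise is the
    t-th row of C @ Xi, with Xi i.i.d. Gaussian -- so the noise is
    linearly correlated across iterations via the factor matrix C."""
    T = C.shape[0]
    rng = np.random.default_rng(seed)
    xi = rng.standard_normal((T, w0.size))  # independent base noise
    noise = sigma * (C @ xi)                # correlate over iterations
    w = w0.astype(float).copy()
    for t in range(T):
        w = w - lr * (grad(w) + noise[t])
    return w

T = 50
# C = I gives independent (DP-SGD-style) noise; the anticorrelated
# special case z_t = xi_t - xi_{t-1} corresponds to C = I minus a shift:
C_anti = np.eye(T) - np.eye(T, k=-1)
w = gd_with_correlated_noise(lambda w: w, np.ones(3), 0.1, C_anti, sigma=0.01)
```

Different choices of C correspond to different matrix factorization mechanisms; with sigma = 0 the sketch reduces to plain gradient descent.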
Related papers
- Privacy without Noisy Gradients: Slicing Mechanism for Generative Model Training [10.229653770070202]
Training generative models with differential privacy (DP) typically involves injecting noise into gradient updates or adapting the discriminator's training procedure.
We consider the slicing privacy mechanism that injects noise into random low-dimensional projections of the private data.
We present a kernel-based estimator for this divergence, circumventing the need for adversarial training.
arXiv Detail & Related papers (2024-10-25T19:32:58Z)
- Gradient Normalization with(out) Clipping Ensures Convergence of Nonconvex SGD under Heavy-Tailed Noise with Improved Results [60.92029979853314]
This paper investigates gradient normalization without clipping (NSGDC) and its variance reduction variant (NSGDC-VR).
We present significant improvements in the theoretical results for both algorithms.
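A minimal sketch of the core mechanism behind normalized SGD (omitting the clipping and variance-reduction machinery the paper analyzes): normalizing the gradient bounds every update by the step size, which is what keeps heavy-tailed gradient noise from producing unbounded steps. Names and setup are illustrative.

```python
import numpy as np

def normalized_sgd(grad, w0, lr, steps, eps=1e-12):
    """Normalized (S)GD: each update has Euclidean norm at most lr,
    no matter how large or heavy-tailed the gradient is."""
    w = w0.astype(float).copy()
    for _ in range(steps):
        g = grad(w)
        w = w - lr * g / (np.linalg.norm(g) + eps)
    return w

# On f(w) = 0.5 * ||w||^2 the iterate moves toward 0 by lr per step.
w = normalized_sgd(lambda w: w, np.array([10.0, -10.0]), lr=0.1, steps=200)
```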
arXiv Detail & Related papers (2024-10-21T22:40:42Z)
- Correlated Noise Provably Beats Independent Noise for Differentially Private Learning [25.81442865194914]
Differentially private learning algorithms inject noise into the learning process.
We show how correlated noise provably improves upon vanilla-SGD as a function of problem parameters.
arXiv Detail & Related papers (2023-10-10T16:48:18Z)
- Online Sensitivity Optimization in Differentially Private Learning [8.12606646175019]
We present a novel approach to dynamically optimize the clipping threshold.
We treat this threshold as an additional learnable parameter, establishing a clean relationship between the threshold and the cost function.
Our method is thoroughly assessed against alternative fixed and adaptive strategies across diverse datasets, tasks, model dimensions, and privacy levels.
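The summary's specific cost function is not reproduced here; the following is a hypothetical sketch of the general idea only: standard DP-SGD clipping, plus a simple signed update that treats the clipping threshold as a learnable quantity. The `update_threshold` rule is an invented stand-in, not the paper's method.

```python
import numpy as np

def clip_to_norm(g, c):
    """Standard DP-SGD clipping: rescale g so its norm is at most c."""
    n = np.linalg.norm(g)
    return g if n <= c else g * (c / n)

def update_threshold(c, grad_norms, lr=0.05):
    """Hypothetical signed update: raise c when most gradients are being
    clipped, lower it otherwise (a stand-in for a learned rule)."""
    frac_clipped = np.mean(np.asarray(grad_norms) > c)
    return c + lr * (1.0 if frac_clipped > 0.5 else -1.0)

c = update_threshold(1.0, [0.2, 0.4, 3.0])  # mostly unclipped, so c shrinks
```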
arXiv Detail & Related papers (2023-10-02T00:30:49Z)
- Learning Curves for Noisy Heterogeneous Feature-Subsampled Ridge Ensembles [34.32021888691789]
We develop a theory of feature-bagging in noisy least-squares ridge ensembles.
We demonstrate that subsampling shifts the double-descent peak of a linear predictor.
We compare the performance of a feature-subsampling ensemble to a single linear predictor.
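The feature-bagging setup can be sketched as an ensemble of ridge regressors, each fit on a random subset of the features, with predictions averaged. This is an illustrative construction under assumed names and shapes, not the paper's exact experimental setup.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution w = (X^T X + lam I)^{-1} X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def feature_bagged_ridge(X, y, X_test, k, members, lam, seed=0):
    """Average the predictions of ridge regressors, each trained on a
    random subset of k features."""
    rng = np.random.default_rng(seed)
    preds = np.zeros(X_test.shape[0])
    for _ in range(members):
        idx = rng.choice(X.shape[1], size=k, replace=False)
        w = ridge_fit(X[:, idx], y, lam)
        preds += X_test[:, idx] @ w
    return preds / members

rng = np.random.default_rng(1)
X = rng.standard_normal((40, 10))
y = X @ rng.standard_normal(10) + 0.1 * rng.standard_normal(40)
p = feature_bagged_ridge(X, y, X[:5], k=4, members=20, lam=1.0)
```

Setting `members=1` and `k` equal to the full feature count recovers the single ridge predictor used as the comparison baseline.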
arXiv Detail & Related papers (2023-07-06T17:56:06Z)
- Differentially Private Stochastic Gradient Descent with Low-Noise [49.981789906200035]
Modern machine learning algorithms aim to extract fine-grained information from data to provide accurate predictions, which often conflicts with the goal of privacy protection.
This paper addresses the practical and theoretical importance of developing privacy-preserving machine learning algorithms that ensure good performance while preserving privacy.
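For reference, the standard DP-SGD step that this line of work builds on: clip each per-example gradient, average, then add Gaussian noise calibrated to the clipping norm. The mechanism is standard, but the function and argument names here are illustrative.

```python
import numpy as np

def dp_sgd_step(w, per_example_grads, lr, clip_norm, noise_mult, seed=0):
    """One DP-SGD step: clip each per-example gradient to clip_norm,
    average, and add Gaussian noise scaled to the clipping norm."""
    rng = np.random.default_rng(seed)
    B = len(per_example_grads)
    clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    avg = np.mean(clipped, axis=0)
    noise = rng.standard_normal(w.shape) * (noise_mult * clip_norm / B)
    return w - lr * (avg + noise)

w = np.zeros(2)
grads = [np.array([3.0, 4.0]), np.array([0.3, 0.4])]
w1 = dp_sgd_step(w, grads, lr=1.0, clip_norm=1.0, noise_mult=0.0)
```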
arXiv Detail & Related papers (2022-09-09T08:54:13Z)
- Partial Identification with Noisy Covariates: A Robust Optimization Approach [94.10051154390237]
Causal inference from observational datasets often relies on measuring and adjusting for covariates.
We show that this robust optimization approach can extend a wide range of causal adjustment methods to perform partial identification.
Across synthetic and real datasets, we find that this approach provides ATE bounds with a higher coverage probability than existing methods.
arXiv Detail & Related papers (2022-02-22T04:24:26Z)
- Adaptive Differentially Private Empirical Risk Minimization [95.04948014513226]
We propose an adaptive (stochastic) gradient perturbation method for differentially private empirical risk minimization.
We prove that the ADP method considerably improves the utility guarantee compared to the standard differentially private method in which vanilla random noise is added.
arXiv Detail & Related papers (2021-10-14T15:02:20Z)
- An automatic differentiation system for the age of differential privacy [65.35244647521989]
We introduce Tritium, an automatic differentiation-based sensitivity analysis framework for differentially private (DP) machine learning (ML).
arXiv Detail & Related papers (2021-09-22T08:07:42Z)
- Differentiable Annealed Importance Sampling and the Perils of Gradient Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z) - Beyond variance reduction: Understanding the true impact of baselines on
policy optimization [24.09670734037029]
We show that learning dynamics are governed by the curvature of the loss function and the noise of the gradient estimates.
We present theoretical results showing that, at least for bandit problems, curvature and noise are not sufficient to explain the learning dynamics.
arXiv Detail & Related papers (2020-08-31T17:52:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.