The Last Iterate Advantage: Empirical Auditing and Principled Heuristic Analysis of Differentially Private SGD
- URL: http://arxiv.org/abs/2410.06186v2
- Date: Thu, 10 Oct 2024 17:06:10 GMT
- Title: The Last Iterate Advantage: Empirical Auditing and Principled Heuristic Analysis of Differentially Private SGD
- Authors: Thomas Steinke, Milad Nasr, Arun Ganesh, Borja Balle, Christopher A. Choquette-Choo, Matthew Jagielski, Jamie Hayes, Abhradeep Guha Thakurta, Adam Smith, Andreas Terzis
- Abstract summary: We propose a simple heuristic privacy analysis of noisy clipped stochastic gradient descent (DP-SGD).
We show experimentally that our heuristic is predictive of the outcome of privacy auditing applied to various training procedures.
We also empirically support our heuristic and show that existing privacy auditing attacks are bounded by our analysis in both vision and language tasks.
- Score: 46.71175773861434
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a simple heuristic privacy analysis of noisy clipped stochastic gradient descent (DP-SGD) in the setting where only the last iterate is released and the intermediate iterates remain hidden. Namely, our heuristic assumes a linear structure for the model. We show experimentally that our heuristic is predictive of the outcome of privacy auditing applied to various training procedures. Thus it can be used prior to training as a rough estimate of the final privacy leakage. We also probe the limitations of our heuristic by providing some artificial counterexamples where it underestimates the privacy leakage. The standard composition-based privacy analysis of DP-SGD effectively assumes that the adversary has access to all intermediate iterates, which is often unrealistic. However, this analysis remains the state of the art in practice. While our heuristic does not replace a rigorous privacy analysis, it illustrates the large gap between the best theoretical upper bounds and the privacy auditing lower bounds and sets a target for further work to improve the theoretical privacy analyses. We also empirically support our heuristic and show that existing privacy auditing attacks are bounded by our heuristic analysis in both vision and language tasks.
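To make the heuristic concrete, here is a minimal sketch (our own illustration, not the authors' code) of the computation it suggests: under the linearity assumption, the T noisy clipped updates, seen only through the last iterate, collapse into a single Gaussian mechanism whose effective noise multiplier is sigma * sqrt(T) divided by the number of steps in which the differing example participates. The conversion from noise multiplier to (epsilon, delta) below uses the analytic Gaussian mechanism of Balle & Wang (2018); the paper's exact heuristic (e.g., its handling of subsampling) may differ.

```python
# Sketch of the linear-model heuristic (assumption-laden illustration):
# the last iterate of T steps of noisy clipped GD on a linear loss is
#   w_0 - eta * (sum_t clip(g_t) + sum_t N(0, sigma^2 C^2 I)),
# i.e., one Gaussian mechanism. A canary participating in k steps shifts
# the sum by at most k*C, while the summed noise has std sigma*C*sqrt(T),
# giving an effective noise multiplier of sigma*sqrt(T)/k.
import math
from scipy.stats import norm

def gaussian_delta(eps: float, noise_multiplier: float) -> float:
    """delta(eps) for the Gaussian mechanism with sensitivity 1 and noise
    std `noise_multiplier` (analytic form of Balle & Wang, 2018)."""
    s = noise_multiplier
    # Second term computed in log-space to avoid overflow for large eps.
    return (norm.cdf(1.0 / (2.0 * s) - eps * s)
            - math.exp(eps + norm.logcdf(-1.0 / (2.0 * s) - eps * s)))

def heuristic_epsilon(sigma: float, steps: int, participations: int,
                      delta: float = 1e-5) -> float:
    """Heuristic epsilon when only the last iterate is released."""
    eff = sigma * math.sqrt(steps) / participations
    # Binary-search the smallest eps achieving the target delta.
    lo, hi = 0.0, 1.0
    while gaussian_delta(hi, eff) > delta:
        hi *= 2.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if gaussian_delta(mid, eff) <= delta:
            hi = mid
        else:
            lo = mid
    return hi

# Example: sigma = 1, T = 1000 steps, canary in k = 10 of them.
# Composing the 10 participations behaves like one Gaussian with noise
# multiplier 1/sqrt(10) ~ 0.32; the last-iterate heuristic instead gives
# sqrt(1000)/10 ~ 3.16, i.e., a much smaller epsilon.
print(heuristic_epsilon(sigma=1.0, steps=1000, participations=10))
```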
Related papers
- Auditing $f$-Differential Privacy in One Run [43.34594422920125]
Empirical auditing has emerged as a means of catching some of the flaws in the implementation of privacy-preserving algorithms.
We present a tight and efficient auditing procedure and analysis that can effectively assess the privacy of mechanisms.
arXiv Detail & Related papers (2024-10-29T17:02:22Z)
- Convergent Differential Privacy Analysis for General Federated Learning: the $f$-DP Perspective [57.35402286842029]
Federated learning (FL) is an efficient collaborative training paradigm with a focus on local privacy.
Differential privacy (DP) is a classical approach to capturing and ensuring the reliability of privacy protections.
arXiv Detail & Related papers (2024-08-28T08:22:21Z)
- Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model [40.4617658114104]
In this work, we focus on a threat model where the adversary has access only to the final model, with no visibility into intermediate updates.
Our experiments show that this approach consistently outperforms previous attempts at auditing the hidden state model.
Our results advance the understanding of achievable privacy guarantees within this threat model.
arXiv Detail & Related papers (2024-05-23T11:38:38Z)
- Initialization Matters: Privacy-Utility Analysis of Overparameterized Neural Networks [72.51255282371805]
We prove a privacy bound for the KL divergence between model distributions on worst-case neighboring datasets.
We find that this KL privacy bound is largely determined by the expected squared gradient norm relative to model parameters during training.
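As a hypothetical illustration of the quantity that bound depends on, the sketch below (our own, in PyTorch; the paper's setup may differ) estimates the expected squared gradient norm over a data loader. The names `model`, `loss_fn`, and `loader` are assumed to exist and are not from the paper.

```python
# Hypothetical sketch: estimate the expected squared gradient norm that,
# per the summary above, largely determines the KL privacy bound.
import torch

def mean_squared_grad_norm(model: torch.nn.Module, loss_fn, loader) -> float:
    total, batches = 0.0, 0
    for x, y in loader:
        model.zero_grad()
        loss_fn(model(x), y).backward()
        # Sum of squared entries across all parameter gradients.
        sq_norm = sum(float((p.grad ** 2).sum())
                      for p in model.parameters() if p.grad is not None)
        total += sq_norm
        batches += 1
    return total / max(batches, 1)
```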
arXiv Detail & Related papers (2023-10-31T16:13:22Z)
- Tight Auditing of Differentially Private Machine Learning [77.38590306275877]
For private machine learning, existing auditing mechanisms are tight, but only under implausible worst-case assumptions (e.g., an adversarially crafted dataset).
We design an improved auditing scheme that yields tight privacy estimates for natural (not adversarially crafted) datasets.
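For context, a standard recipe (illustrative here, not necessarily this paper's scheme) converts a membership-inference attack's error rates into an empirical epsilon lower bound: bound the false-positive and false-negative rates with one-sided Clopper-Pearson intervals, then invert the (epsilon, delta)-DP hypothesis-testing inequality. A minimal sketch:

```python
# Generic auditing conversion (our own sketch, not this paper's scheme):
# turn attack outcomes into a high-confidence lower bound on epsilon.
import math
from scipy.stats import beta

def clopper_pearson_upper(successes: int, trials: int, alpha: float = 0.05) -> float:
    """One-sided upper confidence bound on a binomial rate."""
    if successes >= trials:
        return 1.0
    return beta.ppf(1 - alpha, successes + 1, trials - successes)

def empirical_epsilon_lower_bound(fp, n_out, fn, n_in,
                                  delta=1e-5, alpha=0.05):
    """Lower-bound epsilon from attack false-positive/false-negative counts.

    Any (eps, delta)-DP mechanism forces FPR + e^eps * FNR >= 1 - delta
    (and symmetrically), so high-confidence upper bounds on FPR and FNR
    yield a lower bound on eps.
    """
    fpr = clopper_pearson_upper(fp, n_out, alpha)
    fnr = clopper_pearson_upper(fn, n_in, alpha)
    if fpr + fnr >= 1 - delta:
        return 0.0  # attack too weak to certify any leakage
    return max(math.log((1 - delta - fpr) / fnr),
               math.log((1 - delta - fnr) / fpr))

# Example: 1000 trials each way; the attack flags 60 non-members (false
# positives) and misses 50 members (false negatives).
print(empirical_epsilon_lower_bound(fp=60, n_out=1000, fn=50, n_in=1000))
```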
arXiv Detail & Related papers (2023-02-15T21:40:33Z)
- On the Statistical Complexity of Estimation and Testing under Privacy Constraints [17.04261371990489]
We show how to characterize the power of a statistical test under differential privacy in a plug-and-play fashion.
We show that maintaining privacy results in a noticeable reduction in performance only when the level of privacy protection is very high.
Finally, we demonstrate that the DP-SGLD algorithm, a private convex solver, can be employed for maximum likelihood estimation with a high degree of confidence.
arXiv Detail & Related papers (2022-10-05T12:55:53Z)
- Is Vertical Logistic Regression Privacy-Preserving? A Comprehensive Privacy Analysis and Beyond [57.10914865054868]
We consider vertical logistic regression (VLR) trained with mini-batch gradient descent.
We provide a comprehensive and rigorous privacy analysis of VLR in a class of open-source Federated Learning frameworks.
arXiv Detail & Related papers (2022-07-19T05:47:30Z)
- Smoothed Differential Privacy [55.415581832037084]
Differential privacy (DP) is a widely accepted and widely applied notion of privacy based on worst-case analysis.
In this paper, we propose a natural extension of DP following the worst average-case idea behind the celebrated smoothed analysis.
We prove that any discrete mechanism with sampling procedures is more private than what DP predicts, while many continuous mechanisms with sampling procedures are still non-private under smoothed DP.
arXiv Detail & Related papers (2021-07-04T06:55:45Z)
- Differential Privacy Dynamics of Langevin Diffusion and Noisy Gradient Descent [10.409652277630132]
We model the dynamics of privacy loss in Langevin diffusion and extend it to the noisy gradient descent algorithm.
We prove that the privacy loss converges exponentially fast.
arXiv Detail & Related papers (2021-02-11T05:49:37Z)