Differentially Private Bootstrap: New Privacy Analysis and Inference
Strategies
- URL: http://arxiv.org/abs/2210.06140v2
- Date: Fri, 21 Apr 2023 13:12:44 GMT
- Title: Differentially Private Bootstrap: New Privacy Analysis and Inference
Strategies
- Authors: Zhanyu Wang, Guang Cheng, Jordan Awan
- Abstract summary: Differentially private (DP) mechanisms protect individual-level information by introducing randomness into the statistical analysis procedure.
We examine a DP bootstrap procedure that releases multiple private bootstrap estimates to infer the sampling distribution and construct confidence intervals (CIs).
Our privacy analysis presents new results on the privacy cost of a single DP bootstrap estimate, applicable to any DP mechanism, and identifies some misapplications of the bootstrap in the existing literature.
- Score: 28.95350475681164
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differentially private (DP) mechanisms protect individual-level information
by introducing randomness into the statistical analysis procedure. Despite the
availability of numerous DP tools, there remains a lack of general techniques
for conducting statistical inference under DP. We examine a DP bootstrap
procedure that releases multiple private bootstrap estimates to infer the
sampling distribution and construct confidence intervals (CIs). Our privacy
analysis presents new results on the privacy cost of a single DP bootstrap
estimate, applicable to any DP mechanism, and identifies some misapplications
of the bootstrap in the existing literature. Using the Gaussian-DP (GDP)
framework (Dong et al., 2022), we show that the release of $B$ DP bootstrap
estimates from mechanisms satisfying $(\mu/\sqrt{(2-2/\mathrm{e})B})$-GDP
asymptotically satisfies $\mu$-GDP as $B$ goes to infinity. Moreover, we use
deconvolution with the DP bootstrap estimates to accurately infer the sampling
distribution, which is novel in DP. We derive CIs from our density estimate for
tasks such as population mean estimation, logistic regression, and quantile
regression, and we compare them to existing methods using simulations and
real-world experiments on 2016 Canada Census data. Our private CIs achieve the
nominal coverage level and offer the first approach to private inference for
quantile regression.
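As a rough illustration of the procedure the abstract describes, the sketch below releases $B$ private bootstrap means, calibrating each release to the per-estimate budget $\mu/\sqrt{(2-2/\mathrm{e})B}$ stated above. It assumes the statistic is a mean of data clipped to a user-supplied range [lo, hi] and privatized with the Gaussian mechanism; the variance-subtraction CI is a crude stand-in for the paper's full density deconvolution, and all function names and parameters are illustrative, not the authors' code.
```python
import numpy as np
from scipy import stats

def dp_bootstrap_means(data, B, mu_total, lo, hi, seed=None):
    """Release B private bootstrap means; each release is
    (mu_total / sqrt((2 - 2/e) * B))-GDP, so by the paper's composition
    result the full release is approximately mu_total-GDP for large B."""
    rng = np.random.default_rng(seed)
    x = np.clip(np.asarray(data, dtype=float), lo, hi)
    n = len(x)
    mu_0 = mu_total / np.sqrt((2 - 2 / np.e) * B)  # per-estimate GDP budget
    sens = (hi - lo) / n                           # replace-one sensitivity of the mean
    sigma = sens / mu_0                            # N(0, sigma^2) noise gives mu_0-GDP
    out = np.empty(B)
    for b in range(B):
        resample = rng.choice(x, size=n, replace=True)     # nonparametric bootstrap
        out[b] = resample.mean() + rng.normal(0.0, sigma)  # privatized estimate
    return out, sigma

def deconvolved_normal_ci(noisy_means, sigma, level=0.95):
    """Normal-approximation CI that subtracts the known noise variance:
    a simplification of the paper's deconvolution-based inference."""
    center = noisy_means.mean()
    var_boot = max(noisy_means.var(ddof=1) - sigma**2, 0.0)  # deconvolve the noise
    half = stats.norm.ppf(0.5 + level / 2) * np.sqrt(var_boot)
    return center - half, center + half
```
For example, dp_bootstrap_means(x, B=1000, mu_total=1.0, lo=0.0, hi=1.0) releases 1000 noisy bootstrap means whose joint release is approximately 1-GDP for large B under the paper's analysis.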
Related papers
- Private Mean Estimation with Person-Level Differential Privacy [6.621676316292624]
We study person-level differentially private mean estimation in the case where each person holds multiple samples.
We give computationally efficient algorithms under approximate-DP and computationally inefficient algorithms under pure DP, and our nearly matching lower bounds hold for the most permissive case of approximate DP.
arXiv Detail & Related papers (2024-05-30T18:20:35Z) - How Private are DP-SGD Implementations? [61.19794019914523]
We show that there can be a substantial gap between the privacy analyses obtained under the two types of batch sampling (shuffling versus Poisson subsampling).
arXiv Detail & Related papers (2024-03-26T13:02:43Z) - Resampling methods for private statistical inference [1.8110941972682346]
We consider the task of constructing confidence intervals with differential privacy.
We propose two private variants of the non-parametric bootstrap, which privately compute the median of the results of multiple "little" bootstraps run on partitions of the data.
For a fixed differential privacy parameter $\epsilon$, our methods enjoy the same error rates as the non-private bootstrap, up to logarithmic factors in the sample size $n$ (see the code sketch following this list).
arXiv Detail & Related papers (2024-02-11T08:59:02Z) - Differentially Private Statistical Inference through $\beta$-Divergence
One Posterior Sampling [2.8544822698499255]
We propose a posterior sampling scheme from a generalised posterior targeting the minimisation of the $\beta$-divergence between the model and the data generating process.
This provides private estimation that is generally applicable without requiring changes to the underlying model.
We show that $\beta$D-Bayes produces more precise inference for the same privacy guarantees.
arXiv Detail & Related papers (2023-07-11T12:00:15Z) - Recycling Scraps: Improving Private Learning by Leveraging Intermediate
Checkpoints [17.654346227497403]
This work explores various methods that aggregate intermediate checkpoints to improve the utility of DP training.
We show that checkpoint aggregations provide significant gains in the prediction accuracy over the existing SOTA for CIFAR10 and StackOverflow datasets.
Finally, we show that the sample variance computed from the last few checkpoints provides a good approximation of the variance of the final model of a DP run.
arXiv Detail & Related papers (2022-10-04T19:21:00Z) - Differentially Private Estimation via Statistical Depth [0.0]
Two notions of statistical depth are used to motivate new approximate DP location and regression estimators.
To avoid requiring that users specify a priori bounds on the estimates and/or the observations, variants of these DP mechanisms are described.
arXiv Detail & Related papers (2022-07-26T01:59:07Z) - Normalized/Clipped SGD with Perturbation for Differentially Private
Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z) - Optimal Membership Inference Bounds for Adaptive Composition of Sampled
Gaussian Mechanisms [93.44378960676897]
Given a trained model and a data sample, membership-inference (MI) attacks predict whether the sample was in the model's training set.
A common countermeasure against MI attacks is to utilize differential privacy (DP) during model training to mask the presence of individual examples.
In this paper, we derive bounds for the advantage of an adversary mounting an MI attack, and demonstrate tightness for the widely-used Gaussian mechanism.
arXiv Detail & Related papers (2022-04-12T22:36:56Z) - Nonparametric extensions of randomized response for private confidence sets [51.75485869914048]
This work derives methods for performing nonparametric, nonasymptotic statistical inference for population means under the constraint of local differential privacy (LDP).
We present confidence intervals (CIs) and time-uniform confidence sequences (CSs) for $\mu^\star$ when only given access to the privatized data.
arXiv Detail & Related papers (2022-02-17T16:04:49Z) - On the Practicality of Differential Privacy in Federated Learning by
Tuning Iteration Times [51.61278695776151]
Federated Learning (FL) is well known for the privacy protection it provides when machine learning models are trained collaboratively among distributed clients.
Recent studies have pointed out that naive FL is susceptible to gradient leakage attacks.
Differential Privacy (DP) emerges as a promising countermeasure to defend against gradient leakage attacks.
arXiv Detail & Related papers (2021-01-11T19:43:12Z) - Private Stochastic Non-Convex Optimization: Adaptive Algorithms and
Tighter Generalization Bounds [72.63031036770425]
We propose differentially private (DP) algorithms for stochastic non-convex optimization.
We demonstrate, on two popular deep learning tasks, the empirical advantages of our algorithms over standard gradient methods.
arXiv Detail & Related papers (2020-06-24T06:01:24Z)
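As flagged in the resampling-methods entry above, here is a minimal sketch of the median-of-little-bootstraps idea: partition the data, compute one bootstrap estimate per partition, and release an $\epsilon$-DP median of those estimates. The partitioning scheme, the per-partition statistic (a resampled mean), and the grid-based exponential-mechanism median are illustrative assumptions; the paper's actual variants and error analysis differ.
```python
import numpy as np

def private_little_bootstrap_median(data, k, epsilon, lo, hi, n_grid=512, seed=None):
    """Split the data into k disjoint partitions, compute one 'little' bootstrap
    estimate (here: a resampled mean) per partition, then release an
    epsilon-DP median of the k estimates via the exponential mechanism."""
    rng = np.random.default_rng(seed)
    x = np.clip(np.asarray(data, dtype=float), lo, hi)
    parts = np.array_split(rng.permutation(x), k)
    ests = np.sort([rng.choice(p, size=len(p), replace=True).mean() for p in parts])
    # Changing one record alters at most one partition's estimate, so the
    # rank-based utility below has sensitivity 1.
    grid = np.linspace(lo, hi, n_grid)
    utility = -np.abs(np.searchsorted(ests, grid) - k / 2)
    weights = np.exp(epsilon * utility / 2)  # exponential-mechanism weights
    return rng.choice(grid, p=weights / weights.sum())
```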
This list is automatically generated from the titles and abstracts of the papers on this site.