Differentially Private Sampling from Distributions
- URL: http://arxiv.org/abs/2211.08193v1
- Date: Tue, 15 Nov 2022 14:56:42 GMT
- Title: Differentially Private Sampling from Distributions
- Authors: Sofya Raskhodnikova, Satchit Sivakumar, Adam Smith, Marika Swanberg
- Abstract summary: In some regimes, private sampling requires fewer observations than learning a description of $P$ nonprivately.
For some classes of distributions, the overhead in the number of observations needed for private learning compared to non-private learning is completely captured by the number of observations needed for private sampling.
- Score: 1.452875650827562
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We initiate an investigation of private sampling from distributions. Given a
dataset with $n$ independent observations from an unknown distribution $P$, a
sampling algorithm must output a single observation from a distribution that is
close in total variation distance to $P$ while satisfying differential privacy.
Sampling abstracts the goal of generating small amounts of realistic-looking
data. We provide tight upper and lower bounds for the dataset size needed for
this task for three natural families of distributions: arbitrary distributions
on $\{1,\ldots ,k\}$, arbitrary product distributions on $\{0,1\}^d$, and
product distributions on $\{0,1\}^d$ with bias in each coordinate bounded away
from 0 and 1. We demonstrate that, in some parameter regimes, private sampling
requires asymptotically fewer observations than learning a description of $P$
nonprivately; in other regimes, however, private sampling proves to be as
difficult as private learning. Notably, for some classes of distributions, the
overhead in the number of observations needed for private learning compared to
non-private learning is completely captured by the number of observations
needed for private sampling.
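To make the sampling task concrete, here is a minimal sketch (in Python, with illustrative names and parameters) of the naive learn-then-sample baseline for the first family, arbitrary distributions on $\{1,\ldots,k\}$: privately estimate a histogram with the Laplace mechanism and release one draw from it. This is not the paper's algorithm; the paper's point is precisely that private sampling can sometimes be done with asymptotically fewer observations than such learning-based approaches need.

```python
import numpy as np

def naive_private_sampler(data, k, epsilon, rng=None):
    """Release one draw from a privatized empirical distribution on {0, ..., k-1}.

    Naive baseline only: it privately *learns* a histogram (Laplace mechanism)
    and then samples from it, so its data requirements track private learning
    rather than the potentially cheaper private sampling bounds in the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = np.bincount(np.asarray(data), minlength=k).astype(float)
    # Replacing one observation changes two counts by 1 each, so the L1
    # sensitivity of the count vector is 2; Laplace(2/epsilon) noise per
    # coordinate makes the released histogram epsilon-DP.
    noisy = counts + rng.laplace(scale=2.0 / epsilon, size=k)
    probs = np.clip(noisy, 0.0, None)
    if probs.sum() == 0.0:          # degenerate case: fall back to uniform
        probs = np.ones(k)
    probs /= probs.sum()
    # Drawing from the noisy histogram is post-processing, so the single
    # released observation inherits the epsilon-DP guarantee.
    return rng.choice(k, p=probs)

# Example: n = 500 observations from a skewed distribution on {0, ..., 9}.
rng = np.random.default_rng(0)
data = rng.choice(10, size=500, p=np.linspace(1, 10, 10) / 55)
print(naive_private_sampler(data, k=10, epsilon=1.0, rng=rng))
```

The released draw is close to $P$ in total variation only to the extent that the noisy histogram is accurate, which is exactly the private-learning overhead the paper shows can sometimes be avoided.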
Related papers
- Private Distribution Learning with Public Data: The View from Sample
Compression [15.626115475759713]
We study the problem of private distribution learning with access to public data.
We show that the public-private learnability of a class $\mathcal{Q}$ is connected to the existence of a sample compression scheme.
arXiv Detail & Related papers (2023-08-11T17:15:12Z)
- Differentially-Private Bayes Consistency [70.92545332158217]
We construct a Bayes consistent learning rule that satisfies differential privacy (DP).
We prove that any VC class can be privately learned in a semi-supervised setting with a near-optimal sample complexity.
arXiv Detail & Related papers (2022-12-08T11:57:30Z)
- Private Estimation with Public Data [10.176795938619417]
We study differentially private (DP) estimation with access to a small amount of public data.
We show that under the constraints of pure or concentrated DP, d+1 public data samples are sufficient to remove any dependence on the range parameters of the private data distribution.
arXiv Detail & Related papers (2022-08-16T22:46:44Z)
- Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
This implies groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees.
arXiv Detail & Related papers (2022-06-06T13:49:37Z)
- Nonparametric extensions of randomized response for private confidence sets [51.75485869914048]
This work derives methods for performing nonparametric, nonasymptotic statistical inference for population means under the constraint of local differential privacy (LDP).
We present confidence intervals (CI) and time-uniform confidence sequences (CS) for $\mu^\star$ when only given access to the privatized data.
arXiv Detail & Related papers (2022-02-17T16:04:49Z)
- Unrolling Particles: Unsupervised Learning of Sampling Distributions [102.72972137287728]
Particle filtering is used to compute good nonlinear estimates of complex systems.
We show in simulations that the resulting particle filter yields good estimates in a wide range of scenarios.
arXiv Detail & Related papers (2021-10-06T16:58:34Z)
- Covariance-Aware Private Mean Estimation Without Private Covariance Estimation [10.036088581191592]
We present two sample-efficient differentially private mean estimators for $d$-dimensional (sub)Gaussian distributions.
Our estimators output $\tilde{\mu}$ such that $\|\tilde{\mu} - \mu\|_\Sigma \leq \alpha$, where $\|\cdot\|_\Sigma$ is the Mahalanobis distance.
arXiv Detail & Related papers (2021-06-24T21:40:07Z)
- Hiding Among the Clones: A Simple and Nearly Optimal Analysis of Privacy
Amplification by Shuffling [49.43288037509783]
We show that random shuffling amplifies differential privacy guarantees of locally randomized data.
Our result is based on a new approach that is simpler than previous work and extends to approximate differential privacy with nearly the same guarantees.
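Since this entry describes a concrete protocol, here is a minimal sketch (Python, with illustrative names) of the setting it analyzes: each user locally randomizes their value with $k$-ary randomized response, and a shuffler then applies a uniformly random permutation to the reports. The amplification result, not reproduced here, is that the shuffled output satisfies a much stronger central DP guarantee than the local $\varepsilon_0$, roughly improving with $1/\sqrt{n}$; this code only illustrates the mechanism, not the paper's analysis.

```python
import numpy as np

def k_randomized_response(value, k, eps0, rng):
    """eps0-LDP k-ary randomized response on {0, ..., k-1}."""
    p_truth = np.exp(eps0) / (np.exp(eps0) + k - 1)  # prob. of reporting the truth
    if rng.random() < p_truth:
        return value
    other = rng.integers(k - 1)          # otherwise a uniformly random other value
    return other if other < value else other + 1

def shuffled_reports(data, k, eps0, rng=None):
    """Locally randomize every value, then shuffle so reports cannot be linked
    back to users; the shuffled multiset is all the analyzer ever sees."""
    rng = np.random.default_rng() if rng is None else rng
    reports = np.array([k_randomized_response(v, k, eps0, rng) for v in data])
    rng.shuffle(reports)                 # the trusted shuffler's only job
    return reports

# Example: 1000 users holding values in {0, ..., 4}, each with local eps0 = 2.
rng = np.random.default_rng(2)
data = rng.integers(5, size=1000)
print(np.bincount(shuffled_reports(data, k=5, eps0=2.0, rng=rng), minlength=5))
```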
arXiv Detail & Related papers (2020-12-23T17:07:26Z)
- Compressive Privatization: Sparse Distribution Estimation under Locally
Differentially Privacy [18.43218511751587]
We show that as long as the target distribution is sparse or approximately sparse, the number of samples needed could be significantly reduced.
Our mechanism does privatization and dimensionality reduction simultaneously, and the sample complexity will only depend on the reduced dimensionality.
arXiv Detail & Related papers (2020-12-03T17:14:23Z)
- On the Intrinsic Differential Privacy of Bagging [69.70602220716718]
Our experimental results demonstrate that Bagging achieves significantly higher accuracies than state-of-the-art differentially private machine learning methods with the same privacy budgets.
arXiv Detail & Related papers (2020-08-22T14:17:55Z)
- Private Mean Estimation of Heavy-Tailed Distributions [10.176795938619417]
We give new upper and lower bounds on the minimax sample complexity of differentially private mean estimation of distributions with bounded $k$-th moments.
We show that $n = \Theta\left(\frac{1}{\alpha^2} + \frac{1}{\alpha^{k/(k-1)}\varepsilon}\right)$ samples are necessary and sufficient to estimate the mean to $\alpha$-accuracy under $\varepsilon$-differential privacy, or any of its common relaxations; a minimal clipped-mean sketch of this setting appears after this entry.
arXiv Detail & Related papers (2020-02-21T18:30:48Z)
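For context on the bound quoted above, the following is a minimal sketch of the standard clip-and-noise approach to $\varepsilon$-DP mean estimation in the heavy-tailed (bounded $k$-th moment) setting. It is a generic baseline for the same problem, not the estimator from that paper, and the clipping radius is an illustrative parameter that one would tune using the moment assumption.

```python
import numpy as np

def clipped_laplace_mean(x, epsilon, clip_radius, rng=None):
    """epsilon-DP univariate mean estimate via clipping plus Laplace noise.

    Generic baseline, not the paper's estimator: each sample is clipped to
    [-clip_radius, clip_radius], so replacing one of the n samples moves the
    clipped average by at most 2 * clip_radius / n (its L1 sensitivity), and
    Laplace noise of scale sensitivity / epsilon yields epsilon-DP.  The
    clipping radius trades truncation bias against added noise and would be
    chosen using the bounded k-th moment assumption.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = x.size
    clipped_mean = np.clip(x, -clip_radius, clip_radius).mean()
    sensitivity = 2.0 * clip_radius / n
    return clipped_mean + rng.laplace(scale=sensitivity / epsilon)

# Example: heavy-tailed samples (Student-t with 3 degrees of freedom, so only
# moments of order below 3 are finite) with true mean 0.
rng = np.random.default_rng(1)
samples = rng.standard_t(df=3, size=2000)
print(clipped_laplace_mean(samples, epsilon=1.0, clip_radius=10.0, rng=rng))
```

In the quoted bound, the $\frac{1}{\alpha^2}$ term matches the nonprivate rate of mean estimation, while the second term reflects the additional cost of privacy.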
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences arising from its use.