Private Query Release Assisted by Public Data
- URL: http://arxiv.org/abs/2004.10941v1
- Date: Thu, 23 Apr 2020 02:46:37 GMT
- Title: Private Query Release Assisted by Public Data
- Authors: Raef Bassily, Albert Cheu, Shay Moran, Aleksandar Nikolov, Jonathan
Ullman, Zhiwei Steven Wu
- Abstract summary: We study the problem of differentially private query release assisted by access to public data.
We show that we can solve the problem for any query class $\mathcal{H}$ of finite VC-dimension using only $d/\alpha$ public samples and $\sqrt{p}d^{3/2}/\alpha^2$ private samples.
- Score: 96.6174729958211
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of differentially private query release assisted by
access to public data. In this problem, the goal is to answer a large class
$\mathcal{H}$ of statistical queries with error no more than $\alpha$ using a
combination of public and private samples. The algorithm is required to satisfy
differential privacy only with respect to the private samples. We study the
limits of this task in terms of the private and public sample complexities.
First, we show that we can solve the problem for any query class
$\mathcal{H}$ of finite VC-dimension using only $d/\alpha$ public samples and
$\sqrt{p}d^{3/2}/\alpha^2$ private samples, where $d$ and $p$ are the
VC-dimension and dual VC-dimension of $\mathcal{H}$, respectively. In
comparison, with only private samples, this problem cannot be solved even for
simple query classes with VC-dimension one, and without any private samples, a
larger public sample of size $d/\alpha^2$ is needed. Next, we give sample
complexity lower bounds that exhibit tight dependence on $p$ and $\alpha$. For
the class of decision stumps, we give a lower bound of $\sqrt{p}/\alpha$ on the
private sample complexity whenever the public sample size is less than
$1/\alpha^2$. Given our upper bounds, this shows that the dependence on
$\sqrt{p}$ is necessary in the private sample complexity. We also give a lower
bound of $1/\alpha$ on the public sample complexity for a broad family of query
classes, which by our upper bound, is tight in $\alpha$.
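To make the comparison between the three sample sizes in the abstract concrete, the following minimal Python sketch evaluates them for given values of $d$, $p$, and $\alpha$. It assumes the expressions can be read as literal counts: constants, logarithmic factors, and any dependence on the privacy parameters are omitted here, since the abstract does not state them. The function names are hypothetical and chosen only for illustration.

```python
import math


def public_samples_with_private(d: float, alpha: float) -> float:
    """Public sample size d/alpha when private samples are also available.

    Assumption: constants and log factors are dropped.
    """
    return d / alpha


def private_samples_with_public(d: float, p: float, alpha: float) -> float:
    """Private sample size sqrt(p) * d^(3/2) / alpha^2 from the upper bound.

    d is the VC-dimension and p the dual VC-dimension of the query class.
    Assumption: constants, log factors, and privacy parameters are dropped.
    """
    return math.sqrt(p) * d ** 1.5 / alpha ** 2


def public_samples_without_private(d: float, alpha: float) -> float:
    """Public sample size d/alpha^2 needed when no private samples are used."""
    return d / alpha ** 2


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    d, p, alpha = 4, 16, 0.25
    print("public (with private data):   ", public_samples_with_private(d, alpha))
    print("private (with public data):   ", private_samples_with_public(d, p, alpha))
    print("public (without private data):", public_samples_without_private(d, alpha))
```

Under these assumptions, the saving in public samples from adding private data is a factor of $1/\alpha$, which grows as the target error $\alpha$ shrinks.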
Related papers
- Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators on distributions over $\mathbb{R}^d$ suffer from a curse of dimensionality.
We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z)
- Public-data Assisted Private Stochastic Optimization: Power and Limitations [30.298342283075172]
We study the limits and capability of public-data assisted differentially private (PA-DP) algorithms.
For complete/labeled public data, we show that any PA-DP algorithm has excess risk $\tilde{\Omega}\big(\min\big\{\frac{1}{\sqrt{n}}+\frac{\sqrt{d}}{n\epsilon}\big\}\big)$.
We also study PA-DP supervised learning with unlabeled public samples.
arXiv Detail & Related papers (2024-03-06T17:06:11Z)
- Private Distribution Learning with Public Data: The View from Sample Compression [15.626115475759713]
We study the problem of private distribution learning with access to public data.
We show that the public-private learnability of a class $\mathcal{Q}$ is connected to the existence of a sample compression scheme.
arXiv Detail & Related papers (2023-08-11T17:15:12Z)
- Reaching Goals is Hard: Settling the Sample Complexity of the Stochastic Shortest Path [106.37656068276902]
We study the sample complexity of learning an $\epsilon$-optimal policy in the Stochastic Shortest Path (SSP) problem.
We derive complexity bounds when the learner has access to a generative model.
We show that there exists a worst-case SSP instance with $S$ states, $A$ actions, minimum cost $c_{\min}$, and maximum expected cost of the optimal policy over all states $B_\star$.
arXiv Detail & Related papers (2022-10-10T18:34:32Z)
- Robust Estimation of Discrete Distributions under Local Differential Privacy [1.52292571922932]
We consider the problem of estimating a discrete distribution in total variation from $n$ contaminated data batches under a local differential privacy constraint.
We show that combining the two constraints leads to a minimax estimation rate of $\epsilon\sqrt{d/(\alpha^2 k)}+\sqrt{d^2/(\alpha^2 kn)}$ up to a $\sqrt{\log(1/\epsilon)}$ factor.
arXiv Detail & Related papers (2022-02-14T15:59:02Z)
- User-Level Private Learning via Correlated Sampling [49.453751858361265]
We consider the setting where each user holds $m$ samples and the privacy protection is enforced at the level of each user's data.
We show that, in this setting, we may learn with far fewer users.
arXiv Detail & Related papers (2021-10-21T15:33:53Z)
- Private Learning of Halfspaces: Simplifying the Construction and Reducing the Sample Complexity [63.29100726064574]
We present a differentially private learner for halfspaces over a finite grid $G$ in $\mathbb{R}^d$ with sample complexity $\approx d^{2.5}\cdot 2^{\log^*|G|}$.
The building block for our learner is a new differentially private algorithm for approximately solving the linear feasibility problem.
arXiv Detail & Related papers (2020-04-16T16:12:10Z)
- Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to the best such distribution.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z)
- Private Mean Estimation of Heavy-Tailed Distributions [10.176795938619417]
We give new upper and lower bounds on the minimax sample complexity of differentially private mean estimation of distributions with bounded $k$-th moments.
We show that $n = \Theta\left(\frac{1}{\alpha^2} + \frac{1}{\alpha^{\frac{k}{k-1}}\varepsilon}\right)$ samples are necessary and sufficient to estimate the mean to $\alpha$-accuracy under $\varepsilon$-differential privacy, or any of its common relaxations.
arXiv Detail & Related papers (2020-02-21T18:30:48Z)