Locally Private Hypothesis Selection
- URL: http://arxiv.org/abs/2002.09465v2
- Date: Sat, 20 Jun 2020 02:58:48 GMT
- Title: Locally Private Hypothesis Selection
- Authors: Sivakanth Gopi, Gautam Kamath, Janardhan Kulkarni, Aleksandar Nikolov,
Zhiwei Steven Wu, Huanyu Zhang
- Abstract summary: We output a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to the best such distribution.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements on the round complexity of previous methods.
- Score: 96.06118559817057
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We initiate the study of hypothesis selection under local differential
privacy. Given samples from an unknown probability distribution $p$ and a set
of $k$ probability distributions $\mathcal{Q}$, we aim to output, under the
constraints of $\varepsilon$-local differential privacy, a distribution from
$\mathcal{Q}$ whose total variation distance to $p$ is comparable to the best
such distribution. This is a generalization of the classic problem of $k$-wise
simple hypothesis testing, which corresponds to when $p \in \mathcal{Q}$, and
we wish to identify $p$. Absent privacy constraints, this problem requires
$O(\log k)$ samples from $p$, and it was recently shown that the same
complexity is achievable under (central) differential privacy. However, the
naive approach to this problem under local differential privacy would require
$\tilde O(k^2)$ samples.
We first show that the constraint of local differential privacy incurs an
exponential increase in cost: any algorithm for this problem requires at least
$\Omega(k)$ samples. Second, for the special case of $k$-wise simple hypothesis
testing, we provide a non-interactive algorithm which nearly matches this
bound, requiring $\tilde O(k)$ samples. Finally, we provide sequentially
interactive algorithms for the general case, requiring $\tilde O(k)$ samples
and only $O(\log \log k)$ rounds of interactivity. Our algorithms are achieved
through a reduction to maximum selection with adversarial comparators, a
problem of independent interest for which we initiate study in the parallel
setting. For this problem, we provide a family of algorithms for each number of
allowed rounds of interaction $t$, as well as lower bounds showing that they
are near-optimal for every $t$. Notably, our algorithms result in exponential
improvements on the round complexity of previous methods.
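The reduction described above lends itself to a compact illustration. The following is a minimal, self-contained Python sketch, not the paper's algorithm: it assumes toy Bernoulli hypotheses over $\{0,1\}$, uses a randomized-response-based Scheffé-style comparison as the "adversarial comparator", and runs a plain knockout tournament, which takes $O(\log k)$ rounds rather than the paper's $O(\log \log k)$. All names (`randomized_response`, `scheffe_comparator`, `select_hypothesis`) and parameter choices are illustrative assumptions.

```python
# Minimal illustration of hypothesis selection via maximum selection with
# noisy ("adversarial") pairwise comparators.  This is a sketch under the
# assumptions stated above, not the paper's algorithm.

import math
import random


def randomized_response(bit, epsilon):
    """epsilon-LDP randomized response on a single bit."""
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if random.random() < p_keep else 1 - bit


def scheffe_comparator(samples, h_i, h_j, epsilon):
    """Noisy pairwise comparison of two hypotheses (pmfs over {0, 1}).

    Each user reports, via randomized response, whether their sample falls in
    the Scheffe set A = {x : h_i(x) > h_j(x)}.  The debiased aggregate is then
    compared against each hypothesis's mass on A.  Privacy noise and sampling
    error mean the winner is only reliable when the hypotheses are well
    separated -- the "adversarial comparator" abstraction.
    """
    A = [x for x in (0, 1) if h_i(x) > h_j(x)]
    if not A:
        return h_j
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    bits = [randomized_response(1 if x in A else 0, epsilon) for x in samples]
    # Debias the randomized-response estimate of Pr[sample in A].
    est = (sum(bits) / len(bits) - (1 - p_keep)) / (2 * p_keep - 1)
    mass_i = sum(h_i(x) for x in A)
    mass_j = sum(h_j(x) for x in A)
    return h_i if abs(mass_i - est) <= abs(mass_j - est) else h_j


def tournament_round(candidates, samples, epsilon):
    """One parallel round: pair candidates up and keep each pairwise winner."""
    winners = [scheffe_comparator(samples, a, b, epsilon)
               for a, b in zip(candidates[::2], candidates[1::2])]
    if len(candidates) % 2 == 1:
        winners.append(candidates[-1])
    return winners


def select_hypothesis(candidates, samples, epsilon):
    """Knockout tournament: O(log k) parallel rounds of pairwise comparisons."""
    while len(candidates) > 1:
        candidates = tournament_round(candidates, samples, epsilon)
    return candidates[0]


if __name__ == "__main__":
    random.seed(0)

    def bernoulli(q):
        # pmf of a Bernoulli(q) distribution over {0, 1}
        return lambda x: q if x == 1 else 1.0 - q

    hypotheses = [bernoulli(q) for q in (0.1, 0.3, 0.5, 0.7)]
    true_q = 0.68  # unknown p, closest in TV distance to Bernoulli(0.7)
    data = [1 if random.random() < true_q else 0 for _ in range(20000)]
    winner = select_hypothesis(hypotheses, data, epsilon=1.0)
    print("selected hypothesis: Bernoulli(%.1f)" % winner(1))
```

For a genuinely $\varepsilon$-LDP protocol, each user's randomized report should feed only one pairwise comparison (e.g., by splitting users across comparisons); the sketch reuses the same reports across comparisons only to keep the example short.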
Related papers
- Sample-Optimal Locally Private Hypothesis Selection and the Provable
Benefits of Interactivity [8.100854060749212]
We study the problem of hypothesis selection under the constraint of local differential privacy.
We devise an $\varepsilon$-locally-differentially-private ($\varepsilon$-LDP) algorithm that uses $\Theta\left(\frac{k\log k}{\alpha^2\min\{\varepsilon^2,1\}}\right)$ samples to guarantee that $d_{TV}(h,\hat{f}) \leq \alpha + 9\min_{f\in\mathcal{F}} d_{TV}(h,f)$.
arXiv Detail & Related papers (2023-12-09T19:22:10Z) - Near-Optimal Bounds for Learning Gaussian Halfspaces with Random
Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega(d^{1/2}/(\max\{p, \epsilon\})^2)$.
Our lower bound suggests that this quadratic dependence on $1/\epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z) - Succinct quantum testers for closeness and $k$-wise uniformity of probability distributions [2.3466828785520373]
We explore potential quantum speedups for the fundamental problem of testing the properties of closeness and $k$-wise uniformity of probability distributions.
We show that the quantum query complexities for $\ell_1$- and $\ell_2$-closeness testing are $O(\sqrt{n}/\varepsilon)$ and $O(\sqrt{nk}/\varepsilon)$.
We propose the first quantum algorithm for this problem with query complexity $O(\sqrt{nk}/\varepsilon)$.
arXiv Detail & Related papers (2023-04-25T15:32:37Z) - Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best policy identification in discounted linear Markov decision processes in the fixed confidence setting under a generative model.
The lower bound, as the solution of an intricate non-convex optimization program, can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z) - Private High-Dimensional Hypothesis Testing [4.133655523622441]
We provide improved differentially private algorithms for identity testing of high-dimensional distributions.
Specifically, we can test whether the distribution comes from $\mathcal{N}(\mu^*, \Sigma)$ for some fixed $\mu^*$ or from some $\mathcal{N}(\mu, \Sigma)$ with total variation distance at least $\alpha$.
arXiv Detail & Related papers (2022-03-03T06:25:48Z) - Statistically Near-Optimal Hypothesis Selection [33.83129262033921]
We derive an optimal $2$-approximation learning strategy for the Hypothesis Selection problem.
This is the first algorithm that simultaneously achieves the best approximation factor and sample complexity.
arXiv Detail & Related papers (2021-08-17T21:11:20Z) - Optimal Testing of Discrete Distributions with High Probability [49.19942805582874]
We study the problem of testing discrete distributions with a focus on the high probability regime.
We provide the first algorithms for closeness and independence testing that are sample-optimal, within constant factors.
arXiv Detail & Related papers (2020-09-14T16:09:17Z) - The Sparse Hausdorff Moment Problem, with Application to Topic Models [5.151973524974052]
We give an algorithm for identifying a $k$-mixture using samples of $m=2k$ iid binary random variables.
It suffices to know the moments to additive accuracy $w_{\min}\cdot\zeta^{O(k)}$.
arXiv Detail & Related papers (2020-07-16T04:23:57Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $\frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample
Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, the discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For small enough $\epsilon$, we show an improved algorithm with better sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z)