Sample-Optimal Locally Private Hypothesis Selection and the Provable
Benefits of Interactivity
- URL: http://arxiv.org/abs/2312.05645v1
- Date: Sat, 9 Dec 2023 19:22:10 GMT
- Title: Sample-Optimal Locally Private Hypothesis Selection and the Provable
Benefits of Interactivity
- Authors: Alireza F. Pour, Hassan Ashtiani, Shahab Asoodeh
- Abstract summary: We study the problem of hypothesis selection under the constraint of local differential privacy.
We devise an $\varepsilon$-locally-differentially-private ($\varepsilon$-LDP) algorithm that uses $\Theta\left(\frac{k}{\alpha^2\min\{\varepsilon^2,1\}}\right)$ samples to guarantee that $d_{TV}(h,\hat{f})\leq \alpha + 9\min_{f\in\mathcal{F}}d_{TV}(h,f)$ with high probability.
- Score: 8.100854060749212
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the problem of hypothesis selection under the constraint of local
differential privacy. Given a class $\mathcal{F}$ of $k$ distributions and a
set of i.i.d. samples from an unknown distribution $h$, the goal of hypothesis
selection is to pick a distribution $\hat{f}$ whose total variation distance to
$h$ is comparable with the best distribution in $\mathcal{F}$ (with high
probability). We devise an $\varepsilon$-locally-differentially-private
($\varepsilon$-LDP) algorithm that uses $\Theta\left(\frac{k}{\alpha^2\min
\{\varepsilon^2,1\}}\right)$ samples to guarantee that $d_{TV}(h,\hat{f})\leq
\alpha + 9 \min_{f\in \mathcal{F}}d_{TV}(h,f)$ with high probability. This
sample complexity is optimal for $\varepsilon<1$, matching the lower bound of
Gopi et al. (2020). All previously known algorithms for this problem required
$\Omega\left(\frac{k\log k}{\alpha^2\min \{ \varepsilon^2 ,1\}} \right)$
samples to work.
Moreover, our result demonstrates the power of interaction for
$\varepsilon$-LDP hypothesis selection. Namely, it breaks the known lower bound
of $\Omega\left(\frac{k\log k}{\alpha^2\min \{ \varepsilon^2 ,1\}} \right)$ for
the sample complexity of non-interactive hypothesis selection. Our algorithm
breaks this barrier using only $\Theta(\log \log k)$ rounds of interaction.
To prove our results, we define the notion of \emph{critical queries} for a
Statistical Query Algorithm (SQA) which may be of independent interest.
Informally, an SQA is said to use a small number of critical queries if its
success relies on the accuracy of only a small number of queries it asks. We
then design an LDP algorithm that uses a smaller number of critical queries.
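
As background intuition (a minimal sketch, not the paper's algorithm), the Python below combines the two standard ingredients that LDP hypothesis-selection protocols build on: binary randomized response, which answers one statistical query about $h$ under $\varepsilon$-LDP, and a Scheffé-style tournament that picks a hypothesis from pairwise comparisons. The function names (`randomized_response`, `ldp_query`, `scheffe_select`) and the naive one-query-per-pair design are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def randomized_response(bit: int, eps: float, rng) -> int:
    """eps-LDP binary randomized response: report the true bit with
    probability e^eps / (e^eps + 1), otherwise the flipped bit."""
    p_keep = np.exp(eps) / (np.exp(eps) + 1.0)
    return bit if rng.random() < p_keep else 1 - bit

def ldp_query(batch, predicate, eps, rng):
    """Estimate Pr_h[predicate(x)] from one privatized bit per sample,
    debiasing via E[noisy] = p_keep * p + (1 - p_keep) * (1 - p)."""
    p_keep = np.exp(eps) / (np.exp(eps) + 1.0)
    noisy = [randomized_response(int(predicate(x)), eps, rng) for x in batch]
    return (np.mean(noisy) - (1.0 - p_keep)) / (2.0 * p_keep - 1.0)

def scheffe_select(F, samples, eps, rng):
    """Naive Scheffé tournament over k hypotheses (dicts {outcome: prob}).
    Each pair (i, j) queries h's mass on the Scheffé set
    A_ij = {x : f_i(x) > f_j(x)}; the match goes to the hypothesis whose
    own mass on A_ij is closer to the estimate. One disjoint batch of
    samples per query, so O(k^2) queries in this illustrative version."""
    k = len(F)
    wins = [0] * k
    batches = np.array_split(np.asarray(samples), k * (k - 1) // 2)
    b = 0
    for i in range(k):
        for j in range(i + 1, k):
            in_A = lambda x, i=i, j=j: F[i].get(x, 0.0) > F[j].get(x, 0.0)
            est = ldp_query(batches[b], in_A, eps, rng)
            b += 1
            mass_i = sum(p for x, p in F[i].items() if in_A(x))
            mass_j = sum(p for x, p in F[j].items() if in_A(x))
            wins[i if abs(mass_i - est) <= abs(mass_j - est) else j] += 1
    return int(np.argmax(wins))

# Toy usage: three candidate coins; the samples come from the first one.
rng = np.random.default_rng(0)
F = [{0: 0.9, 1: 0.1}, {0: 0.5, 1: 0.5}, {0: 0.1, 1: 0.9}]
samples = rng.choice([0, 1], size=30_000, p=[0.9, 0.1])
print(scheffe_select(F, samples, eps=1.0, rng=rng))  # typically prints 0
```

In this naive version, success requires essentially every pairwise estimate to be accurate, i.e., it has many critical queries in the paper's terminology. The paper's interactive algorithm instead arranges the selection so that only a few queries are critical, which is what removes the $\log k$ factor from the sample complexity.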
Related papers
- Near-Optimal Bounds for Learning Gaussian Halfspaces with Random
Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega(d^{1/2}/(\max\{p, \epsilon\})^2)$.
Our lower bound suggests that this quadratic dependence on $1/\epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z) - Near Sample-Optimal Reduction-based Policy Learning for Average Reward
MDP [58.13930707612128]
This work considers the sample complexity of obtaining an $\varepsilon$-optimal policy in an average reward Markov Decision Process (AMDP).
We prove an upper bound of $\widetilde{O}\left(H\varepsilon^{-3}\ln\frac{1}{\delta}\right)$ samples per state-action pair, where $H := \mathrm{sp}(h^*)$ is the span of the bias of any optimal policy, $\varepsilon$ is the accuracy, and $\delta$ is the failure probability.
arXiv Detail & Related papers (2022-12-01T15:57:58Z) - Private High-Dimensional Hypothesis Testing [4.133655523622441]
We provide improved differentially private algorithms for identity testing of high-dimensional distributions.
Specifically, we can test whether the distribution comes from $\mathcal{N}(\mu^*, \Sigma)$ for some fixed $\mu^*$ or from some $\mathcal{N}(\mu', \Sigma)$ with total variation distance at least $\alpha$.
arXiv Detail & Related papers (2022-03-03T06:25:48Z) - Sampling from Log-Concave Distributions with Infinity-Distance
Guarantees and Applications to Differentially Private Optimization [33.38289436686841]
We present an algorithm that outputs a point from a distribution $O(\varepsilon)$-close to $\pi$ in infinity-distance.
We also present a "soft-threshold" version of the Dikin walk which may be of independent interest.
arXiv Detail & Related papers (2021-11-07T13:44:50Z) - Statistically Near-Optimal Hypothesis Selection [33.83129262033921]
We derive an optimal $2$-approximation learning strategy for the Hypothesis Selection problem.
This is the first algorithm that simultaneously achieves the best approximation factor and sample complexity.
arXiv Detail & Related papers (2021-08-17T21:11:20Z) - Cascading Bandit under Differential Privacy [21.936577816668944]
We study \emph{differential privacy} (DP) and \emph{local differential privacy} (LDP) in cascading bandits.
Under DP, we propose an algorithm which guarantees $\epsilon$-indistinguishability and a regret of $\mathcal{O}\left(\left(\frac{\log T}{\epsilon}\right)^{1+\xi}\right)$ for an arbitrarily small $\xi$.
Under $(\epsilon,\delta)$-LDP, we relax the $K^2$ dependence through the tradeoff between privacy budget $\epsilon$ and error probability $\delta$.
arXiv Detail & Related papers (2021-05-24T07:19:01Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions $d = \frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample
Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, a discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For small enough $\epsilon$, we show an improved algorithm with lower sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z) - Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(\mathsf{OPT}^{1/2})+\epsilon$.
arXiv Detail & Related papers (2020-05-29T07:20:35Z) - Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to that of the best distribution in $\mathcal{Q}$.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.