Robust Estimation of Discrete Distributions under Local Differential
Privacy
- URL: http://arxiv.org/abs/2202.06825v1
- Date: Mon, 14 Feb 2022 15:59:02 GMT
- Title: Robust Estimation of Discrete Distributions under Local Differential
Privacy
- Authors: Julien Chhor and Flore Sentenac
- Abstract summary: We consider the problem of estimating a discrete distribution in total variation from $n$ contaminated data batches under a local differential privacy constraint.
We show that combining the two constraints leads to a minimax estimation rate of $epsilonsqrtd/alpha2 k+sqrtd2/alpha2 kn$ up to a $sqrtlog (1/epsilon)$ factor.
- Score: 1.52292571922932
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although robust learning and local differential privacy are both widely
studied fields of research, combining the two settings is an almost unexplored
topic. We consider the problem of estimating a discrete distribution in total
variation from $n$ contaminated data batches under a local differential privacy
constraint. A fraction $1-\epsilon$ of the batches contain $k$ i.i.d. samples
drawn from a discrete distribution $p$ over $d$ elements. To protect the users'
privacy, each of the samples is privatized using an $\alpha$-locally
differentially private mechanism. The remaining $\epsilon n $ batches are an
adversarial contamination. The minimax rate of estimation under contamination
alone, with no privacy, is known to be $\epsilon/\sqrt{k}+\sqrt{d/kn}$, up to a
$\sqrt{\log(1/\epsilon)}$ factor. Under the privacy constraint alone, the
minimax rate of estimation is $\sqrt{d^2/\alpha^2 kn}$. We show that combining
the two constraints leads to a minimax estimation rate of
$\epsilon\sqrt{d/\alpha^2 k}+\sqrt{d^2/\alpha^2 kn}$ up to a
$\sqrt{\log(1/\epsilon)}$ factor, larger than the sum of the two separate
rates. We provide a polynomial-time algorithm achieving this bound, as well as
a matching information theoretic lower bound.
Related papers
- Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators on distributions over $mathRd suffer from a curse of dimensionality.
We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z) - Distribution-Aware Mean Estimation under User-level Local Differential Privacy [5.267844649650687]
We consider the problem of mean estimation under user-level local differential privacy, where $n$ users are contributing through their local pool of data samples.
Based on a distribution-aware mean estimation algorithm, we establish an $M$-dependent upper bounds on the worst-case risk over $mu$ for the task of mean estimation.
arXiv Detail & Related papers (2024-10-12T11:57:52Z) - Private Mean Estimation with Person-Level Differential Privacy [6.621676316292624]
We study person-level differentially private mean estimation in the case where each person holds multiple samples.
We give computationally efficient algorithms under approximate-DP and computationally inefficient algorithms under pure DP, and our nearly matching lower bounds hold for the most permissive case of approximate DP.
arXiv Detail & Related papers (2024-05-30T18:20:35Z) - Contraction of Locally Differentially Private Mechanisms [8.547801169402923]
We derive tight bounds on the divergence between $PK$ and $QK$ output distributions of an $epsilon$-LDP mechanism $K$.
We then utilize these bounds to establish locally private versions of the van Trees inequality, Le Cam's, Assouad's, and the mutual information methods.
arXiv Detail & Related papers (2022-10-24T16:38:12Z) - Frequency Estimation Under Multiparty Differential Privacy: One-shot and
Streaming [10.952006057356714]
We study the fundamental problem of frequency estimation under both privacy and communication constraints, where the data is distributed among $k$ parties.
We adopt the model of multiparty differential privacy (MDP), which is more general than local differential privacy (LDP) and (centralized) differential privacy.
Our protocols achieve optimality (up to logarithmic factors) permissible by the more stringent of the two constraints.
arXiv Detail & Related papers (2021-04-05T08:15:20Z) - Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z) - On Distributed Differential Privacy and Counting Distinct Elements [52.701425652208734]
We study the setup where each of $n$ users holds an element from a discrete set.
The goal is to count the number of distinct elements across all users.
arXiv Detail & Related papers (2020-09-21T04:13:34Z) - Learning discrete distributions: user vs item-level privacy [47.05234904407048]
Recently many practical applications such as federated learning require preserving privacy for all items of a single user.
We study the fundamental problem of learning discrete distributions over $k$ symbols with user-level differential privacy.
We propose a mechanism such that the number of users scales as $tildemathcalO(k/(malpha2) + k/sqrtmepsilonalpha)$ and hence the privacy penalty is $tildeTheta(sqrtm)$ times smaller compared to the standard mechanisms.
arXiv Detail & Related papers (2020-07-27T16:15:14Z) - Private Query Release Assisted by Public Data [96.6174729958211]
We study the problem of differentially private query release assisted by access to public data.
We show that we can solve the problem for any query class $mathcalH$ of finite VC-dimension using only $d/alpha$ public samples and $sqrtpd3/2/alpha2$ private samples.
arXiv Detail & Related papers (2020-04-23T02:46:37Z) - Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $mathcalQ$ whose total variation distance to $p$ is comparable to the best such distribution.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.