Differentially Private Quantiles
- URL: http://arxiv.org/abs/2102.08244v1
- Date: Tue, 16 Feb 2021 16:02:59 GMT
- Title: Differentially Private Quantiles
- Authors: Jennifer Gillenwater, Matthew Joseph, Alex Kulesza
- Abstract summary: We propose an instance of the exponential mechanism that simultaneously estimates $m$ quantiles from $n$ data points.
Our method significantly outperforms the current state of the art on both real and synthetic data.
- Score: 12.917299605014419
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Quantiles are often used for summarizing and understanding data. If that data
is sensitive, it may be necessary to compute quantiles in a way that is
differentially private, providing theoretical guarantees that the result does
not reveal private information. However, in the common case where multiple
quantiles are needed, existing differentially private algorithms scale poorly:
they compute each quantile individually, splitting their privacy budget and
thus decreasing accuracy. In this work we propose an instance of the
exponential mechanism that simultaneously estimates $m$ quantiles from $n$ data
points while guaranteeing differential privacy. The utility function is
carefully structured to allow for an efficient implementation that avoids
exponential dependence on $m$ and returns estimates of all $m$ quantiles in
time $O(mn^2 + m^2n)$. Experiments show that our method significantly
outperforms the current state of the art on both real and synthetic data while
remaining efficient enough to be practical.
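For background, the standard building block here is the single-quantile exponential mechanism, which scores each gap between sorted data points by its rank error and samples a gap with probability exponential in that score. The sketch below is illustrative only (function name, defaults, and the assumed data range `[lo, hi]` are not from the paper); the paper's contribution is sampling all $m$ quantiles jointly rather than splitting $\epsilon$ across $m$ independent calls like this one.

```python
import numpy as np

def dp_quantile(data, q, eps, lo, hi, rng=None):
    """Hypothetical sketch of the classic single-quantile exponential
    mechanism; not the authors' joint-quantile algorithm."""
    rng = np.random.default_rng(0) if rng is None else rng
    x = np.sort(np.clip(np.asarray(data, dtype=float), lo, hi))
    n = len(x)
    edges = np.concatenate(([lo], x, [hi]))      # n+1 candidate intervals
    widths = edges[1:] - edges[:-1]
    # Utility of interval i: negative rank error relative to target rank q*n.
    utility = -np.abs(np.arange(n + 1) - q * n)
    # Exponential mechanism with utility sensitivity 1; weight by width so the
    # mechanism samples a point, not just an interval index.
    log_w = (eps / 2.0) * utility + np.log(np.maximum(widths, 1e-12))
    p = np.exp(log_w - log_w.max())
    p /= p.sum()
    i = rng.choice(n + 1, p=p)
    return rng.uniform(edges[i], edges[i + 1])
```

Running this $m$ times with budget $\epsilon/m$ each is exactly the poorly scaling baseline the abstract describes.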
Related papers
- Efficiently Computing Similarities to Private Datasets [19.99000806126529]
Many methods in differentially private model training rely on computing the similarity between a query point (such as public or synthetic data) and private data.
We study the following fundamental algorithmic problem: given a similarity function $f$ and a large high-dimensional private dataset $X \subset \mathbb{R}^d$, output a differentially private (DP) data structure which approximates $\sum_{x \in X} f(x,y)$ for any query $y$.
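A naive baseline for this problem, for a single fixed query rather than the data structure the paper builds, is to answer $\sum_{x \in X} f(x,y)$ directly with Laplace noise. The sketch below assumes $f$ is bounded by a known `f_max`; names are illustrative, not from the paper.

```python
import numpy as np

def dp_similarity_sum(X, f, y, f_max, eps, rng=None):
    """Hypothetical Laplace baseline for one fixed query y. Adding or
    removing one point changes the sum by at most f_max, so noise of
    scale f_max/eps gives eps-DP for this single query."""
    rng = np.random.default_rng(0) if rng is None else rng
    true_sum = sum(f(x, y) for x in X)
    return true_sum + rng.laplace(scale=f_max / eps)
```

The catch is that answering many queries this way splits the budget; the paper's data structure aims to handle arbitrary queries $y$ without that blowup.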
arXiv Detail & Related papers (2024-03-13T19:19:19Z)
- Unbounded Differentially Private Quantile and Maximum Estimation [2.855485723554975]
We show that a simple invocation of $\texttt{AboveThreshold}$ can give more accurate and robust estimates on the highest quantiles.
We show that an improved analysis of $\texttt{AboveThreshold}$ can improve the privacy guarantees for the widely used Sparse Vector Technique.
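For reference, the classic AboveThreshold routine (as in Dwork and Roth's treatment, not this paper's improved variant) compares noisy query answers against a noisy threshold and halts at the first success; the early halt is what keeps the privacy cost independent of the number of below-threshold queries. A minimal sketch, assuming each query has sensitivity 1:

```python
import numpy as np

def above_threshold(queries, data, threshold, eps, rng=None):
    """Classic AboveThreshold sketch: return the index of the first query
    whose noisy answer clears a noisy threshold, else None."""
    rng = np.random.default_rng(0) if rng is None else rng
    noisy_t = threshold + rng.laplace(scale=2.0 / eps)
    for i, q in enumerate(queries):
        if q(data) + rng.laplace(scale=4.0 / eps) >= noisy_t:
            return i   # halting here is what bounds the privacy cost
    return None
```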
arXiv Detail & Related papers (2023-05-02T03:23:07Z)
- Privacy Induces Robustness: Information-Computation Gaps and Sparse Mean Estimation [8.9598796481325]
We investigate the consequences of this observation for both algorithms and computational complexity across different statistical problems.
We establish an information-computation gap for private sparse mean estimation.
We also give evidence for privacy-induced information-computation gaps for several other statistics and learning problems.
arXiv Detail & Related papers (2022-11-01T20:03:41Z)
- Smooth Anonymity for Sparse Graphs [69.1048938123063]
Differential privacy has emerged as the gold standard of privacy; however, it can fall short when it comes to sharing sparse datasets.
In this work, we consider a variation of $k$-anonymity, which we call smooth-$k$-anonymity, and design simple large-scale algorithms that efficiently provide smooth-$k$-anonymity.
arXiv Detail & Related papers (2022-07-13T17:09:25Z)
- Scalable Differentially Private Clustering via Hierarchically Separated Trees [82.69664595378869]
We show that our method computes a solution with cost at most $O(d^{3/2}\log n)\cdot OPT + O(k d^2 \log^2 n / \epsilon^2)$, where $\epsilon$ is the privacy guarantee.
Although the worst-case guarantee is worse than that of state-of-the-art private clustering methods, the algorithm we propose is practical.
arXiv Detail & Related papers (2022-06-17T09:24:41Z)
- Brain Image Synthesis with Unsupervised Multivariate Canonical CSC$\ell_4$Net [122.8907826672382]
We propose to learn dedicated features that cross both inter- and intra-modal variations using a novel CSC$\ell_4$Net.
arXiv Detail & Related papers (2021-03-22T05:19:40Z)
- Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
- Hiding Among the Clones: A Simple and Nearly Optimal Analysis of Privacy Amplification by Shuffling [49.43288037509783]
We show that random shuffling amplifies differential privacy guarantees of locally randomized data.
Our result is based on a new approach that is simpler than previous work and extends to approximate differential privacy with nearly the same guarantees.
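The shuffle model being analyzed can be illustrated with randomized response as the local randomizer: each user perturbs their own bit with local budget $\epsilon_{local}$, and a trusted shuffler permutes the reports so the analyst cannot tell which user sent which message. The code below is a generic sketch of that pipeline (names illustrative), not the paper's analysis.

```python
import numpy as np

def local_randomizer(bit, eps_local, rng):
    """eps_local-DP randomized response on a single bit."""
    p_keep = np.exp(eps_local) / (np.exp(eps_local) + 1.0)
    return bit if rng.random() < p_keep else 1 - bit

def shuffled_reports(bits, eps_local, rng=None):
    """Each user randomizes locally, then the shuffler permutes reports.
    Amplification results show the shuffled output satisfies a much
    smaller central epsilon than eps_local."""
    rng = np.random.default_rng(0) if rng is None else rng
    reports = np.array([local_randomizer(b, eps_local, rng) for b in bits])
    rng.shuffle(reports)
    return reports
```

The analyst can still debias the shuffled reports to estimate the population mean, which is why the model is useful despite the per-user noise.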
arXiv Detail & Related papers (2020-07-27T16:15:14Z)
- Learning discrete distributions: user vs item-level privacy [47.05234904407048]
Recently, many practical applications, such as federated learning, require preserving privacy for all items of a single user.
We study the fundamental problem of learning discrete distributions over $k$ symbols with user-level differential privacy.
We propose a mechanism such that the number of users scales as $\tilde{\mathcal{O}}(k/(m\alpha^2) + k/(\sqrt{m}\epsilon\alpha))$ and hence the privacy penalty is $\tilde{\Theta}(\sqrt{m})$ times smaller compared to the standard mechanisms.
arXiv Detail & Related papers (2020-07-27T16:15:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.