Faster Differentially Private Top-$k$ Selection: A Joint Exponential Mechanism with Pruning
- URL: http://arxiv.org/abs/2411.09552v1
- Date: Thu, 14 Nov 2024 16:06:13 GMT
- Title: Faster Differentially Private Top-$k$ Selection: A Joint Exponential Mechanism with Pruning
- Authors: Hao Wu, Hanwen Zhang
- Abstract summary: We study the differentially private top-$k$ selection problem, aiming to identify a sequence of $k$ items with approximately the highest scores from $d$ items.
Recent work by Gillenwater et al. (ICML '22) employs a direct sampling approach from the vast collection of $d^{\,\Theta(k)}$ possible length-$k$ sequences.
We present an improved algorithm with time and space complexity $O(d + k^2 / \epsilon \cdot \ln d)$, where $\epsilon$ denotes the privacy parameter.
- Score: 2.2083091880368855
- License:
- Abstract: We study the differentially private top-$k$ selection problem, aiming to identify a sequence of $k$ items with approximately the highest scores from $d$ items. Recent work by Gillenwater et al. (ICML '22) employs a direct sampling approach from the vast collection of $d^{\,\Theta(k)}$ possible length-$k$ sequences, showing superior empirical accuracy compared to previous pure or approximate differentially private methods. Their algorithm has a time and space complexity of $\tilde{O}(dk)$. In this paper, we present an improved algorithm with time and space complexity $O(d + k^2 / \epsilon \cdot \ln d)$, where $\epsilon$ denotes the privacy parameter. Experimental results show that our algorithm runs orders of magnitude faster than their approach, while achieving similar empirical accuracy.
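The building block behind both the prior work and this paper is the exponential mechanism for private selection. For orientation only, below is a minimal sketch of the standard "peeling" baseline, which applies the exponential mechanism $k$ times with an evenly split budget; it is not the joint exponential mechanism with pruning proposed in the paper, and the names (`peeling_top_k`, `sensitivity`) are illustrative assumptions rather than anything from the source.

```python
# A minimal sketch of a standard "peeling" exponential-mechanism baseline for
# differentially private top-k selection. This is NOT the joint mechanism with
# pruning from the paper; it only illustrates the problem setup: selecting k of
# d items with approximately the highest scores under epsilon-DP.

import numpy as np

def peeling_top_k(scores, k, epsilon, sensitivity=1.0, rng=None):
    """Select k indices by running the exponential mechanism k times,
    removing each selected item before the next round.

    scores      : array of d item scores (e.g., counts).
    k           : number of items to select.
    epsilon     : total privacy budget, split evenly across the k rounds.
    sensitivity : how much one user can change any single score (assumed bound).
    """
    rng = np.random.default_rng() if rng is None else rng
    scores = np.asarray(scores, dtype=float)
    remaining = np.arange(len(scores))
    selected = []
    eps_round = epsilon / k  # simple even split of the budget

    for _ in range(k):
        s = scores[remaining]
        # Exponential mechanism: P(i) proportional to exp(eps_round * s_i / (2 * sensitivity)).
        logits = eps_round * s / (2.0 * sensitivity)
        logits -= logits.max()          # stabilize before exponentiating
        probs = np.exp(logits)
        probs /= probs.sum()
        choice = rng.choice(len(remaining), p=probs)
        selected.append(int(remaining[choice]))
        remaining = np.delete(remaining, choice)

    return selected

# Example: pick the (approximately) 3 highest of 10 scores under epsilon = 1.
if __name__ == "__main__":
    demo_scores = np.array([50, 3, 47, 8, 45, 1, 2, 44, 5, 6])
    print(peeling_top_k(demo_scores, k=3, epsilon=1.0))
```

This baseline runs in $O(dk)$ time; the paper's contribution is to sample the whole length-$k$ sequence jointly and prune the candidate space, bringing the cost down to $O(d + k^2 / \epsilon \cdot \ln d)$.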
Related papers
- Perturb-and-Project: Differentially Private Similarities and Marginals [73.98880839337873]
We revisit the input perturbations framework for differential privacy, where noise is added to the input $A \in \mathcal{S}$.
We first design novel efficient algorithms to privately release pair-wise cosine similarities.
We derive a novel algorithm to compute $k$-way marginal queries over $n$ features.
arXiv Detail & Related papers (2024-06-07T12:07:16Z) - Efficiently Computing Similarities to Private Datasets [19.99000806126529]
Many methods in differentially private model training rely on computing the similarity between a query point (such as public or synthetic data) and private data.
We study the following fundamental algorithmic problem: given a similarity function $f$ and a large high-dimensional private dataset $X \subset \mathbb{R}^d$, output a differentially private (DP) data structure which approximates $\sum_{x \in X} f(x,y)$ for any query $y$.
arXiv Detail & Related papers (2024-03-13T19:19:19Z) - A Scalable Algorithm for Individually Fair K-means Clustering [77.93955971520549]
We present a scalable algorithm for the individually fair ($p$, $k$)-clustering problem introduced by Jung et al. and Mahabadi et al.
A clustering is then called individually fair if it has centers within distance $\delta(x)$ of $x$ for each $x \in P$.
We show empirically that not only is our algorithm much faster than prior work, but it also produces lower-cost solutions.
arXiv Detail & Related papers (2024-02-09T19:01:48Z) - Run Time Bounds for Integer-Valued OneMax Functions [0.9012198585960443]
We consider the search space $\mathbb{Z}^n$ in terms of multi-valued decision variables $\{0, \ldots, r-1\}^n$.
We show that RLS with step size adaptation achieves an optimization time of $\Theta(n \cdot \log(|a|_1))$.
We conclude with an empirical analysis, comparing the above algorithms also with a variant of CMA-ES for discrete search spaces.
arXiv Detail & Related papers (2023-07-21T18:49:06Z) - Private estimation algorithms for stochastic block models and mixture models [63.07482515700984]
We develop general tools for designing efficient private estimation algorithms.
We give the first efficient $(\epsilon, \delta)$-differentially private algorithm for both weak recovery and exact recovery.
arXiv Detail & Related papers (2023-01-11T09:12:28Z) - Scalable Differentially Private Clustering via Hierarchically Separated Trees [82.69664595378869]
We show that our method computes a solution with cost at most $O(d^{3/2}\log n)\cdot OPT + O(k d^2 \log^2 n / \epsilon^2)$, where $\epsilon$ is the privacy guarantee.
Although the worst-case guarantee is worse than that of state of the art private clustering methods, the algorithm we propose is practical.
arXiv Detail & Related papers (2022-06-17T09:24:41Z) - Faster Differentially Private Samplers via Rényi Divergence Analysis of Discretized Langevin MCMC [35.050135428062795]
Langevin dynamics-based algorithms offer much faster alternatives under some distance measures such as statistical distance.
Our techniques are simple and generic, and apply to underdamped Langevin dynamics.
arXiv Detail & Related papers (2020-10-27T22:52:45Z) - Maximizing Determinants under Matroid Constraints [69.25768526213689]
We study the problem of finding a basis $S$ of a matroid $M$ such that $\det(\sum_{i \in S} v_i v_i^\top)$ is maximized.
This problem appears in a diverse set of areas such as experimental design, fair allocation of goods, network design, and machine learning.
arXiv Detail & Related papers (2020-04-16T19:16:38Z) - Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to that of the best such distribution.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z)