Efficient derandomization of differentially private counting queries
- URL: http://arxiv.org/abs/2510.16959v2
- Date: Tue, 21 Oct 2025 08:25:19 GMT
- Title: Efficient derandomization of differentially private counting queries
- Authors: Surendra Ghentiyala,
- Abstract summary: Differential privacy for the 2020 census required an estimated 90 terabytes of randomness [GL20], an amount which may be prohibitively expensive or entirely infeasible to generate.<n>This is the task of outputting the number of entries in a dataset that satisfy predicates $mathcalP, dots, mathcalP_d$ respectively.<n>They showed the rather surprising fact that though any reasonably accurate, $varepsilon-differentially private mechanism for one counting query requires $O(log d)$ bits of in expectation.<n>Here, we give a time mechanism which
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Differential privacy for the 2020 census required an estimated 90 terabytes of randomness [GL20], an amount which may be prohibitively expensive or entirely infeasible to generate. Motivated by these practical concerns, [CSV25] initiated the study of the randomness complexity of differential privacy, and in particular, the randomness complexity of $d$ counting queries. This is the task of outputting the number of entries in a dataset that satisfy predicates $\mathcal{P}_1, \dots, \mathcal{P}_d$ respectively. They showed the rather surprising fact that though any reasonably accurate, $\varepsilon$-differentially private mechanism for one counting query requires $1-O(\varepsilon)$ bits of randomness in expectation, there exists a fairly accurate mechanism for $d$ counting queries which requires only $O(\log d)$ bits of randomness in expectation. The mechanism of [CSV25] is inefficient (not polynomial time) and relies on a combinatorial object known as rounding schemes. Here, we give a polynomial time mechanism which achieves nearly the same randomness complexity versus accuracy tradeoff as that of [CSV25]. Our construction is based on the following simple observation: after a randomized shift of the answer to each counting query, the answer to many counting queries remains the same regardless of whether we add noise to that coordinate or not. This allows us to forgo the step of adding noise to the result of many counting queries. Our mechanism does not make use of rounding schemes. Therefore, it provides a different -- and, in our opinion, clearer -- insight into the origins of the randomness savings that can be obtained by batching $d$ counting queries.
Related papers
- Query-Efficient Locally Private Hypothesis Selection via the Scheffe Graph [45.975805184376036]
We describe an algorithm that satisfies local differential privacy, performs $tildeO(k3/2)$ non-adaptive queries to individuals.<n>We also introduce a new object we dub the Scheff'e graph, which captures structure of the differences between distributions in $Q$.
arXiv Detail & Related papers (2025-09-19T17:41:15Z) - The Role of Randomness in Stability [20.718747268949112]
We study the randomness complexity of two influential notions of stability in learning: replicability and differential privacy.<n>We prove a weak-to-strong' boosting theorem for stability: the randomness complexity of a task is tightly controlled by the best replication probability.
arXiv Detail & Related papers (2025-02-11T23:06:43Z) - Some Constructions of Private, Efficient, and Optimal $K$-Norm and Elliptic Gaussian Noise [54.34628844260993]
Differentially private computation often begins with a bound on some $d$-dimensional statistic's sensitivity.
For pure differential privacy, the $K$-norm mechanism can improve on this approach using a norm tailored to the statistic's sensitivity space.
This paper solves both problems for the simple statistics of sum, count, and vote.
arXiv Detail & Related papers (2023-09-27T17:09:36Z) - On the Fine-Grained Query Complexity of Symmetric Functions [4.956977275061968]
This paper explores a fine-grained version of the Watrous conjecture.
It includes the randomized and quantum algorithms with success probabilities arbitrarily close to $1/2$.
arXiv Detail & Related papers (2023-09-20T13:04:45Z) - Replicable Clustering [57.19013971737493]
We propose algorithms for the statistical $k$-medians, statistical $k$-means, and statistical $k$-centers problems by utilizing approximation routines for their counterparts in a black-box manner.
We also provide experiments on synthetic distributions in 2D using the $k$-means++ implementation from sklearn as a black-box that validate our theoretical results.
arXiv Detail & Related papers (2023-02-20T23:29:43Z) - Private estimation algorithms for stochastic block models and mixture
models [63.07482515700984]
General tools for designing efficient private estimation algorithms.
First efficient $(epsilon, delta)$-differentially private algorithm for both weak recovery and exact recovery.
arXiv Detail & Related papers (2023-01-11T09:12:28Z) - Efficient One Sided Kolmogorov Approximation [7.657378889055477]
The main application that we examine is estimation of the probability missing deadlines in series-parallel schedules.
Since exact computation of these probabilities is NP-hard, we propose to use the algorithms described in this paper to obtain an approximation.
arXiv Detail & Related papers (2022-07-14T10:03:02Z) - Fuzzy Clustering with Similarity Queries [56.96625809888241]
The fuzzy or soft objective is a popular generalization of the well-known $k$-means problem.
We show that by making few queries, the problem becomes easier to solve.
arXiv Detail & Related papers (2021-06-04T02:32:26Z) - List-Decodable Mean Estimation in Nearly-PCA Time [50.79691056481693]
We study the fundamental task of list-decodable mean estimation in high dimensions.
Our algorithm runs in time $widetildeO(ndk)$ for all $k = O(sqrtd) cup Omega(d)$, where $n$ is the size of the dataset.
A variant of our algorithm has runtime $widetildeO(ndk)$ for all $k$, at the expense of an $O(sqrtlog k)$ factor in the recovery guarantee
arXiv Detail & Related papers (2020-11-19T17:21:37Z) - Quantum Approximate Counting with Nonadaptive Grover Iterations [1.14219428942199]
In the quantum setting, Approximate Counting can be done with $Oleft(sqrtN/epsilon, sqrtN/K/epsilonright)$ queries.
We show that algorithms using only nonadaptive Grover iterations can achieve $Oleft(sqrtN/epsilonright)$ query complexity, which is tight.
arXiv Detail & Related papers (2020-10-09T04:48:14Z) - Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $mathcalQ$ whose total variation distance to $p$ is comparable to the best such distribution.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.