The Sparse Vector Technique, Revisited
- URL: http://arxiv.org/abs/2010.00917v2
- Date: Mon, 16 Nov 2020 14:15:59 GMT
- Title: The Sparse Vector Technique, Revisited
- Authors: Haim Kaplan, Yishay Mansour, Uri Stemmer
- Abstract summary: We revisit one of the most basic and widely applicable techniques in the literature of differential privacy.
This simple algorithm privately tests whether the value of a given query on a database is close to what we expect it to be.
We suggest an alternative, equally simple, algorithm that can continue testing queries as long as any single individual does not contribute to the answer of too many queries.
- Score: 67.57692396665915
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We revisit one of the most basic and widely applicable techniques in the
literature of differential privacy - the sparse vector technique [Dwork et al.,
STOC 2009]. This simple algorithm privately tests whether the value of a given
query on a database is close to what we expect it to be. It allows asking an
unbounded number of queries as long as the answers are close to what we expect,
and halts following the first query for which this is not the case.
We suggest an alternative, equally simple, algorithm that can continue
testing queries as long as no single individual contributes to the answers of
too many queries whose answers deviate substantially from what we
expect. Our analysis is subtle and some of its ingredients may be more widely
applicable. In some cases our new algorithm allows privately extracting much
more information from the database than the original.
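As a concrete reference point, the classic sparse vector technique described above (the AboveThreshold algorithm) can be sketched as follows. This is a minimal illustration of the original algorithm, not the paper's new variant; the function name and noise scales follow the standard textbook presentation for sensitivity-1 queries:

```python
import numpy as np

def above_threshold(queries, database, threshold, epsilon):
    """Sketch of the classic sparse vector technique (AboveThreshold).

    `queries` is an iterable of sensitivity-1 functions of the database.
    Returns the index of the first query whose noisy answer exceeds the
    noisy threshold, then halts -- exactly the behavior the abstract
    describes for the original algorithm.
    """
    # Noisy threshold, consuming part of the privacy budget.
    noisy_threshold = threshold + np.random.laplace(scale=2.0 / epsilon)
    for i, q in enumerate(queries):
        # Fresh noise per query; scale 4/epsilon for sensitivity-1 queries.
        noisy_answer = q(database) + np.random.laplace(scale=4.0 / epsilon)
        if noisy_answer >= noisy_threshold:
            return i  # halt at the first "far from expected" query
    return None  # every query stayed below the threshold
```

The key property is that only the single above-threshold event is "paid for" in the privacy analysis; the below-threshold answers come essentially for free, which is what makes the technique so widely applicable.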
We demonstrate this by applying our algorithm to the shifting heavy-hitters
problem: On every time step, each of $n$ users gets a new input, and the task
is to privately identify all the current heavy-hitters. That is, on time step
$i$, the goal is to identify all data elements $x$ such that many of the users
have $x$ as their current input. We present an algorithm for this problem with
improved error guarantees over what can be obtained using existing techniques.
Specifically, the error of our algorithm depends on the maximal number of times
that a single user holds a heavy-hitter as input, rather than the total number
of times in which a heavy-hitter exists.
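To make the problem statement concrete, here is a minimal non-private baseline for a single time step of the shifting heavy-hitters problem; it simply counts the users' current inputs. This is an illustration of the task only, not the paper's algorithm, whose contribution is a private version with the improved error guarantee described above:

```python
from collections import Counter

def heavy_hitters_at_step(inputs, tau):
    """Non-private baseline for one time step: `inputs` holds the
    current input of each of the n users; return every element held
    by more than `tau` users at this step.
    """
    counts = Counter(inputs)
    return {x for x, c in counts.items() if c > tau}
```

A private algorithm must release (a noisy approximation of) this set on every time step; the paper's point is that the accumulated error can be made to scale with how often a single user holds a heavy-hitter, rather than with the total number of heavy-hitter occurrences across all steps.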
Related papers
- The Limits of Assumption-free Tests for Algorithm Performance [6.7171902258864655]
How well does an algorithm perform at a given modeling task, and which algorithm performs best?
We make a distinction between two questions: how good is an algorithm $A$ at the problem of learning from a training set of size $n$, versus, how good is a particular fitted model produced by running $A$ on a particular training data set of size $n$?
arXiv Detail & Related papers (2024-02-12T03:19:30Z)
- Auditable Algorithms for Approximate Model Counting [31.609554468870382]
We develop new deterministic approximate counting algorithms that invoke a $\Sigma_3^P$ oracle, but can be certified using a $\Sigma^P$ oracle on far fewer variables.
This shows for the first time how audit complexity can be traded for complexity of approximate counting.
arXiv Detail & Related papers (2023-12-19T17:47:18Z)
- User-Level Differential Privacy With Few Examples Per User [73.81862394073308]
We consider the example-scarce regime, where each user has only a few examples, and obtain the following results.
For approximate-DP, we give a generic transformation of any item-level DP algorithm to a user-level DP algorithm.
We present a simple technique for adapting the exponential mechanism [McSherry, Talwar FOCS 2007] to the user-level setting.
arXiv Detail & Related papers (2023-09-21T21:51:55Z)
- Unbounded Differentially Private Quantile and Maximum Estimation [2.855485723554975]
We show that a simple invocation of $\texttt{AboveThreshold}$ can give more accurate and robust estimates on the highest quantiles.
We show that an improved analysis of $\texttt{AboveThreshold}$ can improve the privacy guarantees for the widely used Sparse Vector Technique.
arXiv Detail & Related papers (2023-05-02T03:23:07Z)
- Streaming Algorithms for Learning with Experts: Deterministic Versus Robust [62.98860182111096]
In the online learning with experts problem, an algorithm must make a prediction about an outcome on each of $T$ days (or time steps).
The goal is to make predictions with the minimum cost, specifically compared to the best expert in the set.
We show a space lower bound of $\widetilde{\Omega}\left(\frac{nM}{RT}\right)$ for any deterministic algorithm that achieves regret $R$ when the best expert makes $M$ mistakes.
arXiv Detail & Related papers (2023-03-03T04:39:53Z)
- Memory Bounds for the Experts Problem [53.67419690563877]
Online learning with expert advice is a fundamental problem of sequential prediction.
The goal is to process predictions, and make a prediction with the minimum cost.
An algorithm is judged by how well it does compared to the best expert in the set.
arXiv Detail & Related papers (2022-04-21T01:22:18Z)
- On Avoiding the Union Bound When Answering Multiple Differentially Private Queries [49.453751858361265]
We give an algorithm for this task that achieves an expected $\ell_\infty$ error bound of $O(\frac{1}{\epsilon}\sqrt{k \log \frac{1}{\delta}})$.
On the other hand, the algorithm of Dagan and Kur has the remarkable advantage that its $\ell_\infty$ error bound of $O(\frac{1}{\epsilon}\sqrt{k \log \frac{1}{\delta}})$ holds not only in expectation but always.
arXiv Detail & Related papers (2020-12-16T17:58:45Z)
- List-Decodable Mean Estimation in Nearly-PCA Time [50.79691056481693]
We study the fundamental task of list-decodable mean estimation in high dimensions.
Our algorithm runs in time $\widetilde{O}(ndk)$ for all $k = O(\sqrt{d}) \cup \Omega(d)$, where $n$ is the size of the dataset.
A variant of our algorithm has runtime $\widetilde{O}(ndk)$ for all $k$, at the expense of an $O(\sqrt{\log k})$ factor in the recovery guarantee.
arXiv Detail & Related papers (2020-11-19T17:21:37Z)
- Quick Streaming Algorithms for Maximization of Monotone Submodular Functions in Linear Time [16.346069386394703]
We consider the problem of maximizing a monotone submodular function over a ground set of size $n$ subject to a cardinality constraint $k$.
We introduce the first deterministic algorithms with linear time complexity; these algorithms are streaming algorithms.
Our single-pass algorithm obtains a constant ratio in $\lceil n / c \rceil + c$ oracle queries, for any $c \ge 1$.
arXiv Detail & Related papers (2020-09-10T16:35:54Z)
- Differentially Private Set Union [23.001440922454407]
We study the basic operation of set union in the global model of differential privacy.
In this problem, we are given a universe $U$ of items, possibly of infinite size, and a database $D$ of users, each holding a subset $W_i \subseteq U$.
We want an $(\epsilon,\delta)$-differentially private algorithm which outputs a subset $S \subseteq \cup_i W_i$ such that the size of $S$ is as large as possible.
arXiv Detail & Related papers (2020-02-22T18:33:14Z)
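To illustrate the set-union task above, here is a hedged sketch of a common baseline approach: cap each user's contribution, count items, and release those whose noisy count clears a $\delta$-dependent threshold. The threshold formula is an assumption chosen for illustration; the cited paper's actual algorithm and calibration may differ:

```python
import numpy as np

def dp_set_union_sketch(user_sets, epsilon, delta, max_contrib=1):
    """Illustrative baseline for differentially private set union
    (not the cited paper's algorithm): each user contributes at most
    `max_contrib` items; items whose noisy count clears a threshold
    are released.
    """
    counts = {}
    for W in user_sets:
        for x in list(W)[:max_contrib]:  # cap each user's contribution
            counts[x] = counts.get(x, 0) + 1
    # Assumed threshold: calibrated so an item held by very few users
    # is released only with probability roughly delta.
    threshold = 1 + (max_contrib / epsilon) * np.log(1 / (2 * delta))
    released = set()
    for x, c in counts.items():
        if c + np.random.laplace(scale=max_contrib / epsilon) > threshold:
            released.add(x)
    return released
```

The design tension this exposes is exactly the one the paper studies: a higher threshold protects privacy for rare items but shrinks the released set $S$, whose size we want to maximize.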
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.