Differentially Private Release and Learning of Threshold Functions
- URL: http://arxiv.org/abs/1504.07553v2
- Date: Thu, 19 Dec 2024 23:22:37 GMT
- Title: Differentially Private Release and Learning of Threshold Functions
- Authors: Mark Bun, Kobbi Nissim, Uri Stemmer, Salil Vadhan
- Abstract summary: We prove new upper and lower bounds on the sample complexity of $(\epsilon, \delta)$ differentially private algorithms.
A threshold function $c_x$ over a totally ordered domain $X$ evaluates to $c_x(y) = 1$ if $y \le x$, and evaluates to $0$ otherwise.
- Abstract: We prove new upper and lower bounds on the sample complexity of $(\epsilon, \delta)$ differentially private algorithms for releasing approximate answers to threshold functions. A threshold function $c_x$ over a totally ordered domain $X$ evaluates to $c_x(y) = 1$ if $y \le x$, and evaluates to $0$ otherwise. We give the first nontrivial lower bound for releasing thresholds with $(\epsilon,\delta)$ differential privacy, showing that the task is impossible over an infinite domain $X$, and moreover requires sample complexity $n \ge \Omega(\log^*|X|)$, which grows with the size of the domain. Inspired by the techniques used to prove this lower bound, we give an algorithm for releasing thresholds with $n \le 2^{(1+ o(1))\log^*|X|}$ samples. This improves the previous best upper bound of $8^{(1 + o(1))\log^*|X|}$ (Beimel et al., RANDOM '13). Our sample complexity upper and lower bounds also apply to the tasks of learning distributions with respect to Kolmogorov distance and of properly PAC learning thresholds with differential privacy. The lower bound gives the first separation between the sample complexity of properly learning a concept class with $(\epsilon,\delta)$ differential privacy and learning without privacy. For properly learning thresholds in $\ell$ dimensions, this lower bound extends to $n \ge \Omega(\ell \cdot \log^*|X|)$. To obtain our results, we give reductions in both directions between releasing and properly learning thresholds and the simpler interior point problem. Given a database $D$ of elements from $X$, the interior point problem asks for an element between the smallest and largest elements in $D$. We introduce new recursive constructions for bounding the sample complexity of the interior point problem, as well as further reductions and techniques for proving impossibility results for other basic problems in differential privacy.
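To make the abstract's two central objects concrete, here is a minimal, non-private Python sketch. It is illustrative only, not the paper's algorithm: the function names and the naive checks are hypothetical. It shows the threshold function $c_x$, the interior point condition, and the iterated logarithm $\log^*$ that governs the sample complexity bounds.

```python
import math

# Minimal, non-private sketch of the objects in the abstract.
# The names and naive checks are illustrative, not the paper's algorithm.

def threshold(x, y):
    """Threshold function c_x over a totally ordered domain:
    c_x(y) = 1 if y <= x, and 0 otherwise."""
    return 1 if y <= x else 0

def is_interior_point(p, database):
    """Interior point condition: min(D) <= p <= max(D).
    A differentially private solver must output such a p (with high
    probability) while protecting each individual element of D."""
    return min(database) <= p <= max(database)

def log_star(n, base=2):
    """Iterated logarithm log*(n): how many times log must be applied
    before the value drops to at most 1. Grows extremely slowly."""
    count = 0
    while n > 1:
        n = math.log(n, base)
        count += 1
    return count

D = [3, 7, 7, 12, 41]
assert threshold(x=10, y=7) == 1   # 7 <= 10
assert is_interior_point(7, D)     # 3 <= 7 <= 41
print(log_star(2**64))             # 5: log* of a 64-bit domain size
```

The point of the sketch is scale: $\log^*|X|$ is at most 5 for any domain indexable by 64-bit words, so the $\Omega(\log^*|X|)$ lower bound, while unbounded in principle, is minuscule in practice.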
Related papers
- Fingerprinting Codes Meet Geometry: Improved Lower Bounds for Private Query Release and Adaptive Data Analysis [25.476062424924713]
We propose a general framework for proving fingerprinting-type lower bounds that allows us to tailor the technique to the geometry of the query set.
We show that any (sample- and population-)accurate algorithm for answering $Q$ arbitrary adaptive counting queries over a universe $\mathcal{X}$ to accuracy $\alpha$ needs $\Omega\left(\frac{\sqrt{\log|\mathcal{X}|\,\log(1/\delta)}\,\log Q}{\varepsilon\alpha^2}\right)$ samples, matching known upper bounds up to constants.
arXiv Detail & Related papers (2024-12-18T23:11:07Z)
- Tight Bounds for Quantum Phase Estimation and Related Problems [0.90238471756546]
We show that a phase estimation algorithm with precision $\delta$ and error probability $\epsilon$ has cost $\Omega\left(\frac{1}{\delta}\log\frac{1}{\epsilon}\right)$.
We also show that having lots of advice (many applications of the advice-preparing unitary) does not significantly reduce cost, and neither does knowledge of the eigenbasis of $U$.
arXiv Detail & Related papers (2023-05-08T17:46:01Z)
- Õptimal Differentially Private Learning of Thresholds and Quasi-Concave Optimization [23.547605262139957]
The problem of learning threshold functions is a fundamental one in machine learning.
We provide a nearly tight upper bound of $\tilde{O}(\log^*|X|)$ on the sample complexity, improving the prior $\tilde{O}((\log^*|X|)^{1.5})$ bound of Kaplan et al.
We also provide matching upper and lower bounds of $\tilde{\Theta}(2^{\log^*|X|})$ for the additive error of private quasi-concave optimization (a related and more general problem).
arXiv Detail & Related papers (2022-11-11T18:16:46Z)
- Superpolynomial Lower Bounds for Decision Tree Learning and Testing [5.117810469621253]
We prove, under randomized ETH, superpolynomial lower bounds for two basic problems.
Our results imply new lower bounds for distribution-free PAC learning and testing of decision trees.
We show our lower bound for learning decision trees can be improved to $n^{\Omega(\log s)}$.
arXiv Detail & Related papers (2022-10-12T16:25:48Z)
- Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$ Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private algorithm is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z)
- Improved Sample Complexity for Incremental Autonomous Exploration in MDPs [132.88757893161699]
We learn the set of $\epsilon$-optimal goal-conditioned policies attaining all states that are incrementally reachable within $L$ steps.
DisCo is the first algorithm that can return an $\epsilon/c_{\min}$-optimal policy for any cost-sensitive shortest-path problem.
arXiv Detail & Related papers (2020-12-29T14:06:09Z)
- $Q$-learning with Logarithmic Regret [60.24952657636464]
We prove that an optimistic $Q$-learning algorithm enjoys a $\mathcal{O}\left(\frac{SA\cdot \mathrm{poly}(H)}{\Delta_{\min}}\log(SAT)\right)$ cumulative regret bound, where $S$ is the number of states, $A$ is the number of actions, $H$ is the planning horizon, $T$ is the total number of steps, and $\Delta_{\min}$ is the minimum sub-optimality gap.
arXiv Detail & Related papers (2020-06-16T13:01:33Z)
- Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, a discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For sufficiently small $\epsilon$, we show an algorithm with improved sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z)
- Private Learning of Halfspaces: Simplifying the Construction and Reducing the Sample Complexity [63.29100726064574]
We present a differentially private learner for halfspaces over a finite grid $G$ in $\mathbb{R}^d$ with sample complexity $\approx d^{2.5}\cdot 2^{\log^*|G|}$.
The building block for our learner is a new differentially private algorithm for approximately solving the linear feasibility problem.
arXiv Detail & Related papers (2020-04-16T16:12:10Z)
- Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to that of the best such distribution.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z)