Differentially Private Release and Learning of Threshold Functions
- URL: http://arxiv.org/abs/1504.07553v2
- Date: Thu, 19 Dec 2024 23:22:37 GMT
- Title: Differentially Private Release and Learning of Threshold Functions
- Authors: Mark Bun, Kobbi Nissim, Uri Stemmer, Salil Vadhan
- Abstract summary: We prove new upper and lower bounds on the sample complexity of $(\epsilon, \delta)$ differentially private algorithms.
A threshold function $c_x$ over a totally ordered domain $X$ evaluates to $c_x(y) = 1$ if $y \le x$, and evaluates to $0$ otherwise.
- Abstract: We prove new upper and lower bounds on the sample complexity of $(\epsilon, \delta)$ differentially private algorithms for releasing approximate answers to threshold functions. A threshold function $c_x$ over a totally ordered domain $X$ evaluates to $c_x(y) = 1$ if $y \le x$, and evaluates to $0$ otherwise. We give the first nontrivial lower bound for releasing thresholds with $(\epsilon,\delta)$ differential privacy, showing that the task is impossible over an infinite domain $X$, and moreover requires sample complexity $n \ge \Omega(\log^*|X|)$, which grows with the size of the domain. Inspired by the techniques used to prove this lower bound, we give an algorithm for releasing thresholds with $n \le 2^{(1+ o(1))\log^*|X|}$ samples. This improves the previous best upper bound of $8^{(1 + o(1))\log^*|X|}$ (Beimel et al., RANDOM '13). Our sample complexity upper and lower bounds also apply to the tasks of learning distributions with respect to Kolmogorov distance and of properly PAC learning thresholds with differential privacy. The lower bound gives the first separation between the sample complexity of properly learning a concept class with $(\epsilon,\delta)$ differential privacy and learning without privacy. For properly learning thresholds in $\ell$ dimensions, this lower bound extends to $n \ge \Omega(\ell \cdot \log^*|X|)$. To obtain our results, we give reductions in both directions between releasing and properly learning thresholds and the simpler interior point problem. Given a database $D$ of elements from $X$, the interior point problem asks for an element between the smallest and largest elements in $D$. We introduce new recursive constructions for bounding the sample complexity of the interior point problem, as well as further reductions and techniques for proving impossibility results for other basic problems in differential privacy.
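To make the abstract's two central objects concrete, here is a minimal, non-private Python sketch. It is illustrative only, not the paper's algorithm: the function names and the naive checks are hypothetical. It shows the threshold function $c_x$, the interior point condition, and the iterated logarithm $\log^*$ that governs the sample complexity bounds.

```python
import math

# Minimal, non-private sketch of the objects in the abstract.
# The names and naive checks are illustrative, not the paper's algorithm.

def threshold(x, y):
    """Threshold function c_x over a totally ordered domain:
    c_x(y) = 1 if y <= x, and 0 otherwise."""
    return 1 if y <= x else 0

def is_interior_point(p, database):
    """Interior point condition: min(D) <= p <= max(D).
    A differentially private solver must output such a p (with high
    probability) while protecting each individual element of D."""
    return min(database) <= p <= max(database)

def log_star(n, base=2):
    """Iterated logarithm log*(n): how many times log must be applied
    before the value drops to at most 1. Grows extremely slowly."""
    count = 0
    while n > 1:
        n = math.log(n, base)
        count += 1
    return count

D = [3, 7, 7, 12, 41]
assert threshold(x=10, y=7) == 1   # 7 <= 10
assert is_interior_point(7, D)     # 3 <= 7 <= 41
print(log_star(2**64))             # 5: log* of a 64-bit domain size
```

The point of the sketch is scale: $\log^*|X|$ is at most 5 for any domain indexable by 64-bit words, so the $\Omega(\log^*|X|)$ lower bound, while unbounded in principle, is minuscule in practice.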
Related papers
- Fingerprinting Codes Meet Geometry: Improved Lower Bounds for Private Query Release and Adaptive Data Analysis [25.476062424924713]
We propose a general framework for proving fingerprinting-type lower bounds that allows us to tailor the technique to the geometry of the query set.
We show that any (sample- and population-)accurate algorithm for answering $Q$ arbitrary adaptive counting queries over a universe $\mathcal{X}$ to accuracy $\alpha$ needs $\Omega\left(\frac{\sqrt{\log|\mathcal{X}|\,\log(1/\delta)}\,\log Q}{\varepsilon\alpha^2}\right)$ samples, matching known upper bounds up to constants.
arXiv Detail & Related papers (2024-12-18T23:11:07Z)
- Tight Bounds for Quantum Phase Estimation and Related Problems [0.90238471756546]
We show that a phase estimation algorithm with precision $\delta$ and error probability $\epsilon$ has cost $\Omega\left(\frac{1}{\delta}\log\frac{1}{\epsilon}\right)$.
We also show that having lots of advice (many applications of the advice-preparing unitary) does not significantly reduce cost, and neither does knowledge of the eigenbasis of $U$.
arXiv Detail & Related papers (2023-05-08T17:46:01Z)
- Õptimal Differentially Private Learning of Thresholds and Quasi-Concave Optimization [23.547605262139957]
The problem of learning threshold functions is a fundamental one in machine learning.
We provide a nearly tight upper bound of $\tilde{O}(\log^*|X|)$ on the sample complexity, improving the prior $\tilde{O}((\log^*|X|)^{1.5})$ bound of Kaplan et al.
We also provide matching upper and lower bounds of $\tilde{\Theta}(2^{\log^*|X|})$ for the additive error of private quasi-concave optimization (a related and more general problem).
arXiv Detail & Related papers (2022-11-11T18:16:46Z)
- Superpolynomial Lower Bounds for Decision Tree Learning and Testing [5.117810469621253]
We prove, under randomized ETH, superpolynomial lower bounds for two basic problems.
Our results imply new lower bounds for distribution-free PAC learning and testing of decision trees.
We show our lower bound for learning decision trees can be improved to $n^{\Omega(\log s)}$.
arXiv Detail & Related papers (2022-10-12T16:25:48Z)
- Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$ Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private algorithm is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z)
- Improved Sample Complexity for Incremental Autonomous Exploration in MDPs [132.88757893161699]
We learn the set of $\epsilon$-optimal goal-conditioned policies attaining all states that are incrementally reachable within $L$ steps.
DisCo is the first algorithm that can return an $\epsilon/c_{\min}$-optimal policy for any cost-sensitive shortest-path problem.
arXiv Detail & Related papers (2020-12-29T14:06:09Z)
- $Q$-learning with Logarithmic Regret [60.24952657636464]
We prove that an optimistic $Q$-learning algorithm enjoys a $\mathcal{O}\left(\frac{SA\cdot \mathrm{poly}(H)}{\Delta_{\min}}\log(SAT)\right)$ cumulative regret bound, where $S$ is the number of states, $A$ is the number of actions, $H$ is the planning horizon, $T$ is the total number of steps, and $\Delta_{\min}$ is the minimum sub-optimality gap.
arXiv Detail & Related papers (2020-06-16T13:01:33Z)
- Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, a discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For sufficiently small $\epsilon$, we show an algorithm with improved sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z)
- Private Learning of Halfspaces: Simplifying the Construction and Reducing the Sample Complexity [63.29100726064574]
We present a differentially private learner for halfspaces over a finite grid $G$ in $\mathbb{R}^d$ with sample complexity $\approx d^{2.5}\cdot 2^{\log^*|G|}$.
The building block for our learner is a new differentially private algorithm for approximately solving the linear feasibility problem.
arXiv Detail & Related papers (2020-04-16T16:12:10Z)
- Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to that of the best such distribution.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z)