On the Query Complexity of Training Data Reconstruction in Private
Learning
- URL: http://arxiv.org/abs/2303.16372v6
- Date: Thu, 11 Jan 2024 22:50:56 GMT
- Title: On the Query Complexity of Training Data Reconstruction in Private
Learning
- Authors: Prateeti Mukherjee and Satya Lokam
- Abstract summary: We analyze the number of queries that a whitebox adversary needs to make to a private learner in order to reconstruct its training data.
For $(\epsilon, \delta)$ DP learners with training data drawn from any arbitrary compact metric space, we provide the \emph{first known lower bounds} on the adversary's query complexity.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We analyze the number of queries that a whitebox adversary needs to make to a
private learner in order to reconstruct its training data. For $(\epsilon,
\delta)$ DP learners with training data drawn from any arbitrary compact metric
space, we provide the \emph{first known lower bounds on the adversary's query
complexity} as a function of the learner's privacy parameters. \emph{Our
results are minimax optimal for every $\epsilon \geq 0, \delta \in [0, 1]$,
covering both $\epsilon$-DP and $(0, \delta)$ DP as corollaries}. Beyond this,
we obtain query complexity lower bounds for $(\alpha, \epsilon)$ R\'enyi DP
learners that are valid for any $\alpha > 1, \epsilon \geq 0$. Finally, we
analyze data reconstruction attacks on locally compact metric spaces via the
framework of Metric DP, a generalization of DP that accounts for the underlying
metric structure of the data. In this setting, we provide the first known
analysis of data reconstruction in unbounded, high dimensional spaces and
obtain query complexity lower bounds that are nearly tight modulo logarithmic
factors.
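To make the setup concrete, here is a minimal toy simulation (not the paper's construction): a whitebox adversary repeatedly queries an $\epsilon$-DP learner, modeled as a Laplace mechanism releasing a noisy view of a single training datum, and averages the answers to reconstruct it. The function names and the averaging attack are illustrative assumptions; the paper's lower bounds quantify how the required number of queries must grow as $\epsilon$ shrinks.

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_learner_response(z: float, epsilon: float) -> float:
    # One epsilon-DP answer: Laplace noise calibrated to sensitivity 1.
    return z + rng.laplace(scale=1.0 / epsilon)

def reconstruct(z: float, epsilon: float, num_queries: int) -> float:
    # The adversary's estimate is the empirical mean of its query answers.
    answers = [dp_learner_response(z, epsilon) for _ in range(num_queries)]
    return float(np.mean(answers))

z_true = 0.37  # the private training datum the adversary targets
for eps in (0.1, 0.5, 1.0):
    for n in (10, 100, 10_000):
        err = abs(reconstruct(z_true, eps, n) - z_true)
        print(f"epsilon={eps:4.1f}  queries={n:6d}  |error|={err:.4f}")
```

Smaller $\epsilon$ forces the adversary to spend more queries for the same reconstruction error, which is the qualitative behaviour the paper's minimax bounds make precise.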
Related papers
- Nearly Optimal Differentially Private ReLU Regression [18.599299269974498]
We investigate one of the most fundamental nonconvex learning problems, ReLU regression, in the Differential Privacy (DP) model.
We show that it is possible to achieve an upper bound of $\tilde{O}(\frac{d^2}{N^2 \varepsilon^2})$.
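As a rough illustration of one standard route to private ReLU regression (generic DP gradient descent with per-example clipping and Gaussian noise, not the specific algorithm behind the paper's near-optimal rate), consider the sketch below; all names and hyperparameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def dp_relu_regression(X, y, steps=200, lr=0.1, clip=1.0, noise_mult=1.0):
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        pred = np.maximum(X @ w, 0.0)            # ReLU(<w, x_i>)
        active = (X @ w > 0.0).astype(float)     # ReLU derivative
        # Per-example gradients of the squared loss, shape (n, d).
        grads = (2.0 * (pred - y) * active)[:, None] * X
        # Clip each example's gradient to L2 norm <= clip.
        norms = np.linalg.norm(grads, axis=1, keepdims=True)
        grads = grads * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
        # Average, then add Gaussian noise scaled to the clip norm.
        noisy = grads.mean(axis=0) + rng.normal(
            scale=noise_mult * clip / n, size=d)
        w -= lr * noisy
    return w

# Synthetic check: recover a planted direction from noisy ReLU labels.
n, d = 2000, 5
X = rng.normal(size=(n, d))
w_star = rng.normal(size=d)
y = np.maximum(X @ w_star, 0.0) + 0.05 * rng.normal(size=n)
w_hat = dp_relu_regression(X, y)
```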
arXiv Detail & Related papers (2025-03-08T02:09:47Z)
- On the Dichotomy Between Privacy and Traceability in $\ell_p$ Stochastic Convex Optimization [34.23960368886818]
We investigate the necessity of memorization in stochastic convex optimization (SCO) under $\ell_p$ geometries.
Our main results uncover a fundamental tradeoff between traceability and excess risk in SCO.
arXiv Detail & Related papers (2025-02-24T18:10:06Z)
- Learning Locally Adaptive Metrics that Enhance Structural Representation with $\texttt{LAMINAR}$ [0.0]
$\texttt{LAMINAR}$ is an unsupervised machine learning pipeline designed to enhance the representation of structure within data.
It produces a locally adaptive metric that yields structurally informative, density-based distances.
We demonstrate the utility of $\texttt{LAMINAR}$ by comparing its output to the Euclidean metric on structured data sets.
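A minimal sketch of a density-informed distance in this spirit is the classic mutual-reachability distance used by HDBSCAN; it is not $\texttt{LAMINAR}$'s actual pipeline, only an illustration of how local density can reshape a metric.

```python
import numpy as np

def mutual_reachability(X: np.ndarray, k: int = 5) -> np.ndarray:
    """Pairwise density-based distances for the rows of X."""
    # Euclidean distance matrix.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Core distance: distance to the k-th nearest neighbour.
    core = np.sort(dist, axis=1)[:, k]
    # d_mreach(a, b) = max(core(a), core(b), d(a, b)).
    return np.maximum(dist, np.maximum(core[:, None], core[None, :]))

# Points in a dense cluster come out closer under this metric than
# equidistant points sitting in sparse regions.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(3, 1.0, (50, 2))])
D = mutual_reachability(X, k=5)
```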
arXiv Detail & Related papers (2024-11-13T12:13:15Z)
- Private Mean Estimation with Person-Level Differential Privacy [6.621676316292624]
We study person-level differentially private mean estimation in the case where each person holds multiple samples.
We give computationally efficient algorithms under approximate-DP and computationally inefficient algorithms under pure DP, and our nearly matching lower bounds hold for the most permissive case of approximate DP.
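A hedged sketch of the common clip-and-noise template for person-level DP mean estimation follows: each person contributes the average of their own samples, contributions are clipped, and Gaussian noise is added at a scale covering one entire person. The clip radius R and noise multiplier are assumed parameters, and the paper's algorithms are considerably more refined.

```python
import numpy as np

rng = np.random.default_rng(3)

def person_level_dp_mean(samples_per_person, R=5.0, noise_mult=1.0):
    """samples_per_person: list of arrays, one array per person."""
    person_means = np.array([s.mean(axis=0) for s in samples_per_person])
    # Clip each person's mean into the L2 ball of radius R.
    norms = np.linalg.norm(person_means, axis=1, keepdims=True)
    clipped = person_means * np.minimum(1.0, R / np.maximum(norms, 1e-12))
    n = len(samples_per_person)
    # Swapping one person moves the average by at most 2R/n in L2.
    noise = rng.normal(scale=noise_mult * 2 * R / n, size=clipped.shape[1])
    return clipped.mean(axis=0) + noise

people = [rng.normal(loc=[1.0, -2.0], scale=1.0, size=(20, 2))
          for _ in range(100)]
est = person_level_dp_mean(people)
```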
arXiv Detail & Related papers (2024-05-30T18:20:35Z)
- Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs [56.237917407785545]
We consider the problem of learning an $\varepsilon$-optimal policy in a general class of continuous-space Markov decision processes (MDPs) having smooth Bellman operators.
Key to our solution is a novel projection technique based on ideas from harmonic analysis.
Our result bridges the gap between two popular but conflicting perspectives on continuous-space MDPs.
arXiv Detail & Related papers (2024-05-10T09:58:47Z)
- A Combinatorial Approach to Robust PCA [18.740048806623037]
We study the problem of recovering Gaussian data under adversarial corruptions.
We assume that the Gaussian noise lies in an unknown $k$-dimensional subspace $U \subseteq \mathbb{R}^d$, and $s$ randomly chosen coordinates of each data point fall under the control of an adversary.
Our main result is an efficient algorithm that, when $ks^2 = O(d)$, recovers every single data point up to a nearly-optimal error of $\tilde{O}(ks/d)$ in expectation.
arXiv Detail & Related papers (2023-11-28T01:49:51Z)
- Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon > 0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z)
- Differentially Private Exploration in Reinforcement Learning with Linear Representation [102.17246636801649]
We first consider the setting of linear-mixture MDPs (Ayoub et al., 2020) (a.k.a. the model-based setting) and provide a unified framework for analyzing joint and local differentially private (DP) exploration.
We further study privacy-preserving exploration in linear MDPs (Jin et al., 2020) (a.k.a. the model-free setting), where we provide a $\widetilde{O}(\sqrt{K/\epsilon})$ regret bound for $(\epsilon, \delta)$-DP exploration.
arXiv Detail & Related papers (2021-12-02T19:59:50Z)
- Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
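As a baseline illustration of the interface (not the paper's algorithm, whose cost scales with the concentration parameter $\tau$ rather than the worst-case per-user sensitivity), one can answer $K$ adaptive queries by aggregating per-user statistics and adding Laplace noise under basic composition; the class and its parameters below are assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(4)

class UserLevelQueryAnswerer:
    def __init__(self, user_data, epsilon_total, num_queries, sensitivity):
        self.user_data = user_data            # list: one array per user
        self.eps_per_query = epsilon_total / num_queries  # basic composition
        self.sensitivity = sensitivity        # range of one user's statistic

    def answer(self, query):
        """query maps one user's data to a bounded scalar statistic."""
        per_user = np.array([query(u) for u in self.user_data])
        # Changing one user moves the mean by at most sensitivity / n.
        noise = rng.laplace(scale=self.sensitivity /
                            (len(self.user_data) * self.eps_per_query))
        return per_user.mean() + noise

users = [rng.normal(size=30) for _ in range(200)]
mech = UserLevelQueryAnswerer(users, epsilon_total=1.0,
                              num_queries=10, sensitivity=1.0)
ans = mech.answer(lambda u: float(np.clip(u.mean(), -0.5, 0.5)))
```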
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
- Gradient-Based Empirical Risk Minimization using Local Polynomial Regression [39.29885444997579]
A major goal of this literature has been to compare different algorithms, such as gradient descent (GD) and stochastic gradient descent (SGD).
We demonstrate that when the loss function is smooth in the data, we can learn the oracle at every iteration and beat the oracle complexities of both GD and SGD.
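To illustrate the underlying idea: when the loss is smooth, its gradient at a point can be estimated by fitting a degree-one local polynomial regression to nearby loss evaluations. The sketch below is a generic local-polynomial gradient estimator, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(5)

def lpr_gradient(loss, w, radius=1e-2, num_points=64):
    """Estimate grad loss(w) from function values in a small ball."""
    d = w.shape[0]
    U = rng.normal(size=(num_points, d))
    U *= radius / np.linalg.norm(U, axis=1, keepdims=True)
    y = np.array([loss(w + u) for u in U])
    # Fit loss(w + u) ~= a + <g, u> by least squares; g is the estimate.
    A = np.hstack([np.ones((num_points, 1)), U])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[1:]

# Sanity check on a quadratic, where the true gradient is 2w.
w0 = np.array([1.0, -2.0, 0.5])
g = lpr_gradient(lambda w: float(w @ w), w0)
```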
arXiv Detail & Related papers (2020-11-04T20:10:31Z) - Average-case Complexity of Teaching Convex Polytopes via Halfspace
Queries [55.28642461328172]
We show that the average-case teaching complexity is $\Theta(d)$, which is in sharp contrast to the worst-case teaching complexity of $\Theta(n)$.
Our insights allow us to establish a tight bound on the average-case complexity for $\phi$-separable dichotomies.
arXiv Detail & Related papers (2020-06-25T19:59:24Z)
- Private Learning of Halfspaces: Simplifying the Construction and Reducing the Sample Complexity [63.29100726064574]
We present a differentially private learner for halfspaces over a finite grid $G$ in $\mathbb{R}^d$ with sample complexity $\approx d^{2.5} \cdot 2^{\log^* |G|}$.
The building block for our learner is a new differentially private algorithm for approximately solving the linear feasibility problem.
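As a much-simplified stand-in (not the paper's grid-based linear-feasibility construction), the sketch below runs a perceptron whose aggregate update over misclassified points is clipped and perturbed with Gaussian noise; all parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)

def dp_perceptron(X, y, steps=100, lr=0.5, clip=1.0, noise_mult=1.0):
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        margins = y * (X @ w)
        mis = margins <= 0                     # misclassified points
        upd = (y[mis, None] * X[mis]).sum(axis=0) / n
        upd_norm = np.linalg.norm(upd)
        if upd_norm > clip:
            upd *= clip / upd_norm             # bound the sensitivity
        w += lr * (upd + rng.normal(scale=noise_mult * clip / n, size=d))
    return w

X = rng.normal(size=(500, 3))
w_star = np.array([1.0, -1.0, 2.0])
y = np.sign(X @ w_star)
w_hat = dp_perceptron(X, y)
```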
arXiv Detail & Related papers (2020-04-16T16:12:10Z)