Private Convex Optimization via Exponential Mechanism
- URL: http://arxiv.org/abs/2203.00263v1
- Date: Tue, 1 Mar 2022 06:51:03 GMT
- Title: Private Convex Optimization via Exponential Mechanism
- Authors: Sivakanth Gopi, Yin Tat Lee, Daogao Liu
- Abstract summary: We show that modifying the exponential mechanism by adding an $ellcave2$ regularizer to $F(x)$ recovers both the known optimal empirical risk and population loss under $(epsilon,delta)$-DP.
We also show how to implement this mechanism using $widetildeO(n min(d, n)) queries to $f_i(x) for the DP-SCO where $n$ is the number of samples/optimal and $d is the ambient dimension.
- Score: 16.867534746193833
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study private optimization problems for non-smooth convex
functions $F(x)=\mathbb{E}_i f_i(x)$ on $\mathbb{R}^d$. We show that modifying
the exponential mechanism by adding an $\ell_2^2$ regularizer to $F(x)$ and
sampling from $\pi(x)\propto \exp(-k(F(x)+\mu\|x\|_2^2/2))$ recovers both the
known optimal empirical risk and population loss under $(\epsilon,\delta)$-DP.
Furthermore, we show how to implement this mechanism using $\widetilde{O}(n
\min(d, n))$ queries to $f_i(x)$ for the DP-SCO where $n$ is the number of
samples/users and $d$ is the ambient dimension. We also give a (nearly)
matching lower bound $\widetilde{\Omega}(n \min(d, n))$ on the number of
evaluation queries.
Our results utilize the following tools that are of independent interest: (1)
We prove Gaussian Differential Privacy (GDP) of the exponential mechanism if
the loss function is strongly convex and the perturbation is Lipschitz. Our
privacy bound is \emph{optimal} as it includes the privacy of Gaussian
mechanism as a special case and is proved using the isoperimetric inequality
for strongly log-concave measures. (2) We show how to sample from
$\exp(-F(x)-\mu \|x\|^2_2/2)$ for $G$-Lipschitz $F$ with $\eta$ error in total
variation (TV) distance using $\widetilde{O}((G^2/\mu) \log^2(d/\eta))$
unbiased queries to $F(x)$. This is the first sampler whose query complexity
has \emph{polylogarithmic dependence} on both dimension $d$ and accuracy
$\eta$.
Related papers
- Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators on distributions over $mathRd suffer from a curse of dimensionality.
We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z) - Estimation and Inference in Distributional Reinforcement Learning [28.253677740976197]
We show that a dataset of size $widetilde Oleft(frac|mathcalS||mathcalA|epsilon2 (1-gamma)4right)$ suffices to ensure the Kolmogorov metric and total variation metric between $hatetapi$ and $etapi$ is below $epsilon$ with high probability.
Our findings give rise to a unified approach to statistical inference of a wide class of statistical functionals of $etapi$.
arXiv Detail & Related papers (2023-09-29T14:14:53Z) - A Unified Framework for Uniform Signal Recovery in Nonlinear Generative
Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $mathbfx*$ rather than for all $mathbfx*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $mathbfxmapstosigma(mathbfwcdotmathbfx)$ for monotone activations.
The goal of the learner is to output a hypothesis vector $mathbfw$ that $F(mathbbw)=C, epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Active Sampling for Linear Regression Beyond the $\ell_2$ Norm [70.49273459706546]
We study active sampling algorithms for linear regression, which aim to query only a small number of entries of a target vector.
We show that this dependence on $d$ is optimal, up to logarithmic factors.
We also provide the first total sensitivity upper bound $O(dmax1,p/2log2 n)$ for loss functions with at most degree $p$ growth.
arXiv Detail & Related papers (2021-11-09T00:20:01Z) - Unique Games hardness of Quantum Max-Cut, and a conjectured
vector-valued Borell's inequality [6.621324975749854]
We show that the noise stability of a function $f:mathbbRn to -1, 1$ is the expected value of $f(boldsymbolx) cdot f(boldsymboly)$.
We conjecture that the expected value of $langle f(boldsymbolx), f(boldsymboly)rangle$ is minimized by the function $f(x) = x_leq k / Vert x_leq k /
arXiv Detail & Related papers (2021-11-01T20:45:42Z) - Random matrices in service of ML footprint: ternary random features with
no performance loss [55.30329197651178]
We show that the eigenspectrum of $bf K$ is independent of the distribution of the i.i.d. entries of $bf w$.
We propose a novel random technique, called Ternary Random Feature (TRF)
The computation of the proposed random features requires no multiplication and a factor of $b$ less bits for storage compared to classical random features.
arXiv Detail & Related papers (2021-10-05T09:33:49Z) - Faster Rates of Differentially Private Stochastic Convex Optimization [7.93728520583825]
We study the case where the population risk function satisfies the Tysbakov Noise Condition (TNC) with some parameter $theta>1$.
In the second part, we focus on a special case where the population risk function is strongly convex.
arXiv Detail & Related papers (2021-07-31T22:23:39Z) - From Smooth Wasserstein Distance to Dual Sobolev Norm: Empirical
Approximation and Statistical Applications [18.618590805279187]
We show that $mathsfW_p(sigma)$ is controlled by a $pth order smooth dual Sobolev $mathsfd_p(sigma)$.
We derive the limit distribution of $sqrtnmathsfd_p(sigma)(hatmu_n,mu)$ in all dimensions $d$, when $mu$ is sub-Gaussian.
arXiv Detail & Related papers (2021-01-11T17:23:24Z) - Efficient Statistics for Sparse Graphical Models from Truncated Samples [19.205541380535397]
We focus on two fundamental and classical problems: (i) inference of sparse Gaussian graphical models and (ii) support recovery of sparse linear models.
For sparse linear regression, suppose samples $(bf x,y)$ are generated where $y = bf xtopOmega* + mathcalN(0,1)$ and $(bf x, y)$ is seen only if $y$ belongs to a truncation set $S subseteq mathbbRd$.
arXiv Detail & Related papers (2020-06-17T09:21:00Z) - Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(mathsfOPT1/2)+epsilon$.
For the ReLU activation, our population risk guarantee is $O(mathsfOPT1/2)+epsilon$.
arXiv Detail & Related papers (2020-05-29T07:20:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.