On the Sample Complexity of Privately Learning Unbounded
High-Dimensional Gaussians
- URL: http://arxiv.org/abs/2010.09929v1
- Date: Mon, 19 Oct 2020 23:55:03 GMT
- Title: On the Sample Complexity of Privately Learning Unbounded
High-Dimensional Gaussians
- Authors: Ishaq Aden-Ali, Hassan Ashtiani, Gautam Kamath
- Abstract summary: These are the first finite sample upper bounds for general Gaussians which do not impose restrictions on the parameters of the distribution.
From a technical standpoint, we provide analytic tools for arguing the existence of global "locally small" covers from local covers of the space.
Our techniques may prove useful for privately learning other distribution classes which do not possess a finite cover.
- Score: 13.517767928653443
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We provide sample complexity upper bounds for agnostically learning
multivariate Gaussians under the constraint of approximate differential
privacy. These are the first finite sample upper bounds for general Gaussians
which do not impose restrictions on the parameters of the distribution. Our
bounds are near-optimal in the case when the covariance is known to be the
identity, and conjectured to be near-optimal in the general case. From a
technical standpoint, we provide analytic tools for arguing the existence of
global "locally small" covers from local covers of the space. These are
exploited using modifications of recent techniques for differentially private
hypothesis selection. Our techniques may prove useful for privately learning
other distribution classes which do not possess a finite cover.
Related papers
- Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions.
We propose a novel algorithm that yields an varepsilon-optimal randomized hypothesis with a sample complexity on the order of (d+k)/varepsilon2.
arXiv Detail & Related papers (2023-12-08T16:06:29Z) - Riemannian stochastic optimization methods avoid strict saddle points [68.80251170757647]
We show that policies under study avoid strict saddle points / submanifolds with probability 1.
This result provides an important sanity check as it shows that, almost always, the limit state of an algorithm can only be a local minimizer.
arXiv Detail & Related papers (2023-11-04T11:12:24Z) - Generalized Schrödinger Bridge Matching [54.171931505066]
Generalized Schr"odinger Bridge (GSB) problem setup is prevalent in many scientific areas both within and without machine learning.
We propose Generalized Schr"odinger Bridge Matching (GSBM), a new matching algorithm inspired by recent advances.
We show that such a generalization can be cast as solving conditional optimal control, for which variational approximations can be used.
arXiv Detail & Related papers (2023-10-03T17:42:11Z) - A General Recipe for the Analysis of Randomized Multi-Armed Bandit Algorithms [14.33758865948252]
We revisit two famous bandit algorithms, Minimum Empirical Divergence (MED) and Thompson Sampling (TS)
We prove that MED is optimal for all these models, but also provide a simple regret analysis of some TS algorithms for which the optimality is already known.
arXiv Detail & Related papers (2023-03-10T16:43:48Z) - Simple Binary Hypothesis Testing under Local Differential Privacy and
Communication Constraints [8.261182037130407]
We study simple binary hypothesis testing under both local differential privacy (LDP) and communication constraints.
We qualify our results as either minimax optimal or instance optimal.
arXiv Detail & Related papers (2023-01-09T18:36:49Z) - Instance-Dependent Generalization Bounds via Optimal Transport [51.71650746285469]
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
We derive instance-dependent generalization bounds that depend on the local Lipschitz regularity of the learned prediction function in the data space.
We empirically analyze our generalization bounds for neural networks, showing that the bound values are meaningful and capture the effect of popular regularization methods during training.
arXiv Detail & Related papers (2022-11-02T16:39:42Z) - Learning Algorithm Generalization Error Bounds via Auxiliary Distributions [16.44492672878356]
Generalization error bounds are essential for comprehending how well machine learning models work.
In this work, we suggest a novel method, i.e., the Auxiliary Distribution Method, that leads to new upper bounds on expected generalization errors.
arXiv Detail & Related papers (2022-10-02T10:37:04Z) - New Lower Bounds for Private Estimation and a Generalized Fingerprinting
Lemma [10.176795938619417]
We prove new lower bounds for statistical estimation tasks under the constraint of $(varepsilon, delta)$-differential privacy.
We show that estimating the Frobenius norm requires $Omega(d2)$ samples, and in spectral norm requires $Omega(d3/2)$ samples, both matching upper bounds up to logarithmic factors.
arXiv Detail & Related papers (2022-05-17T17:55:10Z) - Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and
Beyond [63.59034509960994]
We study shuffling-based variants: minibatch and local Random Reshuffling, which draw gradients without replacement.
For smooth functions satisfying the Polyak-Lojasiewicz condition, we obtain convergence bounds which show that these shuffling-based variants converge faster than their with-replacement counterparts.
We propose an algorithmic modification called synchronized shuffling that leads to convergence rates faster than our lower bounds in near-homogeneous settings.
arXiv Detail & Related papers (2021-10-20T02:25:25Z) - Compressive Privatization: Sparse Distribution Estimation under Locally
Differentially Privacy [18.43218511751587]
We show that as long as the target distribution is sparse or approximately sparse, the number of samples needed could be significantly reduced.
Our mechanism does privatization and dimensionality reduction simultaneously, and the sample complexity will only depend on the reduced dimensionality.
arXiv Detail & Related papers (2020-12-03T17:14:23Z) - A Boundary Based Out-of-Distribution Classifier for Generalized
Zero-Shot Learning [83.1490247844899]
Generalized Zero-Shot Learning (GZSL) is a challenging topic that has promising prospects in many realistic scenarios.
We propose a boundary based Out-of-Distribution (OOD) classifier which classifies the unseen and seen domains by only using seen samples for training.
We extensively validate our approach on five popular benchmark datasets including AWA1, AWA2, CUB, FLO and SUN.
arXiv Detail & Related papers (2020-08-09T11:27:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.