Polynomial Time and Private Learning of Unbounded Gaussian Mixture Models
- URL: http://arxiv.org/abs/2303.04288v2
- Date: Wed, 7 Jun 2023 23:35:37 GMT
- Title: Polynomial Time and Private Learning of Unbounded Gaussian Mixture Models
- Authors: Jamil Arbas, Hassan Ashtiani and Christopher Liaw
- Abstract summary: We study the problem of privately estimating the parameters of $d$-dimensional Gaussian Mixture Models (GMMs) with $k$ components.
We develop a technique to reduce the problem to its non-private counterpart.
We develop an $(\varepsilon, \delta)$-differentially private algorithm to learn GMMs using the non-private algorithm of Moitra and Valiant [MV10] as a blackbox.
- Score: 9.679150363410471
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We study the problem of privately estimating the parameters of
$d$-dimensional Gaussian Mixture Models (GMMs) with $k$ components. For this,
we develop a technique to reduce the problem to its non-private counterpart.
This allows us to privatize existing non-private algorithms in a blackbox
manner, while incurring only a small overhead in the sample complexity and
running time. As the main application of our framework, we develop an
$(\varepsilon, \delta)$-differentially private algorithm to learn GMMs using
the non-private algorithm of Moitra and Valiant [MV10] as a blackbox.
Consequently, this gives the first sample complexity upper bound and first
polynomial time algorithm for privately learning GMMs without any boundedness
assumptions on the parameters. As part of our analysis, we prove a tight (up to
a constant factor) lower bound on the total variation distance of
high-dimensional Gaussians which can be of independent interest.
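For context on the total variation claim, the snippet below records the classical closed form for the special case of two spherical Gaussians with a common covariance. This is a well-known identity included only for intuition; it is not the paper's general lower bound, which covers arbitrary covariance matrices.

```latex
% Known special case: spherical Gaussians sharing covariance \sigma^2 I_d.
% The TV distance reduces to a one-dimensional computation along \mu_1 - \mu_2.
\[
  \mathrm{TV}\bigl(\mathcal{N}(\mu_1, \sigma^2 I_d),\, \mathcal{N}(\mu_2, \sigma^2 I_d)\bigr)
  \;=\; 2\,\Phi\!\left(\frac{\lVert \mu_1 - \mu_2 \rVert_2}{2\sigma}\right) - 1,
  \qquad \Phi = \text{standard normal CDF.}
\]
% The paper's contribution is a tight (up to a constant factor) lower bound for
% general, possibly unequal covariance matrices.
```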
Related papers
- An Efficient 1 Iteration Learning Algorithm for Gaussian Mixture Model And Gaussian Mixture Embedding For Neural Network [2.261786383673667]
The new algorithm brings more robustness and simplicity than the classic Expectation Maximization (EM) algorithm.
It also improves accuracy and takes only one iteration to learn.
arXiv Detail & Related papers (2023-08-18T10:17:59Z)
- Fast Optimal Locally Private Mean Estimation via Random Projections [58.603579803010796]
We study the problem of locally private mean estimation of high-dimensional vectors in the Euclidean ball.
We propose a new algorithmic framework, ProjUnit, for private mean estimation.
Our framework is deceptively simple: each randomizer projects its input to a random low-dimensional subspace, normalizes the result, and then runs an optimal algorithm.
arXiv Detail & Related papers (2023-06-07T14:07:35Z)
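To make the ProjUnit description above concrete, here is a hedged Python sketch of the project-then-normalize randomizer. The projection dimension, helper names, and the toy noise mechanism standing in for the optimal low-dimensional randomizer are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def projunit_style_randomizer(x, k, low_dim_mechanism, rng):
    """Sketch of the ProjUnit pattern: project a unit vector onto a random
    k-dimensional subspace, renormalize, then apply a low-dimensional
    randomizer.  Illustrative only; not the paper's exact construction."""
    d = x.shape[0]
    # Johnson-Lindenstrauss-style random projection with i.i.d. Gaussian entries.
    W = rng.normal(size=(k, d)) / np.sqrt(k)
    y = W @ x
    y = y / np.linalg.norm(y)            # back onto the unit sphere in R^k
    return W, low_dim_mechanism(y, rng)  # server needs W (or its seed) to decode

def toy_low_dim_mechanism(y, rng, sigma=0.5):
    """Placeholder for the 'optimal algorithm' run in the projected space;
    adds Gaussian noise purely for illustration (no privacy calibration here)."""
    return y + rng.normal(scale=sigma, size=y.shape)

rng = np.random.default_rng(0)
x = rng.normal(size=1024)
x = x / np.linalg.norm(x)                # client inputs live on the unit sphere
W, z = projunit_style_randomizer(x, k=32, low_dim_mechanism=toy_low_dim_mechanism, rng=rng)
```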
- Learning Hidden Markov Models Using Conditional Samples [72.20944611510198]
This paper is concerned with the computational complexity of learning the Hidden Markov Model (HMM).
In this paper, we consider an interactive access model, in which the algorithm can query for samples from the conditional distributions of the HMMs.
Specifically, we obtain efficient algorithms for learning HMMs in settings where we have query access to the exact conditional probabilities.
arXiv Detail & Related papers (2023-02-28T16:53:41Z)
- Private estimation algorithms for stochastic block models and mixture models [63.07482515700984]
We provide general tools for designing efficient private estimation algorithms.
We give the first efficient $(\epsilon, \delta)$-differentially private algorithm for both weak recovery and exact recovery.
arXiv Detail & Related papers (2023-01-11T09:12:28Z)
- Scalable Differentially Private Clustering via Hierarchically Separated Trees [82.69664595378869]
We show that our method computes a solution with cost at most $O(d^{3/2}\log n)\cdot OPT + O(k d^2 \log^2 n / \epsilon^2)$, where $\epsilon$ is the privacy guarantee.
Although the worst-case guarantee is worse than that of state of the art private clustering methods, the algorithm we propose is practical.
arXiv Detail & Related papers (2022-06-17T09:24:41Z)
- Efficient Mean Estimation with Pure Differential Privacy via a Sum-of-Squares Exponential Mechanism [16.996435043565594]
We give the first polynomial-time algorithm to estimate the mean of a $d$-variate probability distribution with bounded covariance from $\tilde{O}(d)$ independent samples subject to pure differential privacy.
Our main technique is a new approach for using the powerful Sum-of-Squares (SoS) method to design differentially private algorithms.
arXiv Detail & Related papers (2021-11-25T09:31:15Z)
- Robust Model Selection and Nearly-Proper Learning for GMMs [26.388358539260473]
In learning theory, a standard assumption is that the data is generated from a finite mixture model. But what happens when the number of components is not known in advance?
We are able to approximately determine the minimum number of components needed to fit the distribution within a logarithmic factor.
arXiv Detail & Related papers (2021-06-05T01:58:40Z)
- Differentially Private ADMM Algorithms for Machine Learning [38.648113004535155]
We study efficient differentially private alternating direction methods of multipliers (ADMM) via gradient perturbation.
We propose the first differentially private ADMM (DP-ADMM) algorithm with performance guarantee of $(\epsilon,\delta)$-differential privacy.
arXiv Detail & Related papers (2020-10-31T01:37:24Z)
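As a rough illustration of the gradient-perturbation idea behind DP-ADMM, the sketch below shows a single clipped, noised gradient step. The function name and constants are assumptions for illustration, the noise scale is not calibrated to any $(\epsilon,\delta)$ budget, and this is not the paper's algorithm.

```python
import numpy as np

def noisy_gradient_step(w, grad, lr, clip_norm, sigma, rng):
    """One gradient-perturbation update: clip the gradient to bound sensitivity,
    add Gaussian noise, then take a step.  DP-ADMM applies this kind of update
    inside its subproblems; calibrating sigma to an (epsilon, delta) budget via
    the Gaussian mechanism is omitted, so this sketch is NOT private as written."""
    scale = min(1.0, clip_norm / max(np.linalg.norm(grad), 1e-12))
    noisy = grad * scale + rng.normal(scale=sigma * clip_norm, size=grad.shape)
    return w - lr * noisy
```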
- Robustly Learning any Clusterable Mixture of Gaussians [55.41573600814391]
We study the efficient learnability of high-dimensional Gaussian mixtures in the adversarial-robust setting.
We provide an algorithm that learns the components of an $\epsilon$-corrupted $k$-mixture within information-theoretically near-optimal error of $\tilde{O}(\epsilon)$.
Our main technical contribution is a new robust identifiability proof of clusters from a Gaussian mixture, which can be captured by the constant-degree Sum of Squares proof system.
arXiv Detail & Related papers (2020-05-13T16:44:12Z)
- Learning Gaussian Graphical Models via Multiplicative Weights [54.252053139374205]
We adapt an algorithm of Klivans and Meka based on the method of multiplicative weight updates.
The algorithm enjoys a sample complexity bound that is qualitatively similar to others in the literature.
It has a low runtime of $O(mp^2)$ in the case of $m$ samples and $p$ nodes, and can trivially be implemented in an online manner.
arXiv Detail & Related papers (2020-02-20T10:50:58Z)
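To illustrate the multiplicative-weights updates mentioned in the last entry, here is a hedged Python sketch in the spirit of the Klivans-Meka Sparsitron: it learns a sparse predictor of one node from the remaining ones, and running it once per node gives the per-node flavor of the $O(mp^2)$ runtime. The scaling parameter, learning rate, and input assumptions are illustrative, not the paper's exact algorithm.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sparsitron_style_mw(X, y, lam, beta=0.9):
    """Multiplicative-weights sketch in the spirit of the Klivans-Meka Sparsitron:
    learn a sparse generalized-linear predictor of one variable from the others.
    Assumes features in [-1, 1] and labels in [0, 1]; illustrative only."""
    m, p = X.shape
    w = np.ones(2 * p) / (2 * p)                       # distribution over signed features
    for t in range(m):                                 # single online pass over samples
        feats = np.concatenate([X[t], -X[t]])
        pred = sigmoid(lam * np.dot(w, feats))
        losses = (1.0 + (pred - y[t]) * feats) / 2.0   # each loss lies in [0, 1]
        w = w * (beta ** losses)                       # multiplicative (Hedge) update
        w = w / w.sum()
    return lam * (w[:p] - w[p:])                       # signed weights -> coefficients
```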