Robust Model Selection and Nearly-Proper Learning for GMMs
- URL: http://arxiv.org/abs/2106.02774v2
- Date: Sun, 23 Apr 2023 02:07:39 GMT
- Title: Robust Model Selection and Nearly-Proper Learning for GMMs
- Authors: Jerry Li, Allen Liu, Ankur Moitra
- Abstract summary: In learning theory, a standard assumption is that the data is generated from a finite mixture model. But what happens when the number of components is not known in advance?
We are able to approximately determine the minimum number of components needed to fit the distribution within a logarithmic factor.
- Score: 26.388358539260473
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In learning theory, a standard assumption is that the data is generated from
a finite mixture model. But what happens when the number of components is not
known in advance? The problem of estimating the number of components, also
called model selection, is important in its own right, but there are essentially
no known efficient algorithms with provable guarantees, let alone ones that can
tolerate adversarial corruptions. In this work, we study the problem of robust
model selection for univariate Gaussian mixture models (GMMs). Given
$\textsf{poly}(k/\epsilon)$ samples from a distribution that is
$\epsilon$-close in TV distance to a GMM with $k$ components, we can construct
a GMM with $\widetilde{O}(k)$ components that approximates the distribution to
within $\widetilde{O}(\epsilon)$ in $\textsf{poly}(k/\epsilon)$ time. Thus we
are able to approximately determine the minimum number of components needed to
fit the distribution within a logarithmic factor. Prior to our work, the only
known algorithms for learning arbitrary univariate GMMs either output
significantly more than $k$ components (e.g. $k/\epsilon^2$ components for
kernel density estimates) or run in time exponential in $k$. Moreover, by
adapting our techniques we obtain similar results for reconstructing
Fourier-sparse signals.
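To make the model-selection task concrete, below is a minimal Python sketch of the naive baseline, not the paper's algorithm: it fits univariate GMMs of increasing size with EM (scikit-learn's GaussianMixture) and returns the smallest number of components whose fit is close to the empirical distribution. The helpers gmm_cdf, fit_distance, and select_components, the tolerance eps, and the cap k_max are illustrative assumptions; EM carries none of the paper's robustness or $\textsf{poly}(k/\epsilon)$-time guarantees.

```python
# A minimal sketch of the model-selection task, NOT the paper's algorithm:
# greedily fit univariate GMMs of increasing size with EM and stop at the
# first fit that is close to the empirical distribution. The tolerance `eps`
# and cap `k_max` are illustrative assumptions; EM has no guarantees against
# the adversarial corruptions the paper tolerates.
import numpy as np
from scipy.stats import norm
from sklearn.mixture import GaussianMixture

def gmm_cdf(gmm, x):
    """CDF of a fitted 1-D GaussianMixture at the points x."""
    w = gmm.weights_
    mu = gmm.means_.ravel()
    sd = np.sqrt(gmm.covariances_).ravel()
    return np.sum(w * norm.cdf((x[:, None] - mu) / sd), axis=1)

def fit_distance(samples, gmm):
    """Kolmogorov statistic sup_x |ECDF(x) - model CDF(x)|: a lower bound
    on TV distance, used here only as a heuristic stopping rule."""
    x = np.sort(samples)
    ecdf = np.arange(1, len(x) + 1) / len(x)
    return np.max(np.abs(ecdf - gmm_cdf(gmm, x)))

def select_components(samples, eps=0.05, k_max=20, seed=0):
    """Return the smallest k <= k_max whose EM fit is eps-close, with the fit."""
    X = samples.reshape(-1, 1)
    for k in range(1, k_max + 1):
        gmm = GaussianMixture(n_components=k, n_init=3, random_state=seed).fit(X)
        if fit_distance(samples, gmm) <= eps:
            return k, gmm
    return k_max, gmm

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-3.0, 1.0, 5000), rng.normal(2.0, 0.5, 5000)])
k, fit = select_components(data)
print("selected number of components:", k)  # expect 2 on clean data
```

On an $\epsilon$-corrupted sample, this baseline can be forced to add spurious components to chase the corruptions; that failure mode is exactly what the paper's $\widetilde{O}(k)$-component, $\widetilde{O}(\epsilon)$-error guarantee rules out.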
Related papers
- Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators on distributions over $\mathbb{R}^d$ suffer from a curse of dimensionality.
We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z)
- Agnostically Learning Multi-index Models with Queries [54.290489524576756]
We study the power of query access for the task of agnostic learning under the Gaussian distribution.
We show that query access gives significant runtime improvements over random examples for agnostically learning MIMs.
arXiv Detail & Related papers (2023-12-27T15:50:47Z)
- A Unified Framework for Uniform Signal Recovery in Nonlinear Generative Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z)
- Sparse Gaussian Graphical Models with Discrete Optimization: Computational and Statistical Perspectives [8.403841349300103]
We consider the problem of learning a sparse graph underlying an undirected Gaussian graphical model.
We propose GraphL0BnB, a new estimator based on an $\ell_0$-penalized version of the pseudolikelihood function.
Our numerical experiments on real/synthetic datasets suggest that our method can solve, to near-optimality, problem instances with $p = 10^4$.
arXiv Detail & Related papers (2023-07-18T15:49:02Z)
- A Spectral Algorithm for List-Decodable Covariance Estimation in Relative Frobenius Norm [41.03423042792559]
We produce a list of hypotheses that are close to $\Sigma$ in relative Frobenius norm.
As a corollary, we obtain an efficient spectral algorithm for robust partial clustering of Gaussian mixture models.
Our new method yields the first Sum-of-Squares-free algorithm for robustly learning GMMs.
arXiv Detail & Related papers (2023-05-01T17:54:35Z)
- A Fourier Approach to Mixture Learning [46.995354373649675]
We give an algorithm that efficiently learns the means in $d = O(\log k/\log\log k)$ dimensions under separation $d/\sqrt{\log k}$ (modulo factors).
Our results can be easily extended to learning mixtures of non-Gaussian distributions, under a mild condition on the Fourier spectrum of the distribution.
arXiv Detail & Related papers (2022-10-05T17:35:46Z)
- Near-optimal fitting of ellipsoids to random points [68.12685213894112]
A basic problem of fitting an ellipsoid to random points has connections to low-rank matrix decompositions, independent component analysis, and principal component analysis.
We resolve this conjecture up to logarithmic factors by constructing a fitting ellipsoid for some $n = \Omega(\,d^2/\mathrm{polylog}(d)\,)$.
Our proof demonstrates feasibility of the least squares construction of Saunderson et al. using a convenient decomposition of a certain non-standard random matrix.
arXiv Detail & Related papers (2022-08-19T18:00:34Z)
- Tight Bounds on the Hardness of Learning Simple Nonparametric Mixtures [9.053430799456587]
We study the problem of learning nonparametric distributions in a finite mixture.
We establish tight bounds on the sample complexity for learning the component distributions in such models.
arXiv Detail & Related papers (2022-03-28T23:53:48Z)
- List-Decodable Mean Estimation in Nearly-PCA Time [50.79691056481693]
We study the fundamental task of list-decodable mean estimation in high dimensions.
Our algorithm runs in time $\widetilde{O}(ndk)$ for all $k = O(\sqrt{d}) \cup \Omega(d)$, where $n$ is the size of the dataset.
A variant of our algorithm has runtime $\widetilde{O}(ndk)$ for all $k$, at the expense of an $O(\sqrt{\log k})$ factor in the recovery guarantee.
arXiv Detail & Related papers (2020-11-19T17:21:37Z)
- Robustly Learning any Clusterable Mixture of Gaussians [55.41573600814391]
We study the efficient learnability of high-dimensional Gaussian mixtures in the adversarial-robust setting.
We provide an algorithm that learns the components of an $\epsilon$-corrupted $k$-mixture to within information-theoretically near-optimal error of $\widetilde{O}(\epsilon)$.
Our main technical contribution is a new robust identifiability proof of clusters from a Gaussian mixture, which can be captured by the constant-degree Sum of Squares proof system.
arXiv Detail & Related papers (2020-05-13T16:44:12Z)