Tight Bounds on the Hardness of Learning Simple Nonparametric Mixtures
- URL: http://arxiv.org/abs/2203.15150v3
- Date: Tue, 4 Jul 2023 17:26:07 GMT
- Title: Tight Bounds on the Hardness of Learning Simple Nonparametric Mixtures
- Authors: Bryon Aragam, Wai Ming Tai
- Abstract summary: We study the problem of learning nonparametric distributions in a finite mixture.
We establish tight bounds on the sample complexity for learning the component distributions in such models.
- Score: 9.053430799456587
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of learning nonparametric distributions in a finite
mixture, and establish tight bounds on the sample complexity for learning the
component distributions in such models. Namely, we are given i.i.d. samples
from a pdf $f$ where $$ f=w_1f_1+w_2f_2, \quad w_1+w_2=1, \quad w_1,w_2>0 $$
and we are interested in learning each component $f_i$. Without any assumptions
on $f_i$, this problem is ill-posed. In order to identify the components $f_i$,
we assume that each $f_i$ can be written as a convolution of a Gaussian and a
compactly supported density $\nu_i$ with $\text{supp}(\nu_1)\cap
\text{supp}(\nu_2)=\emptyset$.
Our main result shows that $(\frac{1}{\varepsilon})^{\Omega(\log\log
\frac{1}{\varepsilon})}$ samples are required for estimating each $f_i$. The
proof relies on a quantitative Tauberian theorem that yields a fast rate of
approximation with Gaussians, which may be of independent interest. To show
this is tight, we also propose an algorithm that uses
$(\frac{1}{\varepsilon})^{O(\log\log \frac{1}{\varepsilon})}$ samples to
estimate each $f_i$. Unlike existing approaches to learning latent variable
models based on moment-matching and tensor methods, our proof instead involves
a delicate analysis of an ill-conditioned linear system via orthogonal
functions. Combining these bounds, we conclude that the optimal sample
complexity of this problem properly lies in between polynomial and exponential,
which is not common in learning theory.
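To make the model in the abstract concrete, below is a minimal sketch (not from the paper) of drawing i.i.d. samples from such a two-component mixture. The uniform choices of $\nu_1,\nu_2$ on disjoint intervals, the weight $w_1=0.6$, and $\sigma=1$ are illustrative assumptions; any compactly supported densities with disjoint supports would fit the identifiability condition.

```python
import numpy as np

def sample_mixture(n, w1=0.6, sigma=1.0, seed=0):
    """Draw n i.i.d. samples from f = w1*f1 + w2*f2, where each f_i is the
    convolution of N(0, sigma^2) with a compactly supported density nu_i.
    Here nu_1 = Uniform[-2, -1] and nu_2 = Uniform[1, 2], so supp(nu_1) and
    supp(nu_2) are disjoint, as the identifiability assumption requires."""
    rng = np.random.default_rng(seed)
    comp1 = rng.random(n) < w1                     # pick component 1 w.p. w1
    latent = np.where(comp1,
                      rng.uniform(-2.0, -1.0, n),  # draw the latent location from nu_1
                      rng.uniform(1.0, 2.0, n))    # ... or from nu_2
    return latent + rng.normal(0.0, sigma, n)      # convolve with N(0, sigma^2)

samples = sample_mixture(10_000)
```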
Related papers
- A Unified Framework for Uniform Signal Recovery in Nonlinear Generative
Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - A Fourier Approach to Mixture Learning [46.995354373649675]
We give an algorithm that efficiently learns the means in $d = O(\log k/\log\log k)$ dimensions under separation $d/\sqrt{\log k}$ (modulo factors); a toy one-dimensional illustration of the Fourier idea appears after this list.
Our results can be easily extended to learning mixtures of non-Gaussian distributions, under a mild condition on the Fourier spectrum of the distribution.
arXiv Detail & Related papers (2022-10-05T17:35:46Z) - Sample Complexity Bounds for Learning High-dimensional Simplices in
Noisy Regimes [5.526935605535376]
We find a sample complexity bound for learning a simplex from noisy samples.
We show that as long as $\mathrm{SNR} \ge \Omega(K^{1/2})$, the sample complexity in the noisy regime has the same order as that of the noiseless case.
arXiv Detail & Related papers (2022-09-09T23:35:25Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $\mathbf{x} \mapsto \sigma(\mathbf{w} \cdot \mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C \cdot \mathrm{OPT} + \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals.
Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z) - Statistically Near-Optimal Hypothesis Selection [33.83129262033921]
We derive an optimal $2$-approximation learning strategy for the Hypothesis Selection problem.
This is the first algorithm that simultaneously achieves the best approximation factor and sample complexity.
arXiv Detail & Related papers (2021-08-17T21:11:20Z) - Structured Logconcave Sampling with a Restricted Gaussian Oracle [23.781520510778716]
We give algorithms for sampling several structured logconcave families to high accuracy.
We further develop a reduction framework, inspired by proximal point methods in convex optimization.
arXiv Detail & Related papers (2020-10-07T01:43:07Z) - Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample
Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For small enough $\epsilon$, we show an algorithm with improved sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z) - Robustly Learning any Clusterable Mixture of Gaussians [55.41573600814391]
We study the efficient learnability of high-dimensional Gaussian mixtures in the adversarial-robust setting.
We provide an algorithm that learns the components of an $\epsilon$-corrupted $k$-mixture within information-theoretically near-optimal error of $\tilde{O}(\epsilon)$.
Our main technical contribution is a new robust identifiability proof of clusters from a Gaussian mixture, which can be captured by the constant-degree Sum of Squares proof system.
arXiv Detail & Related papers (2020-05-13T16:44:12Z)
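As referenced in the Fourier-approach entry above, here is a hypothetical one-dimensional illustration of the general characteristic-function idea; it is not that paper's algorithm, and the separation, sample size, and grid resolutions are assumptions chosen for the example. It deconvolves the known unit-variance Gaussian from the empirical characteristic function and reads off the component means as peaks of the inverted signal.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: 1-D mixture of two unit-variance Gaussians with well-separated means.
true_means = np.array([-3.0, 3.0])
x = rng.normal(true_means[rng.integers(0, 2, 20_000)], 1.0)

# Empirical characteristic function on a frequency grid.
t = np.linspace(-3.0, 3.0, 121)
ecf = np.exp(1j * np.outer(t, x)).mean(axis=1)

# Divide out the Gaussian factor e^{-t^2/2}; what remains is approximately
# w1*exp(i*t*mu1) + w2*exp(i*t*mu2).
deconv = ecf * np.exp(t ** 2 / 2)

# Inverse transform onto a spatial grid; the result peaks near the means.
grid = np.linspace(-6.0, 6.0, 601)
recovered = (deconv[:, None] * np.exp(-1j * np.outer(t, grid))).real.mean(axis=0)
mean_estimates = [grid[grid < 0][recovered[grid < 0].argmax()],
                  grid[grid > 0][recovered[grid > 0].argmax()]]
# mean_estimates should land close to the true means -3 and 3.
```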