Pair Correlation Factor and the Sample Complexity of Gaussian Mixtures
- URL: http://arxiv.org/abs/2508.03633v1
- Date: Tue, 05 Aug 2025 16:50:33 GMT
- Title: Pair Correlation Factor and the Sample Complexity of Gaussian Mixtures
- Authors: Farzad Aryan
- Abstract summary: We introduce the \emph{Pair Correlation Factor} (PCF), a geometric quantity capturing the clustering of component means. Unlike the minimum gap, the PCF more accurately dictates the difficulty of parameter recovery. In the uniform spherical case, we give an algorithm with improved sample complexity bounds, showing when more than the usual $\epsilon^{-2}$ samples are necessary.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the problem of learning Gaussian Mixture Models (GMMs) and ask: which structural properties govern their sample complexity? Prior work has largely tied this complexity to the minimum pairwise separation between components, but we demonstrate this view is incomplete. We introduce the \emph{Pair Correlation Factor} (PCF), a geometric quantity capturing the clustering of component means. Unlike the minimum gap, the PCF more accurately dictates the difficulty of parameter recovery. In the uniform spherical case, we give an algorithm with improved sample complexity bounds, showing when more than the usual $\epsilon^{-2}$ samples are necessary.
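The abstract does not reproduce the PCF's definition, so the sketch below is only a toy illustration of the idea it describes: unlike the minimum pairwise gap, a pair-correlation-style statistic aggregates all pairwise distances between component means, so it can distinguish mean configurations that share the same minimum separation. The function pair_correlation_factor and its inverse-power aggregation are hypothetical stand-ins for this illustration, not the paper's definition.

```python
import numpy as np

def min_separation(means):
    """Minimum pairwise distance between component means."""
    k = len(means)
    return min(np.linalg.norm(means[i] - means[j])
               for i in range(k) for j in range(i + 1, k))

def pair_correlation_factor(means, power=2):
    """Hypothetical PCF-style statistic (not the paper's definition):
    aggregate all pairwise distances via inverse powers, so that means
    with many near neighbors contribute far more than isolated ones."""
    k = len(means)
    return sum(1.0 / np.linalg.norm(means[i] - means[j]) ** power
               for i in range(k) for j in range(i + 1, k))

# Two configurations of k = 4 component means with the same minimum gap (= 1):
# in `compact` every mean has several near neighbors, while in `two_pairs`
# only one close pair sits inside each of two far-apart groups.
compact = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
two_pairs = np.array([[0.0, 0.0], [1.0, 0.0], [20.0, 0.0], [21.0, 0.0]])

for name, means in [("compact", compact), ("two_pairs", two_pairs)]:
    print(f"{name}: min gap = {min_separation(means):.2f}, "
          f"PCF-style statistic = {pair_correlation_factor(means):.2f}")
```

The minimum gap is identical in both configurations, so any bound driven solely by the minimum separation treats them as equally hard; a statistic over all pairs of means does not, which is the incompleteness the abstract points to.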
Related papers
- Robust Learnability of Sample-Compressible Distributions under Noisy or Adversarial Perturbations [0.723486289593299]
In 2018, Ashtiani et al. reframed \emph{sample compressibility}, originally due to Littlestone and Warmuth (1986), as a structural property of distribution classes.
We establish that sample-compressible families remain learnable even from perturbed samples, subject to a set of necessary and sufficient conditions.
arXiv Detail & Related papers (2025-06-07T01:11:50Z)
- Nearly Optimal Sample Complexity for Learning with Label Proportions [54.67830198790247]
We investigate Learning from Label Proportions (LLP), a partial information setting where examples in a training set are grouped into bags.
Despite the partial observability, the goal is still to achieve small regret at the level of individual examples.
We give results on the sample complexity of LLP under square loss, showing that our sample complexity is essentially optimal.
arXiv Detail & Related papers (2025-05-08T15:45:23Z)
- Do we really need the Rademacher complexities? [3.683202928838613]
We study the fundamental problem of learning with respect to the squared loss in a convex class.
We prove that, contrary to prevailing belief and under minimal assumptions, the sample complexity is not governed by the Rademacher complexities.
arXiv Detail & Related papers (2025-02-21T00:40:22Z)
- Information-Computation Tradeoffs for Learning Margin Halfspaces with Random Classification Noise [50.64137465792738]
We study the problem of PAC learning $\gamma$-margin halfspaces with Random Classification Noise.
We establish an information-computation tradeoff suggesting an inherent gap between the sample complexity of the problem and the sample complexity of computationally efficient algorithms.
arXiv Detail & Related papers (2023-06-28T16:33:39Z)
- Sample Complexity of Probability Divergences under Group Symmetry [13.084804346845816]
We rigorously quantify the improvement in the sample complexity of variational divergence estimations for group-invariant distributions.
In the cases of the Wasserstein-1 metric and the Lipschitz-regularized $\alpha$-divergences, the reduction of sample complexity is proportional to the group size if the group is finite.
arXiv Detail & Related papers (2023-02-03T18:50:15Z)
- On-Demand Sampling: Learning Optimally from Multiple Distributions [63.20009081099896]
Social and real-world considerations have given rise to multi-distribution learning paradigms.
We establish the optimal sample complexity of these learning paradigms and give algorithms that meet this sample complexity.
Our algorithm design and analysis are enabled by our extensions of online learning techniques for solving zero-sum games.
arXiv Detail & Related papers (2022-10-22T19:07:26Z)
- Forster Decomposition and Learning Halfspaces with Noise [60.691817861402676]
A Forster transform is an operation that turns a distribution into one with good anti-concentration properties.
We show that any distribution can be decomposed efficiently as a disjoint mixture of few distributions for which a Forster transform exists and can be computed efficiently.
arXiv Detail & Related papers (2021-07-12T17:00:59Z)
- The Sample Complexity of Distribution-Free Parity Learning in the Robust Shuffle Model [11.821892196198457]
We show that the sample complexity of learning $d$-bit parity functions is $\Omega(2^{d/2})$.
We also sketch a simple shuffle model protocol demonstrating that our results are tight up to $\mathrm{poly}(d)$ factors.
arXiv Detail & Related papers (2021-03-29T15:26:02Z)
- Revisiting the Sample Complexity of Sparse Spectrum Approximation of Gaussian Processes [60.479499225746295]
We introduce a new scalable approximation for Gaussian processes with provable guarantees which hold simultaneously over its entire parameter space.
Our approximation is obtained from an improved sample complexity analysis for sparse spectrum Gaussian processes (SSGPs).
arXiv Detail & Related papers (2020-11-17T05:41:50Z)
- Robustly Learning any Clusterable Mixture of Gaussians [55.41573600814391]
We study the efficient learnability of high-dimensional Gaussian mixtures in the adversarial-robust setting.
We provide an algorithm that learns the components of an $\epsilon$-corrupted $k$-mixture within information-theoretically near-optimal error of $\tilde{O}(\epsilon)$.
Our main technical contribution is a new robust identifiability proof of clusters from a Gaussian mixture, which can be captured by the constant-degree Sum of Squares proof system.
arXiv Detail & Related papers (2020-05-13T16:44:12Z)
- The EM Algorithm gives Sample-Optimality for Learning Mixtures of Well-Separated Gaussians [21.780375994511324]
We prove a new result for the Expectation-Maximization (EM) algorithm: we show that EM converges locally under a separation of $\Omega(\sqrt{\log k})$.
Our results do not assume or use prior knowledge of the (potentially different) Gaussian components.
arXiv Detail & Related papers (2020-02-02T05:09:26Z)
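As background for the preceding entry, here is a minimal sketch of the standard EM updates for a uniform-weight spherical GMM with known unit covariance, the textbook procedure whose local convergence that paper analyzes. The dimensions, separation, and near-truth initialization below are arbitrary choices for illustration, not taken from the paper.

```python
import numpy as np

def em_spherical_gmm(X, mu, n_iters=50):
    """Standard EM for a uniform-weight spherical GMM with unit covariance;
    only the k component means are updated."""
    for _ in range(n_iters):
        # E-step: responsibilities proportional to exp(-||x - mu_j||^2 / 2).
        sq_dists = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        logits = -0.5 * sq_dists
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        resp = np.exp(logits)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: each mean becomes the responsibility-weighted data average.
        mu = (resp.T @ X) / resp.sum(axis=0)[:, None]
    return mu

# Toy run: k = 3 separated means in d = 10 dimensions, EM started near the truth.
rng = np.random.default_rng(0)
true_mu = np.stack([np.zeros(10), 8.0 * np.eye(10)[0], 8.0 * np.eye(10)[1]])
labels = rng.integers(0, 3, size=3000)
X = true_mu[labels] + rng.standard_normal((3000, 10))
init_mu = true_mu + 0.5 * rng.standard_normal(true_mu.shape)
est_mu = em_spherical_gmm(X, init_mu)
print("max error of estimated means:", np.abs(est_mu - true_mu).max())
```

"Locally" in the entry above refers to convergence from an initialization that is already close to the true means, as in this toy run; the entry states that such local convergence holds once the separation is $\Omega(\sqrt{\log k})$.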