Fundamental Limits of Learning High-dimensional Simplices in Noisy Regimes
- URL: http://arxiv.org/abs/2506.10101v1
- Date: Wed, 11 Jun 2025 18:35:38 GMT
- Title: Fundamental Limits of Learning High-dimensional Simplices in Noisy Regimes
- Authors: Seyed Amir Hossein Saberi, Amir Najafi, Abolfazl Motahari, Babak H. Khalaj
- Abstract summary: We prove an algorithm exists that, with high probability, outputs a simplex within $\ell_2$ or total variation (TV) distance at most $\varepsilon$ from the true simplex. In the noiseless scenario, our lower bound $n \ge \Omega(K/\varepsilon)$ matches known upper bounds up to constant factors.
- Score: 0.5892638927736114
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we establish sample complexity bounds for learning high-dimensional simplices in $\mathbb{R}^K$ from noisy data. Specifically, we consider $n$ i.i.d. samples uniformly drawn from an unknown simplex in $\mathbb{R}^K$, each corrupted by additive Gaussian noise of unknown variance. We prove an algorithm exists that, with high probability, outputs a simplex within $\ell_2$ or total variation (TV) distance at most $\varepsilon$ from the true simplex, provided $n \ge (K^2/\varepsilon^2) e^{\mathcal{O}(K/\mathrm{SNR}^2)}$, where $\mathrm{SNR}$ is the signal-to-noise ratio. Extending our prior work~\citep{saberi2023sample}, we derive new information-theoretic lower bounds, showing that simplex estimation within TV distance $\varepsilon$ requires at least $n \ge \Omega(K^3 \sigma^2/\varepsilon^2 + K/\varepsilon)$ samples, where $\sigma^2$ denotes the noise variance. In the noiseless scenario, our lower bound $n \ge \Omega(K/\varepsilon)$ matches known upper bounds up to constant factors. We resolve an open question by demonstrating that when $\mathrm{SNR} \ge \Omega(K^{1/2})$, noisy-case complexity aligns with the noiseless case. Our analysis leverages sample compression techniques (Ashtiani et al., 2018) and introduces a novel Fourier-based method for recovering distributions from noisy observations, potentially applicable beyond simplex learning.
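To make the scaling of these bounds concrete, the following minimal Python sketch evaluates the shapes of the upper bound $n \ge (K^2/\varepsilon^2)\, e^{\mathcal{O}(K/\mathrm{SNR}^2)}$ and the lower bound $n \ge \Omega(K^3 \sigma^2/\varepsilon^2 + K/\varepsilon)$. The constants `c_upper`, `c_exp`, `c1`, and `c2` are hypothetical placeholders for the factors hidden by the $\mathcal{O}(\cdot)$ and $\Omega(\cdot)$ notation; the code only illustrates how the bounds scale with $K$, $\varepsilon$, and SNR, and is not an implementation of the paper's algorithm.

```python
import math


def upper_bound_samples(K, eps, snr, c_upper=1.0, c_exp=1.0):
    """Shape of the abstract's upper bound: n >= (K^2 / eps^2) * exp(O(K / SNR^2)).

    c_upper and c_exp are hypothetical constants standing in for the factors
    hidden by the O(.) notation; they are not taken from the paper.
    """
    return c_upper * (K ** 2 / eps ** 2) * math.exp(c_exp * K / snr ** 2)


def lower_bound_samples(K, eps, sigma2, c1=1.0, c2=1.0):
    """Shape of the abstract's lower bound: n >= Omega(K^3 * sigma^2 / eps^2 + K / eps).

    c1 and c2 are likewise placeholder constants.
    """
    return c1 * K ** 3 * sigma2 / eps ** 2 + c2 * K / eps


if __name__ == "__main__":
    K, eps, sigma2 = 50, 0.1, 1.0
    for snr in (0.5, 1.0, math.sqrt(K)):
        # Once SNR is on the order of sqrt(K), exp(K / SNR^2) is O(1),
        # so the upper bound behaves like the noiseless-order K^2 / eps^2 rate.
        print(f"SNR = {snr:6.2f}  upper-bound shape ~ {upper_bound_samples(K, eps, snr):.3e}")
    print(f"lower-bound shape ~ {lower_bound_samples(K, eps, sigma2):.3e}")
```

Note how, once $\mathrm{SNR}$ reaches the order of $K^{1/2}$, the exponential factor $e^{K/\mathrm{SNR}^2}$ collapses to a constant, which is exactly the regime in which the abstract reports that the noisy-case sample complexity matches the noiseless case.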
Related papers
- Mean and Variance Estimation Complexity in Arbitrary Distributions via Wasserstein Minimization [0.0]
This paper focuses on the complexity of estimating translation $\boldsymbol{\mu} \in \mathbb{R}^l$ and shrinkage $\sigma \in \mathbb{R}_{++}$ parameters. We highlight that while the problem is NP-hard for Maximum Likelihood Estimation (MLE), it is possible to obtain $\varepsilon$-approximations for arbitrary $\varepsilon > 0$ within $\mathrm{poly}\left(\frac{1}{\varepsilon}\right)$ time.
arXiv Detail & Related papers (2025-01-17T13:07:52Z)
- Sample and Computationally Efficient Robust Learning of Gaussian Single-Index Models [37.42736399673992]
A single-index model (SIM) is a function of the form $\sigma(\mathbf{w}^\ast \cdot \mathbf{x})$, where $\sigma: \mathbb{R} \to \mathbb{R}$ is a known link function and $\mathbf{w}^\ast$ is a hidden unit vector.
We show that a proper learner attains $L^2$-error of $O(\mathrm{OPT})+\epsilon$.
arXiv Detail & Related papers (2024-11-08T17:10:38Z)
- Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators on distributions over $\mathbb{R}^d$ suffer from a curse of dimensionality.
We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z)
- Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
We study the problem of gradient descent learning of a single-index target function $f_*(\boldsymbol{x}) = \sigma_*\left(\langle \boldsymbol{x}, \boldsymbol{\theta} \rangle\right)$. We prove that a two-layer neural network optimized by an SGD-based algorithm learns $f_*$ with a complexity that is not governed by information exponents.
arXiv Detail & Related papers (2024-06-03T17:56:58Z)
- A Unified Framework for Uniform Signal Recovery in Nonlinear Generative Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z)
- Near-Optimal Bounds for Learning Gaussian Halfspaces with Random Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega(d^{1/2}/(\max\{p, \epsilon\})^2)$.
Our lower bound suggests that this quadratic dependence on $1/epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z)
- Sample Complexity Bounds for Learning High-dimensional Simplices in Noisy Regimes [5.526935605535376]
We find a sample complexity bound for learning a simplex from noisy samples.
We show that as long as $\mathrm{SNR} \ge \Omega\left(K^{1/2}\right)$, the sample complexity of the noisy regime has the same order as that of the noiseless case.
arXiv Detail & Related papers (2022-09-09T23:35:25Z)
- Learning a Single Neuron with Adversarial Label Noise via Gradient Descent [50.659479930171585]
We study a function of the form $\mathbf{x} \mapsto \sigma(\mathbf{w} \cdot \mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C\,\mathrm{OPT} + \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z)
- Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals.
Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z)
- Memory-Sample Lower Bounds for Learning Parity with Noise [2.724141845301679]
We show, for the well-studied problem of learning parity under noise, that any learning algorithm requires either a memory of size $\Omega(n^2/\varepsilon)$ or an exponential number of samples.
Our proof is based on adapting the arguments in [Raz'17,GRT'18] to the noisy case.
arXiv Detail & Related papers (2021-07-05T23:34:39Z)
- Robust Gaussian Covariance Estimation in Nearly-Matrix Multiplication Time [14.990725929840892]
We show an algorithm that runs in time $\widetilde{O}(T(N, d) \log \kappa / \mathrm{poly}(\varepsilon))$, where $T(N, d)$ is the time it takes to multiply a $d \times N$ matrix by its transpose.
Our runtime matches that of the fastest algorithm for covariance estimation without outliers, up to poly-logarithmic factors.
arXiv Detail & Related papers (2020-06-23T20:21:27Z)
- Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, the discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For small enough $\epsilon$, we show an algorithm with improved sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z)