Sample Complexity Bounds for Learning High-dimensional Simplices in
Noisy Regimes
- URL: http://arxiv.org/abs/2209.05953v2
- Date: Sat, 29 Apr 2023 02:17:39 GMT
- Title: Sample Complexity Bounds for Learning High-dimensional Simplices in
Noisy Regimes
- Authors: Amir Hossein Saberi, Amir Najafi, Seyed Abolfazl Motahari and Babak H.
Khalaj
- Abstract summary: We find a sample complexity bound for learning a simplex from noisy samples.
We show that as long as $\mathrm{SNR}\ge\Omega\left(K^{1/2}\right)$, the sample complexity of the noisy regime has the same order as that of the noiseless case.
- Score: 5.526935605535376
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we find a sample complexity bound for learning a simplex from
noisy samples. Assume we are given a dataset of size $n$ consisting of i.i.d.
samples drawn from a uniform distribution over an unknown simplex in
$\mathbb{R}^K$, where samples are assumed to be corrupted by multivariate
additive Gaussian noise of arbitrary magnitude. We prove the existence of an
algorithm that, with high probability, outputs a simplex having an $\ell_2$
distance of at most $\varepsilon$ from the true simplex (for any
$\varepsilon>0$). Also, we theoretically show that in order to achieve this
bound, it is sufficient to have
$n\ge\left(K^2/\varepsilon^2\right)e^{\Omega\left(K/\mathrm{SNR}^2\right)}$
samples, where $\mathrm{SNR}$ stands for the signal-to-noise ratio. This result
solves an important open problem and shows that, as long as
$\mathrm{SNR}\ge\Omega\left(K^{1/2}\right)$, the sample complexity of the noisy
regime has the same order as that of the noiseless case. Our proofs are a
combination of the so-called sample compression technique in
\citep{ashtiani2018nearly}, mathematical tools from high-dimensional geometry,
and Fourier analysis. In particular, we propose a general Fourier-based
technique for recovering a more general class of distribution families from
additive Gaussian noise, which can be further used in a variety of other
related problems.
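The generative model described in the abstract (uniform draws from a simplex in $\mathbb{R}^K$ corrupted by additive Gaussian noise) and the stated sample-size bound can be sketched numerically. This is a minimal sketch, assuming the standard Dirichlet$(1,\dots,1)$ parameterization of the uniform distribution over a simplex and an unspecified constant `c` standing in for the $\Omega(\cdot)$ in the exponent; all function and variable names below are illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_noisy_simplex(n, K, vertices, noise_std):
    """Draw n i.i.d. points uniformly from the simplex spanned by the
    (K+1) x K vertex matrix, then add isotropic Gaussian noise."""
    # A uniform distribution over a simplex is a Dirichlet(1, ..., 1)
    # mixture of its vertices.
    weights = rng.dirichlet(np.ones(K + 1), size=n)   # shape (n, K+1)
    clean = weights @ vertices                        # shape (n, K)
    return clean + noise_std * rng.normal(size=(n, K))

def sample_bound(K, eps, snr, c=1.0):
    """Order-of-magnitude bound n >= (K^2/eps^2) * exp(c * K / SNR^2);
    the constant c is hidden inside the Omega(.) and chosen arbitrarily."""
    return (K**2 / eps**2) * np.exp(c * K / snr**2)

K = 10
vertices = rng.normal(size=(K + 1, K))   # an arbitrary simplex in R^K
X = sample_noisy_simplex(n=1000, K=K, vertices=vertices, noise_std=0.1)

# When SNR = sqrt(K), the exponential factor is e^c = O(1), so the bound
# matches the noiseless K^2/eps^2 rate up to a constant.
ratio = sample_bound(K, eps=0.1, snr=np.sqrt(K)) / (K**2 / 0.1**2)
print(X.shape, ratio)
```

With $\mathrm{SNR}\ge\Omega(K^{1/2})$ the exponential factor collapses to a constant, which is exactly the regime in which the noisy and noiseless sample complexities have the same order.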
Related papers
- Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators for distributions over $\mathbb{R}^d$ suffer from a curse of dimensionality.
We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z) - Simple and Nearly-Optimal Sampling for Rank-1 Tensor Completion via Gauss-Jordan [49.1574468325115]
We revisit the sample and computational complexity of completing a rank-1 tensor in $\otimes_{i=1}^{N} \mathbb{R}^{d}$.
We present a characterization of the problem which admits an algorithm amounting to Gauss-Jordan on a pair of random linear systems.
arXiv Detail & Related papers (2024-08-10T04:26:19Z) - Near-Optimal Bounds for Learning Gaussian Halfspaces with Random
Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega(d^{1/2}/(\max\{p, \epsilon\})^2)$.
Our lower bound suggests that this quadratic dependence on $1/epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $\mathbf{x}\mapsto\sigma(\mathbf{w}\cdot\mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C\cdot\mathrm{OPT} + \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Tight Bounds on the Hardness of Learning Simple Nonparametric Mixtures [9.053430799456587]
We study the problem of learning nonparametric distributions in a finite mixture.
We establish tight bounds on the sample complexity for learning the component distributions in such models.
arXiv Detail & Related papers (2022-03-28T23:53:48Z) - Random quantum circuits transform local noise into global white noise [118.18170052022323]
We study the distribution over measurement outcomes of noisy random quantum circuits in the low-fidelity regime.
For local noise that is sufficiently weak and unital, correlations (measured by the linear cross-entropy benchmark) between the output distribution $p_\text{noisy}$ of a generic noisy circuit instance and its noiseless counterpart shrink exponentially.
If the noise is incoherent, the output distribution approaches the uniform distribution $p_\text{unif}$ at precisely the same rate.
arXiv Detail & Related papers (2021-11-29T19:26:28Z) - The Sample Complexity of Robust Covariance Testing [56.98280399449707]
We are given i.i.d. samples from a distribution of the form $Z = (1-\epsilon) X + \epsilon B$, where $X$ is a zero-mean Gaussian with unknown covariance, $\mathcal{N}(0, \Sigma)$.
In the absence of contamination, prior work gave a simple tester for this hypothesis testing task that uses $O(d)$ samples.
We prove a sample complexity lower bound of $\Omega(d^2)$ for $\epsilon$ an arbitrarily small constant and $\gamma$
arXiv Detail & Related papers (2020-12-31T18:24:41Z) - Estimation of Shortest Path Covariance Matrices [21.772761365915176]
We study the sample complexity of estimating the covariance matrix $\mathbf{\Sigma} \in \mathbb{R}^{d \times d}$ of a distribution $\mathcal{D}$ over $\mathbb{R}^d$ given independent samples.
We give a very simple algorithm for estimating $\mathbf{\Sigma}$ up to norm error $\epsilon\left\|\mathbf{\Sigma}\right\|$ using just $O(\sqrt{D})$ entry complexity and $\tilde{O}(r$
arXiv Detail & Related papers (2020-11-19T17:37:46Z) - Multi-reference alignment in high dimensions: sample complexity and
phase transition [31.841289319809814]
Multi-reference alignment entails estimating a signal in $\mathbb{R}^L$ from its circularly-shifted and noisy copies.
Motivated by single-particle cryo-electron microscopy, we analyze the sample complexity of the problem in the high-dimensional regime.
Our analysis uncovers a phase transition phenomenon governed by the parameter $\alpha = L/(\sigma^2 \log L)$, where $\sigma^2$ is the variance of the noise.
arXiv Detail & Related papers (2020-07-22T15:04:47Z)
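The phase-transition parameter quoted in the multi-reference alignment entry is a direct formula in $L$ and $\sigma$. A minimal sketch, using illustrative values of `L` and `sigma` that are not taken from the paper:

```python
import numpy as np

def mra_phase_parameter(L, sigma):
    """Phase-transition parameter alpha = L / (sigma^2 * log L) from the
    multi-reference alignment entry above."""
    return L / (sigma**2 * np.log(L))

# Hypothetical values: larger noise variance pushes alpha down toward
# the low-alpha (high-noise) regime.
print(mra_phase_parameter(L=1024, sigma=1.0),
      mra_phase_parameter(L=1024, sigma=4.0))
```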
This list is automatically generated from the titles and abstracts of the papers in this site.