Learning Entangled Single-Sample Gaussians in the Subset-of-Signals Model
- URL: http://arxiv.org/abs/2007.05557v1
- Date: Fri, 10 Jul 2020 18:25:38 GMT
- Title: Learning Entangled Single-Sample Gaussians in the Subset-of-Signals Model
- Authors: Yingyu Liang and Hui Yuan
- Abstract summary: We study mean estimation for entangled single-sample Gaussians with a common mean but different unknown variances.
We show that the method achieves error $O\left(\frac{\sqrt{n\ln n}}{m}\right)$ with high probability when $m=\Omega(\sqrt{n\ln n})$.
We further prove lower bounds, showing that the error is $\Omega\left(\left(\frac{n}{m^4}\right)^{1/6}\right)$ when $m$ is between $\Omega(n^{1/4})$ and $O(n^{1-\epsilon})$.
- Score: 28.839136703139225
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the setting of entangled single-sample distributions, the goal is to
estimate some common parameter shared by a family of $n$ distributions, given
one single sample from each distribution. This paper studies mean estimation
for entangled single-sample Gaussians that have a common mean but different
unknown variances. We propose the subset-of-signals model where an unknown
subset of $m$ variances are bounded by 1 while there are no assumptions on the
other variances. In this model, we analyze a simple and natural method based on
iteratively averaging the truncated samples, and show that the method achieves
error $O \left(\frac{\sqrt{n\ln n}}{m}\right)$ with high probability when
$m=\Omega(\sqrt{n\ln n})$, matching existing bounds for this range of $m$. We
further prove lower bounds, showing that the error is
$\Omega\left(\left(\frac{n}{m^4}\right)^{1/2}\right)$ when $m$ is between
$\Omega(\ln n)$ and $O(n^{1/4})$, and the error is
$\Omega\left(\left(\frac{n}{m^4}\right)^{1/6}\right)$ when $m$ is between
$\Omega(n^{1/4})$ and $O(n^{1 - \epsilon})$ for an arbitrarily small
$\epsilon>0$, improving existing lower bounds and extending to a wider range of
$m$.
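Below is a minimal Python sketch of this style of iterative truncated averaging: average only the samples inside a window around the current estimate, then shrink the window and repeat, so the high-variance samples are gradually discarded. The median initialization, geometric window schedule, and stopping radius are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def iterative_truncated_mean(x, shrink=0.9, min_radius=1.0):
    """Illustrative sketch of iteratively averaging truncated samples.

    Starting from a robust initial estimate, repeatedly average only the
    samples inside a shrinking window around the current estimate.
    The median initialization and geometric window schedule are
    assumptions for illustration, not the paper's exact procedure.
    """
    est = np.median(x)                       # robust initialization
    radius = np.max(np.abs(x - est))         # start wide enough to keep everything
    while radius > min_radius:
        kept = x[np.abs(x - est) <= radius]  # truncate to a window around the estimate
        if kept.size:
            est = kept.mean()                # average the truncated samples
        radius *= shrink                     # tighten the window
    return est

# Toy instance of the subset-of-signals model: all n Gaussians share the
# mean mu, m of them have variance <= 1, the rest have huge variances.
rng = np.random.default_rng(0)
n, m, mu = 10_000, 500, 3.0
sigma = np.concatenate([rng.uniform(0.1, 1.0, size=m),
                        rng.uniform(10.0, 1000.0, size=n - m)])
x = rng.normal(mu, sigma)
print(iterative_truncated_mean(x))  # should land near mu = 3.0
```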
Related papers
- Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators on distributions over $\mathbb{R}^d$ suffer from a curse of dimensionality.
We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z)
- Simple and Nearly-Optimal Sampling for Rank-1 Tensor Completion via Gauss-Jordan [49.1574468325115]
We revisit the sample and computational complexity of completing a rank-1 tensor in $\otimes_{i=1}^{N} \mathbb{R}^{d}$.
We present a characterization of the problem which admits an algorithm amounting to Gauss-Jordan on a pair of random linear systems.
arXiv Detail & Related papers (2024-08-10T04:26:19Z)
- Minimax Optimality of Score-based Diffusion Models: Beyond the Density Lower Bound Assumptions [11.222970035173372]
We show that a kernel-based score estimator achieves an optimal mean square error of $\widetilde{O}\left(n^{-1} t^{-\frac{d+2}{2}}\left(t^{\frac{d}{2}} \vee 1\right)\right)$ for the score function of the noised data distribution.
This yields an $\widetilde{O}\left(n^{-1/2} t^{-\frac{d}{4}}\right)$ upper bound on the total variation error of the distribution of the samples generated by the diffusion model, under a mere sub-Gaussian assumption (a toy kernel score estimator of this type is sketched after this list).
arXiv Detail & Related papers (2024-02-23T20:51:31Z)
- A Unified Framework for Uniform Signal Recovery in Nonlinear Generative Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z)
- $L^1$ Estimation: On the Optimality of Linear Estimators [64.76492306585168]
This work shows that the only prior distribution on $X$ that induces linearity in the conditional median is Gaussian.
In particular, it is demonstrated that if the conditional distribution $P_{X|Y=y}$ is symmetric for all $y$, then $X$ must follow a Gaussian distribution.
arXiv Detail & Related papers (2023-09-17T01:45:13Z)
- Fast, Sample-Efficient, Affine-Invariant Private Mean and Covariance Estimation for Subgaussian Distributions [8.40077201352607]
We present a fast, differentially private algorithm for high-dimensional covariance-aware mean estimation.
Our algorithm produces $\tilde{\mu}$ such that $\|\mu - \tilde{\mu}\|_\Sigma \leq \alpha$ as long as $n \gtrsim \tfrac{d}{\alpha^2} + \tfrac{d\sqrt{\log 1/\delta}}{\alpha\varepsilon} + \tfrac{d\log 1/\delta}{\varepsilon}$.
arXiv Detail & Related papers (2023-01-28T16:57:46Z)
- Structure Learning in Graphical Models from Indirect Observations [17.521712510832558]
This paper considers learning of the graphical structure of a $p$-dimensional random vector $X \in \mathbb{R}^p$ using both parametric and non-parametric methods.
Under mild conditions, we show that our graph-structure estimator can obtain the correct structure.
arXiv Detail & Related papers (2022-05-06T19:24:44Z)
- The Sample Complexity of Robust Covariance Testing [56.98280399449707]
We are given i.i.d. samples from a distribution of the form $Z = (1-\epsilon) X + \epsilon B$, where $X$ is a zero-mean Gaussian $\mathcal{N}(0, \Sigma)$ with unknown covariance.
In the absence of contamination, prior work gave a simple tester for this hypothesis testing task that uses $O(d)$ samples.
We prove a sample complexity lower bound of $\Omega(d^2)$ for $\epsilon$ an arbitrarily small constant and $\gamma$ a fixed constant (a toy sampler for this contamination model appears after this list).
arXiv Detail & Related papers (2020-12-31T18:24:41Z)
- Sparse sketches with small inversion bias [79.77110958547695]
Inversion bias arises when averaging estimates of quantities that depend on the inverse covariance.
We develop a framework for analyzing inversion bias, based on our proposed concept of an $(\epsilon,\delta)$-unbiased estimator for random matrices.
We show that when the sketching matrix $S$ is dense and has i.i.d. sub-gaussian entries, the estimator is $(\epsilon,\delta)$-unbiased for $(A^\top A)^{-1}$ with a sketch of size $m = O(d + \sqrt{d}/\epsilon)$ (a small demo of inversion bias appears after this list).
arXiv Detail & Related papers (2020-11-21T01:33:15Z)
- Multivariate mean estimation with direction-dependent accuracy [8.147652597876862]
We consider the problem of estimating the mean of a random vector based on $N$ independent, identically distributed observations.
We construct an estimator that has near-optimal error in all directions in which the variance of the one-dimensional marginal of the random vector is not too small.
arXiv Detail & Related papers (2020-10-22T17:52:45Z)
- Efficient Statistics for Sparse Graphical Models from Truncated Samples [19.205541380535397]
We focus on two fundamental and classical problems: (i) inference of sparse Gaussian graphical models and (ii) support recovery of sparse linear models.
For sparse linear regression, suppose samples $(\mathbf{x}, y)$ are generated where $y = \mathbf{x}^\top \Omega^* + \mathcal{N}(0,1)$ and $(\mathbf{x}, y)$ is observed only if $y$ belongs to a truncation set $S \subseteq \mathbb{R}^d$ (a toy rejection sampler for this model appears after this list).
arXiv Detail & Related papers (2020-06-17T09:21:00Z)
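As a companion to the kernel-based score estimator in the diffusion-models entry above, here is a minimal Python sketch of one such estimator: it computes the exact score of the empirical distribution smoothed by $\mathcal{N}(0, tI)$. The smoothing level $t$ and the closed-form softmax weights are generic kernel-estimator choices, not necessarily the paper's exact construction.

```python
import numpy as np

def kernel_score_estimate(x, data, t):
    """Score (gradient of log-density) of the empirical distribution of
    `data` smoothed by a Gaussian kernel N(0, t*I), evaluated at x.

    For p_hat(x) = (1/n) * sum_i N(x; x_i, t*I), the score is
        grad log p_hat(x) = sum_i w_i(x) * (x_i - x) / t,
    where w_i are softmax weights of -||x - x_i||^2 / (2t).
    """
    d2 = np.sum((data - x) ** 2, axis=1)    # squared distances ||x - x_i||^2
    w = np.exp(-(d2 - d2.min()) / (2 * t))  # stabilized Gaussian weights
    w /= w.sum()                            # softmax normalization
    return (w[:, None] * (data - x)).sum(axis=0) / t

# Example: for data from N(0, I), the smoothed density is N(0, (1+t)I),
# so the true score at x is -x / (1 + t).
rng = np.random.default_rng(1)
data = rng.standard_normal((5000, 2))
x = np.array([1.0, -0.5])
print(kernel_score_estimate(x, data, t=0.5), -x / 1.5)
```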
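The contamination model in the robust covariance testing entry, $Z = (1-\epsilon)X + \epsilon B$, is a mixture: each sample is an inlier from $\mathcal{N}(0,\Sigma)$ with probability $1-\epsilon$ and otherwise comes from an arbitrary distribution $B$. A toy sampler, with a hypothetical point-mass contamination standing in for $B$:

```python
import numpy as np

def huber_contaminated_gaussian(n, sigma, eps, bad_sampler, rng):
    """Draw n samples from the mixture Z = (1-eps)*X + eps*B: each point
    comes from N(0, Sigma) with probability 1-eps and from an arbitrary
    contamination distribution B otherwise."""
    d = sigma.shape[0]
    good = rng.multivariate_normal(np.zeros(d), sigma, size=n)
    bad = bad_sampler(n)                 # adversary's choice; point mass below
    mask = rng.random(n) < eps           # which samples are contaminated
    return np.where(mask[:, None], bad, good)

# Example: identity-covariance inliers, a far-away point mass as B.
rng = np.random.default_rng(2)
d, eps = 5, 0.05
z = huber_contaminated_gaussian(1000, np.eye(d), eps,
                                lambda n: np.full((n, d), 10.0), rng)
print(np.cov(z, rowvar=False).round(1))  # contamination inflates the covariance
```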
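To see the inversion bias from the sparse-sketches entry concretely, the toy experiment below averages many sketched estimates of $(A^\top A)^{-1}$: even though each sketch $S$ satisfies $\mathbb{E}[(SA)^\top SA] = A^\top A$, the average of the inverses systematically overshoots the target. The dimensions and sketch size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, m, trials = 500, 5, 40, 2000
A = rng.standard_normal((n, d))
target = np.linalg.inv(A.T @ A)

# Average many estimates of (A^T A)^{-1} built from independent Gaussian
# sketches S (m x n); the average does NOT converge to the target --
# that systematic gap is the inversion bias.
avg = np.zeros((d, d))
for _ in range(trials):
    S = rng.standard_normal((m, n)) / np.sqrt(m)  # i.i.d. sub-gaussian sketch
    SA = S @ A
    avg += np.linalg.inv(SA.T @ SA)
avg /= trials

print(np.trace(avg) / np.trace(target))  # noticeably > 1: inversion bias
```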
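Finally, the truncated-sample regression model from the last entry can be simulated by rejection sampling: keep a pair $(\mathbf{x}, y)$ only when $y$ lands in the truncation set. The sparse $\Omega^*$ and the one-sided truncation set below are illustrative assumptions (for simplicity the set is taken over the scalar response $y$).

```python
import numpy as np

def truncated_regression_samples(n_target, omega, S, rng):
    """Rejection-sample pairs (x, y) with y = x^T omega + N(0,1), keeping
    a pair only when y lands in the truncation set S."""
    d = omega.shape[0]
    xs, ys = [], []
    while len(ys) < n_target:
        x = rng.standard_normal(d)
        y = x @ omega + rng.standard_normal()
        if S(y):                          # sample observed only if y is in S
            xs.append(x)
            ys.append(y)
    return np.array(xs), np.array(ys)

# Example: sparse omega*, truncation to y >= 0. Naive least squares on the
# truncated data is biased, which is what makes the problem nontrivial.
rng = np.random.default_rng(4)
omega = np.zeros(20)
omega[:3] = [2.0, -1.0, 0.5]              # sparse signal
X, y = truncated_regression_samples(2000, omega, lambda y: y >= 0.0, rng)
ols = np.linalg.lstsq(X, y, rcond=None)[0]
print(ols[:3].round(2))                   # biased relative to [2.0, -1.0, 0.5]
```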