The EM Algorithm is Adaptively-Optimal for Unbalanced Symmetric Gaussian
Mixtures
- URL: http://arxiv.org/abs/2103.15653v1
- Date: Mon, 29 Mar 2021 14:28:17 GMT
- Title: The EM Algorithm is Adaptively-Optimal for Unbalanced Symmetric Gaussian
Mixtures
- Authors: Nir Weinberger and Guy Bresler
- Abstract summary: We show that the EM algorithm adaptively achieves the minimax error rate $tildeOBig(minBigfrac1(1-2delta_*)sqrtfracdn,frac1|theta_*|sqrtfracdn,left(fracdnright)1/4BigBig)$ in no more than $OBig(frac1|theta_*|2Big
- Score: 36.91281862322494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies the problem of estimating the means
$\pm\theta_{*}\in\mathbb{R}^{d}$ of a symmetric two-component Gaussian mixture
$\delta_{*}\cdot N(\theta_{*},I)+(1-\delta_{*})\cdot N(-\theta_{*},I)$ where
the weights $\delta_{*}$ and $1-\delta_{*}$ are unequal. Assuming that
$\delta_{*}$ is known, we show that the population version of the EM algorithm
globally converges if the initial estimate has non-negative inner product with
the mean of the larger weight component. This can be achieved by the trivial
initialization $\theta_{0}=0$. For the empirical iteration based on $n$
samples, we show that when initialized at $\theta_{0}=0$, the EM algorithm
adaptively achieves the minimax error rate
$\tilde{O}\Big(\min\Big\{\frac{1}{(1-2\delta_{*})}\sqrt{\frac{d}{n}},\frac{1}{\|\theta_{*}\|}\sqrt{\frac{d}{n}},\left(\frac{d}{n}\right)^{1/4}\Big\}\Big)$
in no more than $O\Big(\frac{1}{\|\theta_{*}\|(1-2\delta_{*})}\Big)$ iterations
(with high probability). We also consider the EM iteration for estimating the
weight $\delta_{*}$, assuming a fixed mean $\theta$ (which is possibly
mismatched to $\theta_{*}$). For the empirical iteration of $n$ samples, we
show that the minimax error rate
$\tilde{O}\Big(\frac{1}{\|\theta_{*}\|}\sqrt{\frac{d}{n}}\Big)$ is achieved in
no more than $O\Big(\frac{1}{\|\theta_{*}\|^{2}}\Big)$ iterations. These
results robustify and complement recent results of Wu and Zhou obtained for the
equal weights case $\delta_{*}=1/2$.
Related papers
- The Communication Complexity of Approximating Matrix Rank [50.6867896228563]
We show that this problem has randomized communication complexity $Omega(frac1kcdot n2log|mathbbF|)$.
As an application, we obtain an $Omega(frac1kcdot n2log|mathbbF|)$ space lower bound for any streaming algorithm with $k$ passes.
arXiv Detail & Related papers (2024-10-26T06:21:42Z) - Simple and Nearly-Optimal Sampling for Rank-1 Tensor Completion via Gauss-Jordan [49.1574468325115]
We revisit the sample and computational complexity of completing a rank-1 tensor in $otimes_i=1N mathbbRd$.
We present a characterization of the problem which admits an algorithm amounting to Gauss-Jordan on a pair of random linear systems.
arXiv Detail & Related papers (2024-08-10T04:26:19Z) - $\ell_p$-Regression in the Arbitrary Partition Model of Communication [59.89387020011663]
We consider the randomized communication complexity of the distributed $ell_p$-regression problem in the coordinator model.
For $p = 2$, i.e., least squares regression, we give the first optimal bound of $tildeTheta(sd2 + sd/epsilon)$ bits.
For $p in (1,2)$,we obtain an $tildeO(sd2/epsilon + sd/mathrmpoly(epsilon)$ upper bound.
arXiv Detail & Related papers (2023-07-11T08:51:53Z) - Mean Estimation in High-Dimensional Binary Markov Gaussian Mixture
Models [12.746888269949407]
We consider a high-dimensional mean estimation problem over a binary hidden Markov model.
We establish a nearly minimax optimal (up to logarithmic factors) estimation error rate, as a function of $|theta_*|,delta,d,n$.
arXiv Detail & Related papers (2022-06-06T09:34:04Z) - Matrix concentration inequalities and efficiency of random universal
sets of quantum gates [0.0]
For a random set $mathcalS subset U(d)$ of quantum gates we provide bounds on the probability that $mathcalS$ forms a $delta$-approximate $t$-design.
We show that for $mathcalS$ drawn from an exact $t$-design the probability that it forms a $delta$-approximate $t$-design satisfies the inequality $mathbbPleft(delta geq x right)leq 2D_t
arXiv Detail & Related papers (2022-02-10T23:44:09Z) - Active Sampling for Linear Regression Beyond the $\ell_2$ Norm [70.49273459706546]
We study active sampling algorithms for linear regression, which aim to query only a small number of entries of a target vector.
We show that this dependence on $d$ is optimal, up to logarithmic factors.
We also provide the first total sensitivity upper bound $O(dmax1,p/2log2 n)$ for loss functions with at most degree $p$ growth.
arXiv Detail & Related papers (2021-11-09T00:20:01Z) - Spiked Covariance Estimation from Modulo-Reduced Measurements [14.569322713960494]
We develop and analyze an algorithm that, for most directions $bfu$ and $nu=mathrmpoly(k)$, estimates $bfu$ to high accuracy using $n=mathrmpoly(k)$ measurements.
Numerical experiments show that the developed algorithm performs well even in a non-asymptotic setting.
arXiv Detail & Related papers (2021-10-04T02:10:47Z) - Optimal Mean Estimation without a Variance [103.26777953032537]
We study the problem of heavy-tailed mean estimation in settings where the variance of the data-generating distribution does not exist.
We design an estimator which attains the smallest possible confidence interval as a function of $n,d,delta$.
arXiv Detail & Related papers (2020-11-24T22:39:21Z) - Sparse sketches with small inversion bias [79.77110958547695]
Inversion bias arises when averaging estimates of quantities that depend on the inverse covariance.
We develop a framework for analyzing inversion bias, based on our proposed concept of an $(epsilon,delta)$-unbiased estimator for random matrices.
We show that when the sketching matrix $S$ is dense and has i.i.d. sub-gaussian entries, the estimator $(epsilon,delta)$-unbiased for $(Atop A)-1$ with a sketch of size $m=O(d+sqrt d/
arXiv Detail & Related papers (2020-11-21T01:33:15Z) - Learning Mixtures of Spherical Gaussians via Fourier Analysis [0.5381004207943596]
We find that a bound on the sample and computational complexity was previously unknown when $omega(1) leq d leq O(log k)$.
These authors also show that the sample of complexity of a random mixture of gaussians in a ball of radius $d$ in $d$ dimensions, when $d$ is $Theta(sqrtd)$ in $d$ dimensions, when $d$ is at least $poly(k, frac1delta)$.
arXiv Detail & Related papers (2020-04-13T08:06:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.