The EM Algorithm is Adaptively-Optimal for Unbalanced Symmetric Gaussian
Mixtures
- URL: http://arxiv.org/abs/2103.15653v1
- Date: Mon, 29 Mar 2021 14:28:17 GMT
- Title: The EM Algorithm is Adaptively-Optimal for Unbalanced Symmetric Gaussian
Mixtures
- Authors: Nir Weinberger and Guy Bresler
- Abstract summary: We show that the EM algorithm adaptively achieves the minimax error rate $\tilde{O}\Big(\min\Big\{\frac{1}{(1-2\delta_{*})}\sqrt{\frac{d}{n}},\frac{1}{\|\theta_{*}\|}\sqrt{\frac{d}{n}},\left(\frac{d}{n}\right)^{1/4}\Big\}\Big)$ in no more than $O\Big(\frac{1}{\|\theta_{*}\|(1-2\delta_{*})}\Big)$ iterations.
- Score: 36.91281862322494
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies the problem of estimating the means
$\pm\theta_{*}\in\mathbb{R}^{d}$ of a symmetric two-component Gaussian mixture
$\delta_{*}\cdot N(\theta_{*},I)+(1-\delta_{*})\cdot N(-\theta_{*},I)$ where
the weights $\delta_{*}$ and $1-\delta_{*}$ are unequal. Assuming that
$\delta_{*}$ is known, we show that the population version of the EM algorithm
globally converges if the initial estimate has non-negative inner product with
the mean of the larger weight component. This can be achieved by the trivial
initialization $\theta_{0}=0$. For the empirical iteration based on $n$
samples, we show that when initialized at $\theta_{0}=0$, the EM algorithm
adaptively achieves the minimax error rate
$\tilde{O}\Big(\min\Big\{\frac{1}{(1-2\delta_{*})}\sqrt{\frac{d}{n}},\frac{1}{\|\theta_{*}\|}\sqrt{\frac{d}{n}},\left(\frac{d}{n}\right)^{1/4}\Big\}\Big)$
in no more than $O\Big(\frac{1}{\|\theta_{*}\|(1-2\delta_{*})}\Big)$ iterations
(with high probability). We also consider the EM iteration for estimating the
weight $\delta_{*}$, assuming a fixed mean $\theta$ (which is possibly
mismatched to $\theta_{*}$). For the empirical iteration of $n$ samples, we
show that the minimax error rate
$\tilde{O}\Big(\frac{1}{\|\theta_{*}\|}\sqrt{\frac{d}{n}}\Big)$ is achieved in
no more than $O\Big(\frac{1}{\|\theta_{*}\|^{2}}\Big)$ iterations. These
results robustify and complement recent results of Wu and Zhou obtained for the
equal weights case $\delta_{*}=1/2$.
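The two EM iterations analyzed in the abstract have simple closed forms for this symmetric model: for the mean, the E-step reduces to an expected label $\tanh(\langle x,\theta\rangle + \tfrac{1}{2}\log\frac{\delta_{*}}{1-\delta_{*}})$ and the M-step is a weighted sample average; for the weight, the update averages the posterior probabilities of the $+\theta$ component. A minimal sketch (function names, iteration caps, and defaults are our own illustrative choices, not from the paper):

```python
import numpy as np

def em_mean(X, delta, iters=300):
    """EM for the mean of delta*N(theta, I) + (1-delta)*N(-theta, I),
    with the weight delta known. Initialized at theta = 0, as in the paper."""
    n, d = X.shape
    theta = np.zeros(d)
    bias = 0.5 * np.log(delta / (1.0 - delta))  # half the log-odds of the labels
    for _ in range(iters):
        # E-step: E[label | x] = 2*P(+1 | x) - 1 = tanh(<x, theta> + bias)
        s = np.tanh(X @ theta + bias)
        # M-step: average the samples weighted by their expected labels
        theta = (s[:, None] * X).mean(axis=0)
    return theta

def em_weight(X, theta, iters=300, delta0=0.5):
    """EM for the weight delta with a fixed (possibly mismatched) mean theta."""
    proj = 2.0 * (X @ theta)
    delta = delta0
    for _ in range(iters):
        # E-step: posterior probability that each sample came from N(+theta, I)
        w = 1.0 / (1.0 + np.exp(-(proj + np.log(delta / (1.0 - delta)))))
        # M-step: the new weight estimate is the average posterior
        delta = w.mean()
    return delta
```

Note that with $\theta_{0}=0$ the first `em_mean` step yields $\tanh(\mathrm{bias})$ times the sample mean, which has non-negative inner product with the heavier component's mean, matching the paper's global-convergence condition for the trivial initialization.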
Related papers
- Efficient Continual Finite-Sum Minimization [52.5238287567572]
We propose a key twist on finite-sum minimization, dubbed continual finite-sum minimization.
Our approach significantly improves upon the $\mathcal{O}(n/\epsilon)$ first-order oracle (FO) calls that $\mathrm{StochasticGradientDescent}$ requires.
We also prove that there is no natural first-order method with $\mathcal{O}\left(n/\epsilon^{\alpha}\right)$ gradient complexity for $\alpha < 1/4$, establishing that the first-order complexity of our method is nearly tight.
arXiv Detail & Related papers (2024-06-07T08:26:31Z) - Spectral Statistics of the Sample Covariance Matrix for High Dimensional
Linear Gaussians [12.524855369455421]
We study the performance of the ordinary least squares (OLS) method for the estimation of a high-dimensional stable state transition matrix.
The OLS estimator incurs a phase transition and becomes transient: increasing the number of samples only worsens the estimation error.
arXiv Detail & Related papers (2023-12-10T06:55:37Z) - $\ell_p$-Regression in the Arbitrary Partition Model of Communication [59.89387020011663]
We consider the randomized communication complexity of the distributed $\ell_p$-regression problem in the coordinator model.
For $p = 2$, i.e., least squares regression, we give the first optimal bound of $\tilde{\Theta}(sd^{2} + sd/\epsilon)$ bits.
For $p \in (1,2)$, we obtain an $\tilde{O}(sd^{2}/\epsilon + sd/\mathrm{poly}(\epsilon))$ upper bound.
arXiv Detail & Related papers (2023-07-11T08:51:53Z) - Mean Estimation in High-Dimensional Binary Markov Gaussian Mixture
Models [12.746888269949407]
We consider a high-dimensional mean estimation problem over a binary hidden Markov model.
We establish a nearly minimax optimal (up to logarithmic factors) estimation error rate, as a function of $\|\theta_{*}\|,\delta,d,n$.
arXiv Detail & Related papers (2022-06-06T09:34:04Z) - Matrix concentration inequalities and efficiency of random universal
sets of quantum gates [0.0]
For a random set $\mathcal{S} \subset U(d)$ of quantum gates we provide bounds on the probability that $\mathcal{S}$ forms a $\delta$-approximate $t$-design.
We show that for $\mathcal{S}$ drawn from an exact $t$-design, the probability that it forms a $\delta$-approximate $t$-design satisfies the inequality $\mathbb{P}\left(\delta \geq x\right) \leq 2D_{t}\dots$
arXiv Detail & Related papers (2022-02-10T23:44:09Z) - Active Sampling for Linear Regression Beyond the $\ell_2$ Norm [70.49273459706546]
We study active sampling algorithms for linear regression, which aim to query only a small number of entries of a target vector.
We show that this dependence on $d$ is optimal, up to logarithmic factors.
We also provide the first total sensitivity upper bound $O(d^{\max\{1,p/2\}}\log^{2} n)$ for loss functions with at most degree $p$ growth.
arXiv Detail & Related papers (2021-11-09T00:20:01Z) - Spiked Covariance Estimation from Modulo-Reduced Measurements [14.569322713960494]
We develop and analyze an algorithm that, for most directions $\mathbf{u}$ and $\nu=\mathrm{poly}(k)$, estimates $\mathbf{u}$ to high accuracy using $n=\mathrm{poly}(k)$ measurements.
Numerical experiments show that the developed algorithm performs well even in a non-asymptotic setting.
arXiv Detail & Related papers (2021-10-04T02:10:47Z) - Optimal Mean Estimation without a Variance [103.26777953032537]
We study the problem of heavy-tailed mean estimation in settings where the variance of the data-generating distribution does not exist.
We design an estimator which attains the smallest possible confidence interval as a function of $n,d,\delta$.
arXiv Detail & Related papers (2020-11-24T22:39:21Z) - Sparse sketches with small inversion bias [79.77110958547695]
Inversion bias arises when averaging estimates of quantities that depend on the inverse covariance.
We develop a framework for analyzing inversion bias, based on our proposed concept of an $(\epsilon,\delta)$-unbiased estimator for random matrices.
We show that when the sketching matrix $S$ is dense and has i.i.d. sub-gaussian entries, the estimator is $(\epsilon,\delta)$-unbiased for $(A^{\top}A)^{-1}$ with a sketch of size $m=O(d+\sqrt{d}/\epsilon)$.
arXiv Detail & Related papers (2020-11-21T01:33:15Z) - Learning Mixtures of Spherical Gaussians via Fourier Analysis [0.5381004207943596]
We find that a bound on the sample and computational complexity was previously unknown when $\omega(1) \leq d \leq O(\log k)$.
These authors also bound the sample complexity of learning a random mixture of Gaussians in a ball of radius $\Theta(\sqrt{d})$ in $d$ dimensions, when the sample size is at least $\mathrm{poly}(k, \frac{1}{\delta})$.
arXiv Detail & Related papers (2020-04-13T08:06:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.