Optimal estimation of high-dimensional location Gaussian mixtures
- URL: http://arxiv.org/abs/2002.05818v2
- Date: Mon, 26 Jul 2021 00:49:14 GMT
- Title: Optimal estimation of high-dimensional location Gaussian mixtures
- Authors: Natalie Doss and Yihong Wu and Pengkun Yang and Harrison H. Zhou
- Abstract summary: We show that the minimax rate of estimating the mixing distribution in Wasserstein distance is $\Theta((d/n)^{1/4} + n^{-1/(4k-2)})$.
We also show that the mixture density can be estimated at the optimal parametric rate $\Theta(\sqrt{d/n})$ in Hellinger distance.
- Score: 6.947889260024788
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies the optimal rate of estimation in a finite Gaussian
location mixture model in high dimensions without separation conditions. We
assume that the number of components $k$ is bounded and that the centers lie in
a ball of bounded radius, while allowing the dimension $d$ to be as large as
the sample size $n$. Extending the one-dimensional result of Heinrich and Kahn
\cite{HK2015}, we show that the minimax rate of estimating the mixing
distribution in Wasserstein distance is $\Theta((d/n)^{1/4} + n^{-1/(4k-2)})$,
achieved by an estimator computable in time $O(nd^2+n^{5/4})$. Furthermore, we
show that the mixture density can be estimated at the optimal parametric rate
$\Theta(\sqrt{d/n})$ in Hellinger distance and provide a computationally
efficient algorithm to achieve this rate in the special case of $k=2$.
Both the theoretical and methodological development rely on a careful
application of the method of moments. Central to our results is the observation
that the information geometry of finite Gaussian mixtures is characterized by
the moment tensors of the mixing distribution, whose low-rank structure can be
exploited to obtain a sharp local entropy bound.
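The moment-based approach described in the abstract can be illustrated with a minimal one-dimensional sketch (all parameter values below are hypothetical, not taken from the paper): for a symmetric two-atom mixing distribution $\frac{1}{2}\delta_a + \frac{1}{2}\delta_{-a}$ convolved with $N(0,1)$ noise, the second moment identifies $|a|$, and the Wasserstein-1 error of the plug-in estimate of the mixing distribution reduces to the distance between the atoms.

```python
# Minimal 1-D sketch of the method-of-moments idea (hypothetical setup):
# data X_i = S_i + Z_i with S_i uniform on {+a, -a} and Z_i ~ N(0, 1),
# so E[X^2] = a^2 + 1 and the second sample moment identifies |a|.
import numpy as np

rng = np.random.default_rng(0)
a, n = 1.5, 200_000                      # hypothetical center and sample size
signs = rng.choice([-1.0, 1.0], size=n)
x = signs * a + rng.standard_normal(n)

# Method-of-moments estimate of |a| from the second sample moment.
a_hat = np.sqrt(max(np.mean(x**2) - 1.0, 0.0))

# For two symmetric two-atom mixing distributions 0.5*delta_b + 0.5*delta_{-b},
# the Wasserstein-1 distance between them is just the distance between atoms.
w1_error = abs(a_hat - a)
print(f"a_hat = {a_hat:.3f}, W1 error = {w1_error:.4f}")
```

This is of course far simpler than the general $k$-component, high-dimensional estimator the paper constructs, but it shows why moment matching yields a Wasserstein guarantee without any separation condition on the centers.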
Related papers
- Improved Finite-Particle Convergence Rates for Stein Variational Gradient Descent [14.890609936348277]
We provide finite-particle convergence rates for the Stein Variational Gradient Descent algorithm in the Kernelized Stein Discrepancy ($\mathsf{KSD}$) and Wasserstein-2 metrics.
Our key insight is that the time derivative of the relative entropy of the joint density of the $N$ particle locations splits into a dominant negative part proportional to $N$ times the expected $\mathsf{KSD}^2$ and a smaller positive part.
arXiv Detail & Related papers (2024-09-13T01:49:19Z) - Relative-Translation Invariant Wasserstein Distance [82.6068808353647]
We introduce a new family of distances, relative-translation invariant Wasserstein distances ($RW_p$).
We show that $RW_p$ distances are also real distance metrics defined on the quotient set $\mathcal{P}_p(\mathbb{R}^n)/\sim$ invariant to distribution translations.
arXiv Detail & Related papers (2024-09-04T03:41:44Z) - Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z) - Optimal score estimation via empirical Bayes smoothing [13.685846094715364]
We study the problem of estimating the score function of an unknown probability distribution $\rho^*$ from $n$ independent and identically distributed observations in $d$ dimensions.
We show that a regularized score estimator based on a Gaussian kernel attains this rate, shown optimal by a matching minimax lower bound.
arXiv Detail & Related papers (2024-02-12T16:17:40Z) - Efficient Estimation of the Central Mean Subspace via Smoothed Gradient Outer Products [12.047053875716506]
We consider the problem of sufficient dimension reduction for multi-index models.
We show that a fast parametric convergence rate of the form $C_d \cdot n^{-1/2}$ is achievable.
arXiv Detail & Related papers (2023-12-24T12:28:07Z) - Mean-Square Analysis with An Application to Optimal Dimension Dependence of Langevin Monte Carlo [60.785586069299356]
This work provides a general framework for the non-asymptotic analysis of sampling error in the 2-Wasserstein distance.
Our theoretical analysis is further validated by numerical experiments.
arXiv Detail & Related papers (2021-09-08T18:00:05Z) - A New Framework for Variance-Reduced Hamiltonian Monte Carlo [88.84622104944503]
We propose a new framework of variance-reduced Hamiltonian Monte Carlo (HMC) methods for sampling from an $L$-smooth and $m$-strongly log-concave distribution.
We show that HMC methods based on unbiased gradient estimators, including SAGA and SVRG, achieve the highest gradient efficiency with small batch sizes.
Experimental results on both synthetic and real-world benchmark data show that our new framework significantly outperforms the full-gradient and stochastic-gradient HMC approaches.
arXiv Detail & Related papers (2021-02-09T02:44:24Z) - Fermionic partial tomography via classical shadows [0.0]
We propose a tomographic protocol for estimating any $ k $-body reduced density matrix ($ k $-RDM) of an $ n $-mode fermionic state.
Our approach extends the framework of classical shadows, a randomized approach to learning a collection of quantum-state properties, to the fermionic setting.
arXiv Detail & Related papers (2020-10-30T06:28:26Z) - Robustly Learning any Clusterable Mixture of Gaussians [55.41573600814391]
We study the efficient learnability of high-dimensional Gaussian mixtures in the adversarial-robust setting.
We provide an algorithm that learns the components of an $\epsilon$-corrupted $k$-mixture within information-theoretically near-optimal error $\tilde{O}(\epsilon)$.
Our main technical contribution is a new robust identifiability proof for clusters from a Gaussian mixture, which can be captured by the constant-degree Sum of Squares proof system.
arXiv Detail & Related papers (2020-05-13T16:44:12Z) - Spectral density estimation with the Gaussian Integral Transform [91.3755431537592]
The spectral density operator $\hat{\rho}(\omega)=\delta(\omega-\hat{H})$ plays a central role in linear response theory.
We describe a near optimal quantum algorithm providing an approximation to the spectral density.
arXiv Detail & Related papers (2020-04-10T03:14:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.