Estimating the Number of Components in Finite Mixture Models via the
Group-Sort-Fuse Procedure
- URL: http://arxiv.org/abs/2005.11641v2
- Date: Wed, 4 Aug 2021 22:43:40 GMT
- Title: Estimating the Number of Components in Finite Mixture Models via the
Group-Sort-Fuse Procedure
- Authors: Tudor Manole, Abbas Khalili
- Abstract summary: Group-Sort-Fuse (GSF) is a new penalized likelihood approach for simultaneous estimation of the order and mixing measure in finite mixture models.
We show that the GSF is consistent in estimating the true mixture order and achieves the $n^{-1/2}$ convergence rate for parameter estimation, up to polylogarithmic factors.
- Score: 0.974672460306765
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Estimation of the number of components (or order) of a finite mixture model
is a long-standing and challenging problem in statistics. We propose the
Group-Sort-Fuse (GSF) procedure -- a new penalized likelihood approach for
simultaneous estimation of the order and mixing measure in multidimensional
finite mixture models. Unlike methods which fit and compare mixtures with
varying orders using criteria involving model complexity, our approach directly
penalizes a continuous function of the model parameters. More specifically,
given a conservative upper bound on the order, the GSF groups and sorts mixture
component parameters to fuse those which are redundant. For a wide range of
finite mixture models, we show that the GSF is consistent in estimating the
true mixture order and achieves the $n^{-1/2}$ convergence rate for parameter
estimation up to polylogarithmic factors. The GSF is implemented for several
univariate and multivariate mixture models in the R package GroupSortFuse. Its
finite sample performance is supported by a thorough simulation study, and its
application is illustrated on two real data examples.
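The sort-and-fuse idea in the abstract can be illustrated with a toy sketch. This is not the GSF procedure itself: GSF penalizes a continuous function of the parameters inside the likelihood, whereas the sketch below applies a hard threshold to an already fitted, overfitted one-dimensional mixture. The function name `sort_and_fuse`, the `threshold` parameter, and the example numbers are all hypothetical and chosen only for illustration.

```python
import numpy as np

def sort_and_fuse(means, weights, threshold):
    """Toy illustration only: sort 1-D component means, then merge
    neighbours whose gap is below `threshold` (a hypothetical tuning value).
    The actual GSF uses a penalized likelihood, not a hard threshold."""
    order = np.argsort(means)
    means = np.asarray(means, dtype=float)[order]
    weights = np.asarray(weights, dtype=float)[order]

    fused_means = [means[0]]
    fused_weights = [weights[0]]
    for m, w in zip(means[1:], weights[1:]):
        if m - fused_means[-1] < threshold:
            # Redundant component: fold it into the current group
            # using a weight-averaged location.
            total = fused_weights[-1] + w
            fused_means[-1] = (fused_weights[-1] * fused_means[-1] + w * m) / total
            fused_weights[-1] = total
        else:
            # Well-separated component: start a new group.
            fused_means.append(m)
            fused_weights.append(w)
    return np.array(fused_means), np.array(fused_weights)

# Made-up overfitted fit: an upper bound of 5 components whose
# locations cluster around two values.
means = [2.9, -1.0, 3.1, -1.1, 3.0]
weights = [0.2, 0.25, 0.2, 0.25, 0.1]
fused_means, fused_weights = sort_and_fuse(means, weights, threshold=0.5)
estimated_order = len(fused_means)
print(estimated_order, fused_means, fused_weights)
```

Sorting first means redundant components sit next to each other, so only adjacent pairs need to be compared. In more than one dimension there is no natural ordering, which is one reason the toy version is restricted to scalar means.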
Related papers
- Adaptive Fuzzy C-Means with Graph Embedding [84.47075244116782]
Fuzzy clustering algorithms can be roughly categorized into two main groups: Fuzzy C-Means (FCM) based methods and mixture model based methods.
We propose a novel FCM-based clustering model that is capable of automatically learning an appropriate membership degree hyperparameter value.
arXiv Detail & Related papers (2024-05-22T08:15:50Z)
- Estimating the Number of Components in Finite Mixture Models via Variational Approximation [8.468023518807408]
We introduce a new method for selecting the number of components in finite mixture models (FMMs) using variational Bayes.
We establish matching upper and lower bounds for the Evidence Lower Bound (ELBO) derived from mean-field (MF) variational approximation.
As a by-product of our proof, we demonstrate that the MF approximation inherits the stable behavior (benefited from model singularity) of the posterior distribution.
arXiv Detail & Related papers (2024-04-25T17:00:24Z)
- A Fourier Approach to the Parameter Estimation Problem for One-dimensional Gaussian Mixture Models [21.436254507839738]
We propose a novel algorithm for estimating parameters in one-dimensional Gaussian mixture models.
We show that our algorithm achieves better scores in likelihood, AIC, and BIC when compared to the EM algorithm.
arXiv Detail & Related papers (2024-04-19T03:53:50Z)
- Gaussian Mixture Solvers for Diffusion Models [84.83349474361204]
We introduce a novel class of SDE-based solvers called GMS for diffusion models.
Our solver outperforms numerous SDE-based solvers in terms of sample quality in image generation and stroke-based synthesis.
arXiv Detail & Related papers (2023-11-02T02:05:38Z)
- Heterogeneous Multi-Task Gaussian Cox Processes [61.67344039414193]
We present a novel extension of multi-task Gaussian Cox processes for modeling heterogeneous correlated tasks jointly.
A MOGP prior over the parameters of the dedicated likelihoods for classification, regression and point process tasks can facilitate sharing of information between heterogeneous tasks.
We derive a mean-field approximation to realize closed-form iterative updates for estimating model parameters.
arXiv Detail & Related papers (2023-08-29T15:01:01Z)
- A non-asymptotic model selection in block-diagonal mixture of polynomial experts models [1.491109220586182]
We introduce a penalized maximum likelihood selection criterion to estimate the unknown conditional density of a regression model.
We provide a strong theoretical guarantee, including a finite-sample oracle satisfied by the penalized maximum likelihood with a Jensen-Kullback-Leibler type loss.
arXiv Detail & Related papers (2021-04-18T21:32:20Z)
- Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM), where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
- Robust Finite Mixture Regression for Heterogeneous Targets [70.19798470463378]
We propose an FMR model that finds sample clusters and jointly models multiple incomplete mixed-type targets simultaneously.
We provide non-asymptotic oracle performance bounds for our model under a high-dimensional learning framework.
The results show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2020-10-12T03:27:07Z)
- Learning Mixtures of Permutations: Groups of Pairwise Comparisons and Combinatorial Method of Moments [8.691957530860675]
We study the widely used Mallows mixture model.
In the high-dimensional setting, we propose an optimal-time algorithm that learns a Mallows mixture of permutations on $n$ elements.
arXiv Detail & Related papers (2020-09-14T23:11:46Z)
- Consistent Estimation of Identifiable Nonparametric Mixture Models from Grouped Observations [84.81435917024983]
This work proposes an algorithm that consistently estimates any identifiable mixture model from grouped observations.
A practical implementation is provided for paired observations, and the approach is shown to outperform existing methods.
arXiv Detail & Related papers (2020-06-12T20:44:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.