An Adaptive EM Accelerator for Unsupervised Learning of Gaussian Mixture
Models
- URL: http://arxiv.org/abs/2009.12703v1
- Date: Sat, 26 Sep 2020 22:55:44 GMT
- Title: An Adaptive EM Accelerator for Unsupervised Learning of Gaussian Mixture
Models
- Authors: Truong Nguyen, Guangye Chen, and Luis Chacon
- Abstract summary: We propose an Anderson Acceleration scheme for the adaptive Expectation-Maximization (EM) algorithm for unsupervised learning.
The proposed algorithm is able to determine the optimal number of mixture components autonomously, and converges to the optimal solution much faster than its non-accelerated version.
- Score: 0.7340845393655052
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose an Anderson Acceleration (AA) scheme for the adaptive
Expectation-Maximization (EM) algorithm for the unsupervised learning of a finite
mixture model from multivariate data (Figueiredo and Jain 2002). The proposed
algorithm is able to determine the optimal number of mixture components
autonomously, and converges to the optimal solution much faster than its
non-accelerated version. The success of the AA-based algorithm stems from
several developments rather than a single breakthrough (and without these, our
tests demonstrate that AA fails catastrophically). To begin, we ensure the
monotonicity of the likelihood function (a key feature of the standard EM
algorithm) with a recently proposed monotonicity-control algorithm (Henderson
and Varadhan 2019), enhanced by a novel monotonicity test with little overhead.
We propose nimble strategies for AA to preserve the positive definiteness of
the Gaussian weights and covariance matrices strictly, and to conserve up to
the second moments of the observed data set exactly. Finally, we employ a
K-means clustering algorithm using the gap statistic to avoid excessively
overestimating the initial number of components, thereby maximizing
performance. We demonstrate the accuracy and efficiency of the algorithm with
several synthetic data sets that are mixtures of Gaussian distributions with a
known number of components, as well as data sets generated from
particle-in-cell simulations. Our numerical results demonstrate speed-ups with
respect to non-accelerated EM of up to 60X when the exact number of mixture
components is known, and between a factor of a few and more than an order of
magnitude with component adaptivity.
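To make the pieces above concrete, the sketches below are our own illustrations in Python, not code from the paper; all function names and parameters are hypothetical. First, the EM fixed-point map for a GMM that the accelerator wraps:

```python
import numpy as np
from scipy.stats import multivariate_normal

def log_likelihood(X, weights, means, covs):
    # Total log-likelihood of the data under the current mixture.
    dens = np.stack([w * multivariate_normal.pdf(X, m, c)
                     for w, m, c in zip(weights, means, covs)], axis=1)
    return np.sum(np.log(dens.sum(axis=1)))

def em_step(X, weights, means, covs):
    # E-step: responsibilities r[n, k] = P(component k | x_n).
    dens = np.stack([w * multivariate_normal.pdf(X, m, c)
                     for w, m, c in zip(weights, means, covs)], axis=1)
    r = dens / dens.sum(axis=1, keepdims=True)
    # M-step: closed-form updates of weights, means, and covariances.
    Nk = r.sum(axis=0)
    new_weights = Nk / len(X)
    new_means = (r.T @ X) / Nk[:, None]
    new_covs = np.array([(r[:, k, None] * (X - new_means[k])).T
                         @ (X - new_means[k]) / Nk[k]
                         for k in range(len(weights))])
    return new_weights, new_means, new_covs
```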
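Anderson Acceleration can then be applied to the flattened parameter vector, with a monotonicity guard in the spirit of Henderson and Varadhan (2019): the AA iterate is accepted only when it does not fall below the plain EM iterate's log-likelihood. The eigenvalue clipping and weight renormalization below are simple stand-in safeguards, not the paper's exact strategies for keeping the weights positive and the covariances strictly positive definite:

```python
def nearest_spd(C, eps=1e-8):
    # Project a symmetric matrix onto the SPD cone by clipping eigenvalues.
    C = (C + C.T) / 2
    vals, vecs = np.linalg.eigh(C)
    return vecs @ np.diag(np.clip(vals, eps, None)) @ vecs.T

def anderson_em(X, weights, means, covs, m=3, iters=200, tol=1e-8):
    K, D = means.shape
    pack = lambda w, mu, S: np.concatenate([w, mu.ravel(), S.ravel()])
    def unpack(theta):
        w = np.maximum(theta[:K], 1e-12)
        w = w / w.sum()                                   # positive, normalized weights
        mu = theta[K:K + K * D].reshape(K, D)
        S = theta[K + K * D:].reshape(K, D, D)
        S = np.array([nearest_spd(Sk) for Sk in S])       # strictly SPD covariances
        return w, mu, S
    theta = pack(weights, means, covs)
    ll = log_likelihood(X, *unpack(theta))
    gs, fs = [], []                                       # AA histories (depth m)
    for _ in range(iters):
        g = pack(*em_step(X, *unpack(theta)))             # one EM application
        f = g - theta                                     # fixed-point residual
        gs.append(g); fs.append(f)
        gs, fs = gs[-(m + 1):], fs[-(m + 1):]
        if len(fs) > 1:
            # Type-II AA: least-squares combination of residual differences.
            dF = np.diff(np.array(fs), axis=0).T
            gamma = np.linalg.lstsq(dF, f, rcond=None)[0]
            theta_aa = g - np.diff(np.array(gs), axis=0).T @ gamma
        else:
            theta_aa = g
        # Monotonicity control: keep AA only if it beats the EM iterate.
        ll_aa = log_likelihood(X, *unpack(theta_aa))
        ll_em = log_likelihood(X, *unpack(g))
        theta_new, ll_new = (theta_aa, ll_aa) if ll_aa >= ll_em else (g, ll_em)
        converged = abs(ll_new - ll) < tol * abs(ll)
        theta, ll = theta_new, ll_new
        if converged:
            break
    return unpack(theta), ll
```

The fallback to the EM iterate keeps the log-likelihood non-decreasing, since the plain EM step is itself monotone.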
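The exact conservation of the data's first and second moments can be imposed, for example, by an affine transformation built from Cholesky factors; this is a standard moment-matching construction, assumed here rather than taken from the paper:

```python
def match_moments(weights, means, covs, x_mean, x_cov):
    # Affinely transform the mixture so its mean and covariance equal the
    # empirical moments (x_mean, x_cov) of the observed data exactly.
    mu_mix = weights @ means
    d = means - mu_mix
    cov_mix = np.einsum('k,kij->ij', weights, covs) + d.T @ (weights[:, None] * d)
    A = np.linalg.cholesky(x_cov) @ np.linalg.inv(np.linalg.cholesky(cov_mix))
    new_means = x_mean + d @ A.T                      # mixture mean becomes x_mean
    new_covs = np.array([A @ S @ A.T for S in covs])  # mixture covariance becomes x_cov
    return new_means, new_covs
```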
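Finally, the gap statistic of Tibshirani, Walther, and Hastie (2001) used to initialize the number of components can be sketched with scikit-learn's KMeans; k_max, n_ref, and the uniform bounding-box reference below are illustrative choices:

```python
from sklearn.cluster import KMeans

def gap_statistic_k(X, k_max=10, n_ref=10, seed=0):
    # Compare log within-cluster dispersion on the data with its expectation
    # under uniform reference draws; return the smallest k whose gap satisfies
    # gap(k) >= gap(k+1) - s(k+1).
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    def log_wk(Y, k):
        return np.log(KMeans(n_clusters=k, n_init=10,
                             random_state=seed).fit(Y).inertia_)
    gaps, sks = [], []
    for k in range(1, k_max + 1):
        ref = np.array([log_wk(rng.uniform(lo, hi, size=X.shape), k)
                        for _ in range(n_ref)])
        gaps.append(ref.mean() - log_wk(X, k))
        sks.append(ref.std() * np.sqrt(1 + 1.0 / n_ref))
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - sks[k]:
            return k
    return k_max
```

Seeding the adaptive EM with a K chosen this way, rather than a deliberately large overestimate, is what the abstract credits with avoiding excessive overestimation of the initial component count.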
Related papers
- A Fourier Approach to the Parameter Estimation Problem for One-dimensional Gaussian Mixture Models [21.436254507839738]
We propose a novel algorithm for estimating parameters in one-dimensional Gaussian mixture models.
We show that our algorithm achieves better scores in likelihood, AIC, and BIC when compared to the EM algorithm.
arXiv Detail & Related papers (2024-04-19T03:53:50Z) - Fast Semisupervised Unmixing Using Nonconvex Optimization [80.11512905623417]
We introduce a novel model for semi-supervised/library-based unmixing, solved via nonconvex optimization.
We demonstrate the efficacy of alternating optimization methods for sparse unsupervised unmixing.
arXiv Detail & Related papers (2024-01-23T10:07:41Z) - An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding explicit computation of the model's covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z) - Detection of Anomalies in Multivariate Time Series Using Ensemble
Techniques [3.2422067155309806]
We propose an ensemble technique that combines the outputs of multiple base models to reach a final decision.
A semi-supervised approach using a Logistic Regressor to combine the base models' outputs is also proposed.
The performance improvement in terms of anomaly detection accuracy reaches 2% for the unsupervised and at least 10% for the semi-supervised models.
arXiv Detail & Related papers (2023-08-06T17:51:22Z) - Algorithme EM régularisé (Regularized EM Algorithm) [0.0]
This paper presents a regularized version of the EM algorithm that efficiently uses prior knowledge to cope with a small sample size.
Experiments on real data highlight the good performance of the proposed algorithm for clustering purposes.
arXiv Detail & Related papers (2023-07-04T23:19:25Z) - A distribution-free mixed-integer optimization approach to hierarchical modelling of clustered and longitudinal data [0.0]
We introduce an innovative algorithm that evaluates cluster effects for new data points, thereby increasing the robustness and precision of this model.
The inferential and predictive efficacy of this approach is further illustrated through its application in student scoring and protein expression.
arXiv Detail & Related papers (2023-02-06T23:34:51Z) - Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z) - A Non-Parametric Bootstrap for Spectral Clustering [0.7673339435080445]
We develop two novel algorithms that incorporate the spectral decomposition of the data matrix and a non-parametric bootstrap sampling scheme.
Our techniques are more consistent in their convergence when compared to other bootstrapped algorithms that fit finite mixture models.
arXiv Detail & Related papers (2022-09-13T08:37:05Z) - Plug-And-Play Learned Gaussian-mixture Approximate Message Passing [71.74028918819046]
We propose a plug-and-play compressed sensing (CS) recovery algorithm suitable for any i.i.d. source prior.
Our algorithm builds upon Borgerding's learned AMP (LAMP), yet significantly improves it by adopting a universal denoising function within the algorithm.
Numerical evaluation shows that the L-GM-AMP algorithm achieves state-of-the-art performance without any knowledge of the source prior.
arXiv Detail & Related papers (2020-11-18T16:40:45Z) - Stochastic Hard Thresholding Algorithms for AUC Maximization [49.00683387735522]
We develop a stochastic hard thresholding algorithm for AUC maximization in imbalanced classification.
We conduct experiments to show the efficiency and effectiveness of the proposed algorithms.
arXiv Detail & Related papers (2020-11-04T16:49:29Z) - Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
arXiv Detail & Related papers (2020-02-21T17:45:32Z)