An Adaptive EM Accelerator for Unsupervised Learning of Gaussian Mixture
Models
- URL: http://arxiv.org/abs/2009.12703v1
- Date: Sat, 26 Sep 2020 22:55:44 GMT
- Title: An Adaptive EM Accelerator for Unsupervised Learning of Gaussian Mixture
Models
- Authors: Truong Nguyen, Guangye Chen, and Luis Chacon
- Abstract summary: We propose an Anderson Acceleration scheme for the adaptive Expectation-Maximization (EM) algorithm for unsupervised learning.
The proposed algorithm is able to determine the optimal number of mixture components autonomously, and converges to the optimal solution much faster than its non-accelerated version.
- Score: 0.7340845393655052
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose an Anderson Acceleration (AA) scheme for the adaptive
Expectation-Maximization (EM) algorithm for the unsupervised learning of a finite
mixture model from multivariate data (Figueiredo and Jain 2002). The proposed
algorithm is able to determine the optimal number of mixture components
autonomously, and converges to the optimal solution much faster than its
non-accelerated version. The success of the AA-based algorithm stems from
several developments rather than a single breakthrough (and without these, our
tests demonstrate that AA fails catastrophically). To begin, we ensure the
monotonicity of the likelihood function (a key feature of the standard EM
algorithm) with a recently proposed monotonicity-control algorithm (Henderson
and Varadhan 2019), enhanced by a novel monotonicity test with little overhead.
We propose nimble strategies for AA to preserve the positive definiteness of
the Gaussian weights and covariance matrices strictly, and to conserve up to
the second moments of the observed data set exactly. Finally, we employ a
K-means clustering algorithm using the gap statistic to avoid excessively
overestimating the initial number of components, thereby maximizing
performance. We demonstrate the accuracy and efficiency of the algorithm with
several synthetic data sets that are mixtures of Gaussian distributions with a
known number of components, as well as data sets generated from
particle-in-cell simulations. Our numerical results demonstrate speed-ups with
respect to non-accelerated EM of up to 60X when the exact number of mixture
components is known, and between a factor of a few and more than an order of
magnitude with component adaptivity.
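To make the pieces above concrete, the sketches below are our own illustrations in Python, not code from the paper; all function names and parameters are hypothetical. First, the EM fixed-point map for a GMM that the accelerator wraps:

```python
import numpy as np
from scipy.stats import multivariate_normal

def log_likelihood(X, weights, means, covs):
    # Total log-likelihood of the data under the current mixture.
    dens = np.stack([w * multivariate_normal.pdf(X, m, c)
                     for w, m, c in zip(weights, means, covs)], axis=1)
    return np.sum(np.log(dens.sum(axis=1)))

def em_step(X, weights, means, covs):
    # E-step: responsibilities r[n, k] = P(component k | x_n).
    dens = np.stack([w * multivariate_normal.pdf(X, m, c)
                     for w, m, c in zip(weights, means, covs)], axis=1)
    r = dens / dens.sum(axis=1, keepdims=True)
    # M-step: closed-form updates of weights, means, and covariances.
    Nk = r.sum(axis=0)
    new_weights = Nk / len(X)
    new_means = (r.T @ X) / Nk[:, None]
    new_covs = np.array([(r[:, k, None] * (X - new_means[k])).T
                         @ (X - new_means[k]) / Nk[k]
                         for k in range(len(weights))])
    return new_weights, new_means, new_covs
```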
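Anderson Acceleration can then be applied to the flattened parameter vector, with a monotonicity guard in the spirit of Henderson and Varadhan (2019): the AA iterate is accepted only when it does not fall below the plain EM iterate's log-likelihood. The eigenvalue clipping and weight renormalization below are simple stand-in safeguards, not the paper's exact strategies for keeping the weights positive and the covariances strictly positive definite:

```python
def nearest_spd(C, eps=1e-8):
    # Project a symmetric matrix onto the SPD cone by clipping eigenvalues.
    C = (C + C.T) / 2
    vals, vecs = np.linalg.eigh(C)
    return vecs @ np.diag(np.clip(vals, eps, None)) @ vecs.T

def anderson_em(X, weights, means, covs, m=3, iters=200, tol=1e-8):
    K, D = means.shape
    pack = lambda w, mu, S: np.concatenate([w, mu.ravel(), S.ravel()])
    def unpack(theta):
        w = np.maximum(theta[:K], 1e-12)
        w = w / w.sum()                                   # positive, normalized weights
        mu = theta[K:K + K * D].reshape(K, D)
        S = theta[K + K * D:].reshape(K, D, D)
        S = np.array([nearest_spd(Sk) for Sk in S])       # strictly SPD covariances
        return w, mu, S
    theta = pack(weights, means, covs)
    ll = log_likelihood(X, *unpack(theta))
    gs, fs = [], []                                       # AA histories (depth m)
    for _ in range(iters):
        g = pack(*em_step(X, *unpack(theta)))             # one EM application
        f = g - theta                                     # fixed-point residual
        gs.append(g); fs.append(f)
        gs, fs = gs[-(m + 1):], fs[-(m + 1):]
        if len(fs) > 1:
            # Type-II AA: least-squares combination of residual differences.
            dF = np.diff(np.array(fs), axis=0).T
            gamma = np.linalg.lstsq(dF, f, rcond=None)[0]
            theta_aa = g - np.diff(np.array(gs), axis=0).T @ gamma
        else:
            theta_aa = g
        # Monotonicity control: keep AA only if it beats the EM iterate.
        ll_aa = log_likelihood(X, *unpack(theta_aa))
        ll_em = log_likelihood(X, *unpack(g))
        theta_new, ll_new = (theta_aa, ll_aa) if ll_aa >= ll_em else (g, ll_em)
        converged = abs(ll_new - ll) < tol * abs(ll)
        theta, ll = theta_new, ll_new
        if converged:
            break
    return unpack(theta), ll
```

The fallback to the EM iterate keeps the log-likelihood non-decreasing, since the plain EM step is itself monotone.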
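The exact conservation of the data's first and second moments can be imposed, for example, by an affine transformation built from Cholesky factors; this is a standard moment-matching construction, assumed here rather than taken from the paper:

```python
def match_moments(weights, means, covs, x_mean, x_cov):
    # Affinely transform the mixture so its mean and covariance equal the
    # empirical moments (x_mean, x_cov) of the observed data exactly.
    mu_mix = weights @ means
    d = means - mu_mix
    cov_mix = np.einsum('k,kij->ij', weights, covs) + d.T @ (weights[:, None] * d)
    A = np.linalg.cholesky(x_cov) @ np.linalg.inv(np.linalg.cholesky(cov_mix))
    new_means = x_mean + d @ A.T                      # mixture mean becomes x_mean
    new_covs = np.array([A @ S @ A.T for S in covs])  # mixture covariance becomes x_cov
    return new_means, new_covs
```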
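Finally, the gap statistic of Tibshirani, Walther, and Hastie (2001) used to initialize the number of components can be sketched with scikit-learn's KMeans; k_max, n_ref, and the uniform bounding-box reference below are illustrative choices:

```python
from sklearn.cluster import KMeans

def gap_statistic_k(X, k_max=10, n_ref=10, seed=0):
    # Compare log within-cluster dispersion on the data with its expectation
    # under uniform reference draws; return the smallest k whose gap satisfies
    # gap(k) >= gap(k+1) - s(k+1).
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    def log_wk(Y, k):
        return np.log(KMeans(n_clusters=k, n_init=10,
                             random_state=seed).fit(Y).inertia_)
    gaps, sks = [], []
    for k in range(1, k_max + 1):
        ref = np.array([log_wk(rng.uniform(lo, hi, size=X.shape), k)
                        for _ in range(n_ref)])
        gaps.append(ref.mean() - log_wk(X, k))
        sks.append(ref.std() * np.sqrt(1 + 1.0 / n_ref))
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - sks[k]:
            return k
    return k_max
```

Seeding the adaptive EM with a K chosen this way, rather than a deliberately large overestimate, is what the abstract credits with avoiding excessive overestimation of the initial component count.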
Related papers
- A Fourier Approach to the Parameter Estimation Problem for One-dimensional Gaussian Mixture Models [21.436254507839738]
We propose a novel algorithm for estimating parameters in one-dimensional Gaussian mixture models.
We show that our algorithm achieves better scores in likelihood, AIC, and BIC when compared to the EM algorithm.
arXiv Detail & Related papers (2024-04-19T03:53:50Z) - Fast Semisupervised Unmixing Using Nonconvex Optimization [80.11512905623417]
We introduce a novel model for semi-supervised/library-based unmixing, solved via nonconvex optimization.
We demonstrate the efficacy of alternating optimization methods for sparse unsupervised unmixing.
arXiv Detail & Related papers (2024-01-23T10:07:41Z) - An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding explicit computation of the model's covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z) - Detection of Anomalies in Multivariate Time Series Using Ensemble
Techniques [3.2422067155309806]
We propose an ensemble technique that combines the outputs of multiple base models to reach a final decision.
A semi-supervised approach using a Logistic Regressor to combine the base models' outputs is also proposed.
The performance improvement in terms of anomaly detection accuracy reaches 2% for the unsupervised and at least 10% for the semi-supervised models.
arXiv Detail & Related papers (2023-08-06T17:51:22Z) - Algorithme EM régularisé (Regularized EM Algorithm) [0.0]
This paper presents a regularized version of the EM algorithm that efficiently uses prior knowledge to cope with a small sample size.
Experiments on real data highlight the good performance of the proposed algorithm for clustering purposes.
arXiv Detail & Related papers (2023-07-04T23:19:25Z) - A distribution-free mixed-integer optimization approach to hierarchical modelling of clustered and longitudinal data [0.0]
We introduce an innovative algorithm that evaluates cluster effects for new data points, thereby increasing the robustness and precision of this model.
The inferential and predictive efficacy of this approach is further illustrated through its application in student scoring and protein expression.
arXiv Detail & Related papers (2023-02-06T23:34:51Z) - Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z) - A Non-Parametric Bootstrap for Spectral Clustering [0.7673339435080445]
We develop two novel algorithms that incorporate the spectral decomposition of the data matrix and a non-parametric bootstrap sampling scheme.
Our techniques are more consistent in their convergence when compared to other bootstrapped algorithms that fit finite mixture models.
arXiv Detail & Related papers (2022-09-13T08:37:05Z) - Plug-And-Play Learned Gaussian-mixture Approximate Message Passing [71.74028918819046]
We propose a plug-and-play compressed sensing (CS) recovery algorithm suitable for any i.i.d. source prior.
Our algorithm builds upon Borgerding's learned AMP (LAMP), yet significantly improves it by adopting a universal denoising function within the algorithm.
Numerical evaluation shows that the L-GM-AMP algorithm achieves state-of-the-art performance without any knowledge of the source prior.
arXiv Detail & Related papers (2020-11-18T16:40:45Z) - Stochastic Hard Thresholding Algorithms for AUC Maximization [49.00683387735522]
We develop a stochastic hard thresholding algorithm for AUC maximization in imbalanced classification.
We conduct experiments to show the efficiency and effectiveness of the proposed algorithms.
arXiv Detail & Related papers (2020-11-04T16:49:29Z) - Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
arXiv Detail & Related papers (2020-02-21T17:45:32Z)