Algorithme EM r\'egularis\'e
- URL: http://arxiv.org/abs/2307.01955v1
- Date: Tue, 4 Jul 2023 23:19:25 GMT
- Title: Algorithme EM r\'egularis\'e
- Authors: Pierre Houdouin and Matthieu Jonkcheere and Frederic Pascal
- Abstract summary: This paper presents a regularized version of the EM algorithm that efficiently uses prior knowledge to cope with a small sample size.
Experiments on real data highlight the good performance of the proposed algorithm for clustering purposes.
- Score: 0.0
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Expectation-Maximization (EM) algorithm is a widely used iterative algorithm
for computing maximum likelihood estimate when dealing with Gaussian Mixture
Model (GMM). When the sample size is smaller than the data dimension, this
could lead to a singular or poorly conditioned covariance matrix and, thus, to
performance reduction. This paper presents a regularized version of the EM
algorithm that efficiently uses prior knowledge to cope with a small sample
size. This method aims to maximize a penalized GMM likelihood where regularized
estimation may ensure positive definiteness of covariance matrix updates by
shrinking the estimators towards some structured target covariance matrices.
Finally, experiments on real data highlight the good performance of the
proposed algorithm for clustering purposes
Related papers
- An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute these covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z) - Regularized EM algorithm [9.367612782346205]
We present a regularized EM algorithm for GMM-s that can make efficient use of such prior knowledge as well as cope with LSS situations.
We show that the theoretical guarantees of convergence hold, leading to better performing EM algorithm for structured covariance matrix models or with low sample settings.
arXiv Detail & Related papers (2023-03-27T08:32:20Z) - Regularization and Optimization in Model-Based Clustering [4.096453902709292]
k-means algorithm variants essentially fit a mixture of identical spherical Gaussians to data that vastly deviates from such a distribution.
We develop more effective optimization algorithms for general GMMs, and we combine these algorithms with regularization strategies that avoid overfitting.
These results shed new light on the current status quo between GMM and k-means methods and suggest the more frequent use of general GMMs for data exploration.
arXiv Detail & Related papers (2023-02-05T18:22:29Z) - Stochastic First-Order Learning for Large-Scale Flexibly Tied Gaussian
Mixture Model [3.4546761246181696]
We propose a new optimization algorithm on the manifold of Gaussian Mixture Models (GMMs)
We observe that methods can outperform the expectation-maximization algorithm in terms of attaining better likelihood, needing fewer epochs for convergence, and consuming less time per each epoch.
arXiv Detail & Related papers (2022-12-11T04:24:52Z) - Sparse high-dimensional linear regression with a partitioned empirical
Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are used through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - Dual Optimization for Kolmogorov Model Learning Using Enhanced Gradient
Descent [8.714458129632158]
Kolmogorov model (KM) is an interpretable and predictable representation approach to learning the underlying probabilistic structure of a set of random variables.
We propose a computationally scalable KM learning algorithm, based on the regularized dual optimization combined with enhanced gradient descent (GD) method.
It is shown that the accuracy of logical relation mining for interpretability by using the proposed KM learning algorithm exceeds $80%$.
arXiv Detail & Related papers (2021-07-11T10:33:02Z) - Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank.
Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
arXiv Detail & Related papers (2021-05-23T19:21:55Z) - Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature
Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_2,p$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z) - Revisiting Co-Occurring Directions: Sharper Analysis and Efficient
Algorithm for Sparse Matrices [23.22254890452548]
We study the streaming model for approximate matrix multiplication (AMM)
We are interested in the scenario that the algorithm can only take one pass over the data with limited memory.
The state-of-the-art deterministic sketching algorithm for streaming AMM is the co-occurring directions (COD)
arXiv Detail & Related papers (2020-09-05T15:35:59Z) - Slice Sampling for General Completely Random Measures [74.24975039689893]
We present a novel Markov chain Monte Carlo algorithm for posterior inference that adaptively sets the truncation level using auxiliary slice variables.
The efficacy of the proposed algorithm is evaluated on several popular nonparametric models.
arXiv Detail & Related papers (2020-06-24T17:53:53Z) - Effective Dimension Adaptive Sketching Methods for Faster Regularized
Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT)
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.