EGMM: an Evidential Version of the Gaussian Mixture Model for Clustering
- URL: http://arxiv.org/abs/2010.01333v3
- Date: Wed, 7 Sep 2022 02:20:24 GMT
- Title: EGMM: an Evidential Version of the Gaussian Mixture Model for Clustering
- Authors: Lianmeng Jiao, Thierry Denoeux, Zhun-ga Liu, Quan Pan
- Abstract summary: We propose a new model-based clustering algorithm, called EGMM (evidential GMM), in the theoretical framework of belief functions.
The parameters in EGMM are estimated by a specially designed Expectation-Maximization (EM) algorithm.
The proposed EGMM is as simple as the classical GMM, but generates a more informative evidential partition of the dataset under consideration.
- Score: 22.586481334904793
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Gaussian mixture model (GMM) provides a simple yet principled framework
for clustering, with properties suitable for statistical inference. In this
paper, we propose a new model-based clustering algorithm, called EGMM
(evidential GMM), in the theoretical framework of belief functions to better
characterize cluster-membership uncertainty. With a mass function representing
the cluster membership of each object, an evidential Gaussian mixture
distribution, whose components are defined over the powerset of the desired
clusters, is proposed to model the entire dataset. The parameters in EGMM are
estimated by a specially designed Expectation-Maximization (EM) algorithm. A
validity index allowing automatic determination of the proper number of
clusters is also provided. The proposed EGMM is as simple as the classical
GMM, but generates a more informative evidential partition of the dataset
under consideration. Experiments on synthetic and real datasets show that the
proposed EGMM performs better than other representative clustering algorithms,
and its superiority is further demonstrated by an application to multi-modal
brain image segmentation.
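To make the evidential mixture concrete, the following is a minimal Python sketch of the E-step, assuming, as is common in evidential clustering (e.g. ECM), that the component attached to each non-empty set of clusters is centred at the barycentre of the corresponding cluster centres and that all components share a single covariance matrix. The function names and simplifications are ours, not the authors' implementation.

```python
import numpy as np
from itertools import combinations
from scipy.stats import multivariate_normal

# Hypothetical sketch of an evidential E-step: masses are computed over the
# non-empty subsets of the desired clusters, so each object receives a mass
# function (a credal partition) rather than a single-cluster responsibility.

def nonempty_subsets(c):
    """All non-empty subsets of {0, ..., c-1}."""
    return [s for r in range(1, c + 1) for s in combinations(range(c), r)]

def e_step(X, centers, cov, weights, subsets):
    """X: (n, d) data; centers: (c, d); weights: one mixing weight per subset."""
    n = X.shape[0]
    mass = np.zeros((n, len(subsets)))
    for j, s in enumerate(subsets):
        mu_s = centers[list(s)].mean(axis=0)  # barycentre of the subset's centres
        mass[:, j] = weights[j] * multivariate_normal.pdf(X, mean=mu_s, cov=cov)
    mass /= mass.sum(axis=1, keepdims=True)   # normalise into a mass function
    return mass
```

An M-step would then re-estimate the centres, covariance, and subset weights from these masses, and the validity index mentioned in the abstract would be evaluated over candidate numbers of clusters.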
Related papers
- Adaptive Fuzzy C-Means with Graph Embedding [84.47075244116782]
Fuzzy clustering algorithms can be roughly categorized into two main groups: Fuzzy C-Means (FCM) based methods and mixture model based methods.
We propose a novel FCM based clustering model that is capable of automatically learning an appropriate membership-degree hyperparameter value (a toy FCM update is sketched below).
arXiv Detail & Related papers (2024-05-22T08:15:50Z)
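For comparison with the FCM family, here is a toy single iteration of classical fuzzy C-means; the fuzzifier m below is the membership-degree hyperparameter that the cited paper proposes to learn automatically, fixed here for simplicity (our illustration, not the paper's model).

```python
import numpy as np

# One classical fuzzy C-means iteration (illustrative only): alternate
# between inverse-distance membership updates and weighted centre updates.

def fcm_step(X, centers, m=2.0, eps=1e-12):
    # Pairwise distances between the n points and c centres: shape (n, c).
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + eps
    u = d ** (-2.0 / (m - 1.0))                   # standard FCM membership form
    u /= u.sum(axis=1, keepdims=True)             # each row sums to one
    w = u ** m
    centers = (w.T @ X) / w.sum(axis=0)[:, None]  # fuzzily weighted means
    return u, centers
```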
- Mixture of multilayer stochastic block models for multiview clustering [0.0]
We propose an original method for aggregating multiple clusterings coming from different sources of information.
The identifiability of the model parameters is established and a variational Bayesian EM algorithm is proposed for the estimation of these parameters.
The method is utilized to analyze global food trading networks, leading to structures of interest.
arXiv Detail & Related papers (2024-01-09T17:15:47Z)
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- On the properties of Gaussian Copula Mixture Models [0.0]
The paper presents the mathematical definition of GCMM and explores the properties of its likelihood function.
The paper proposes extended Expectation-Maximization algorithms to estimate the parameters of the mixture of copulas (the generic form of such a density is sketched below).
arXiv Detail & Related papers (2023-05-02T14:59:37Z)
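For orientation, a Gaussian copula mixture density is generally written along the following lines; this is a textbook-style form under our own notation, not necessarily the paper's exact definition.

```latex
f(\mathbf{x}) = \sum_{k=1}^{K} \pi_k \,
  c\bigl(F_{k1}(x_1), \ldots, F_{kd}(x_d); R_k\bigr)
  \prod_{j=1}^{d} f_{kj}(x_j)
```

Here \(\pi_k\) are the mixing weights, \(c(\cdot\,; R_k)\) is the Gaussian copula density with correlation matrix \(R_k\), and \(F_{kj}\) and \(f_{kj}\) are the marginal CDFs and densities of component \(k\).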
- Sparse and geometry-aware generalisation of the mutual information for joint discriminative clustering and feature selection [19.066989850964756]
We introduce a discriminative clustering model that seeks to maximise a geometry-aware generalisation of the mutual information called GEMINI.
This algorithm avoids the burden of feature exploration and scales easily to high-dimensional data and large sample sizes while designing only a discriminative clustering model.
Our results show that Sparse GEMINI is a competitive algorithm and has the ability to select relevant subsets of variables with respect to the clustering without using relevance criteria or prior hypotheses.
arXiv Detail & Related papers (2023-02-07T10:52:04Z)
- A distribution-free mixed-integer optimization approach to hierarchical modelling of clustered and longitudinal data [0.0]
We introduce an innovative algorithm that evaluates cluster effects for new data points, thereby increasing the robustness and precision of this model.
The inferential and predictive efficacy of this approach is further illustrated through its application in student scoring and protein expression.
arXiv Detail & Related papers (2023-02-06T23:34:51Z)
- clusterBMA: Bayesian model averaging for clustering [1.2021605201770345]
We introduce clusterBMA, a method that enables weighted model averaging across results from unsupervised clustering algorithms.
We use internal clustering validation criteria to develop an approximation of the posterior model probability, which is used to weight the results from each model.
In addition to outperforming other ensemble clustering methods on simulated data, clusterBMA offers unique features, including probabilistic allocation to averaged clusters (a minimal sketch of the weighting idea follows).
arXiv Detail & Related papers (2022-09-09T04:55:20Z)
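A minimal sketch of the weighted-averaging idea, assuming validation scores are simply rescaled into model weights; the paper's actual approximation of the posterior model probability may differ, and cluster labels are assumed to be pre-aligned across models.

```python
import numpy as np

# Illustrative sketch (ours, not the paper's exact approximation): internal
# validation scores are mapped to model weights, which then average the
# per-model soft allocation matrices. Cluster labels must be pre-aligned.

def bma_allocations(scores, allocations):
    """
    scores      : length-M sequence, one internal validation score per
                  clustering model (higher is better).
    allocations : list of M arrays, each of shape (n, K), holding soft
                  cluster-membership probabilities from one model.
    """
    s = np.asarray(scores, dtype=float)
    w = np.exp(s - s.max())        # softmax-style weights: one simple choice
    w /= w.sum()                   # acts as an approximate model posterior
    return sum(wi * a for wi, a in zip(w, allocations))
```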
- Tight integration of neural- and clustering-based diarization through deep unfolding of infinite Gaussian mixture model [84.57667267657382]
This paper introduces a trainable clustering algorithm into the integration framework.
Speaker embeddings are optimized during training so that they better fit the iGMM clustering.
Experimental results show that the proposed approach outperforms the conventional approach in terms of diarization error rate.
arXiv Detail & Related papers (2022-02-14T07:45:21Z)
- Cauchy-Schwarz Regularized Autoencoder [68.80569889599434]
Variational autoencoders (VAE) are a powerful and widely-used class of generative models.
We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for GMMs.
Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis (a sketch of the closed-form divergence between two GMMs follows).
arXiv Detail & Related papers (2021-01-06T17:36:26Z)
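The analytic tractability comes from the standard Gaussian identity ∫N(x; m1, S1) N(x; m2, S2) dx = N(m1; m2, S1 + S2); below is a sketch of the resulting closed-form Cauchy-Schwarz divergence between two GMMs, with variable names of our choosing.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Closed-form Cauchy-Schwarz divergence between two Gaussian mixtures,
# D_CS(p, q) = -log( ∫pq / sqrt(∫p² ∫q²) ), using the identity
# ∫ N(x; m1, S1) N(x; m2, S2) dx = N(m1; m2, S1 + S2).

def gmm_overlap(w1, mu1, S1, w2, mu2, S2):
    """∫ p(x) q(x) dx for mixtures p = (w1, mu1, S1) and q = (w2, mu2, S2)."""
    return sum(
        wi * wj * multivariate_normal.pdf(mi, mean=mj, cov=Si + Sj)
        for wi, mi, Si in zip(w1, mu1, S1)
        for wj, mj, Sj in zip(w2, mu2, S2)
    )

def cs_divergence(w1, mu1, S1, w2, mu2, S2):
    """Non-negative; zero exactly when the two mixtures coincide."""
    pq = gmm_overlap(w1, mu1, S1, w2, mu2, S2)
    pp = gmm_overlap(w1, mu1, S1, w1, mu1, S1)
    qq = gmm_overlap(w2, mu2, S2, w2, mu2, S2)
    return -np.log(pq / np.sqrt(pp * qq))
```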
- Kernel learning approaches for summarising and combining posterior similarity matrices [68.8204255655161]
We build upon the notion of the posterior similarity matrix (PSM) in order to suggest new approaches for summarising the output of MCMC algorithms for Bayesian clustering models.
A key contribution of our work is the observation that PSMs are positive semi-definite, and hence can be used to define probabilistically-motivated kernel matrices (a minimal PSM construction is sketched below).
arXiv Detail & Related papers (2020-09-27T14:16:14Z)
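Under the usual definition, entry (i, j) of the PSM is the posterior frequency with which objects i and j land in the same cluster across MCMC samples; a minimal construction (our illustration, not the authors' code) is:

```python
import numpy as np

# Build a posterior similarity matrix from MCMC cluster allocations. Each
# per-sample co-clustering matrix equals Z @ Z.T for a one-hot allocation
# matrix Z, hence is positive semi-definite; the PSM, an average of such
# matrices, inherits this and can therefore be used as a kernel matrix.

def posterior_similarity(samples):
    """samples: (T, n) integer array of cluster labels over T MCMC draws."""
    T, n = samples.shape
    psm = np.zeros((n, n))
    for labels in samples:
        psm += labels[:, None] == labels[None, :]  # co-clustering indicator
    return psm / T
```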
- Clustering Binary Data by Application of Combinatorial Optimization Heuristics [52.77024349608834]
We study clustering methods for binary data, first defining aggregation criteria that measure the compactness of clusters.
Five new and original methods are introduced, using neighborhoods and population behavior optimization metaheuristics.
On a set of 16 data tables generated by a quasi-Monte Carlo experiment, one of the aggregation criteria is compared, using the L1 dissimilarity, with hierarchical clustering and a version of k-means, partitioning around medoids (PAM); a small helper for the L1 dissimilarity on binary data is sketched below.
arXiv Detail & Related papers (2020-01-06T23:33:31Z)
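As a small illustration (our own helpers, not the paper's code): for 0/1 data the L1 distance between rows reduces to the Hamming distance, and the PAM assignment step attaches each object to its nearest medoid.

```python
import numpy as np

# L1 dissimilarity for binary data plus the PAM assignment step; both are
# generic helpers sketched for illustration.

def l1_dissimilarity(X):
    """X: (n, p) binary matrix; returns the (n, n) L1 distance matrix."""
    X = X.astype(int)
    return np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)

def assign_to_medoids(D, medoids):
    """Assign each object to its closest medoid (indices into the data)."""
    medoids = np.asarray(medoids)
    return medoids[np.argmin(D[:, medoids], axis=1)]
```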
This list is automatically generated from the titles and abstracts of the papers on this site.