Mixture of multilayer stochastic block models for multiview clustering
- URL: http://arxiv.org/abs/2401.04682v1
- Date: Tue, 9 Jan 2024 17:15:47 GMT
- Title: Mixture of multilayer stochastic block models for multiview clustering
- Authors: Kylliann De Santiago, Marie Szafranski, Christophe Ambroise
- Abstract summary: We propose an original method for aggregating multiple clusterings coming from different sources of information.
The identifiability of the model parameters is established and a variational Bayesian EM algorithm is proposed for the estimation of these parameters.
The method is utilized to analyze global food trading networks, leading to structures of interest.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we propose an original method for aggregating multiple
clusterings coming from different sources of information. Each partition is
encoded by a co-membership matrix between observations. Our approach uses a
mixture of multilayer Stochastic Block Models (SBM) to group co-membership
matrices with similar information into components and to partition observations
into different clusters, taking into account their specificities within the
components. The identifiability of the model parameters is established and a
variational Bayesian EM algorithm is proposed for the estimation of these
parameters. The Bayesian framework allows for selecting an optimal number of
clusters and components. The proposed approach is compared, using synthetic data,
with consensus clustering and tensor-based algorithms for community detection
in large-scale complex networks. Finally, the method is utilized to analyze
global food trading networks, leading to structures of interest.
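To make the encoding step described in the abstract concrete, the sketch below builds one binary co-membership matrix per partition and stacks them as the layers of a multilayer network. It is a minimal illustration under stated assumptions, not the authors' implementation; the `co_membership` helper and the toy partitions are invented for the example.

```python
import numpy as np

def co_membership(labels):
    """Binary co-membership matrix: entry (i, j) is 1 when observations
    i and j belong to the same cluster in this partition."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(int)

# Three toy partitions of the same six observations, standing in for
# clusterings obtained from different sources of information.
partitions = [
    [0, 0, 1, 1, 2, 2],
    [0, 0, 0, 1, 1, 1],
    [1, 0, 1, 0, 2, 2],
]

# One co-membership matrix per partition, stacked as the layers of a
# multilayer network: a V x n x n binary array that a mixture of
# multilayer SBMs can then group into components and cluster.
layers = np.stack([co_membership(p) for p in partitions])
print(layers.shape)  # (3, 6, 6)
```

Each layer is symmetric and binary, which is what makes an SBM-type model a natural choice for describing agreement patterns across the sources.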
Related papers
- Similarity and Dissimilarity Guided Co-association Matrix Construction for Ensemble Clustering [22.280221709474105]
We propose the Similarity and Dissimilarity Guided Co-association matrix (SDGCA) to achieve ensemble clustering.
First, we introduce normalized ensemble entropy to estimate the quality of each cluster, and construct a similarity matrix based on this estimation.
We then employ a random walk to explore the high-order proximity of base clusterings and construct a dissimilarity matrix.
arXiv Detail & Related papers (2024-11-01T08:10:28Z)
- Multi-view clustering integrating anchor attribute and structural information [1.4750411676439674]
This paper introduces a novel multi-view clustering algorithm, AAS.
It utilizes a two-step proximity approach via anchors in each view, integrating attribute and directed structural information.
arXiv Detail & Related papers (2024-10-29T03:53:03Z)
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches the instance-specific lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- A parallelizable model-based approach for marginal and multivariate clustering [0.0]
This paper develops a clustering method that takes advantage of the sturdiness of model-based clustering.
We tackle the computational burden of joint modelling by specifying a finite mixture model per margin that allows each margin to have a different number of clusters.
The proposed approach is computationally appealing as well as more tractable for moderate to high dimensions than a 'full' (joint) model-based clustering approach.
arXiv Detail & Related papers (2022-12-07T23:54:41Z)
- clusterBMA: Bayesian model averaging for clustering [1.2021605201770345]
We introduce clusterBMA, a method that enables weighted model averaging across results from unsupervised clustering algorithms.
We use clustering internal validation criteria to develop an approximation of the posterior model probability, used for weighting the results from each model.
In addition to outperforming other ensemble clustering methods on simulated data, clusterBMA offers unique features including probabilistic allocation to averaged clusters.
arXiv Detail & Related papers (2022-09-09T04:55:20Z)
- Personalized Federated Learning via Convex Clustering [72.15857783681658]
We propose a family of algorithms for personalized federated learning with locally convex user costs.
The proposed framework is based on a generalization of convex clustering in which the differences between different users' models are penalized.
arXiv Detail & Related papers (2022-02-01T19:25:31Z)
- Determinantal consensus clustering [77.34726150561087]
We propose the use of determinantal point processes (DPPs) for the random restart of clustering algorithms.
DPPs favor diversity of the center points within subsets.
We show through simulations that, contrary to DPPs, uniform random sampling fails both to ensure diversity and to obtain a good coverage of all data facets.
arXiv Detail & Related papers (2021-02-07T23:48:24Z)
- Clustering Ensemble Meets Low-rank Tensor Approximation [50.21581880045667]
This paper explores the clustering ensemble problem, which aims to combine multiple base clusterings to achieve better performance than any individual one.
We propose a novel low-rank tensor approximation-based method to solve the problem from a global perspective.
Experimental results over 7 benchmark data sets show that the proposed model achieves a breakthrough in clustering performance, compared with 12 state-of-the-art methods.
arXiv Detail & Related papers (2020-12-16T13:01:37Z)
- Kernel learning approaches for summarising and combining posterior similarity matrices [68.8204255655161]
We build upon the notion of the posterior similarity matrix (PSM) in order to suggest new approaches for summarising the output of MCMC algorithms for Bayesian clustering models.
A key contribution of our work is the observation that PSMs are positive semi-definite, and hence can be used to define probabilistically-motivated kernel matrices. (A generic sketch of the PSM computation appears after this list.)
arXiv Detail & Related papers (2020-09-27T14:16:14Z)
- Conjoined Dirichlet Process [63.89763375457853]
We develop a novel, non-parametric probabilistic biclustering method based on Dirichlet processes to identify biclusters with strong co-occurrence in both rows and columns.
We apply our method to two different applications, text mining and gene expression analysis, and demonstrate that our method improves bicluster extraction in many settings compared to existing approaches.
arXiv Detail & Related papers (2020-02-08T19:41:23Z)
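As a companion to the posterior similarity matrix (PSM) entry above, here is a minimal sketch of how a PSM is typically computed from MCMC cluster allocations and why it can serve as a kernel matrix: it is an average of binary co-membership matrices, hence symmetric and positive semi-definite. The function name and toy samples are illustrative assumptions, not code from the cited paper.

```python
import numpy as np

def posterior_similarity_matrix(allocations):
    """PSM: entry (i, j) is the fraction of MCMC samples in which
    observations i and j are allocated to the same cluster."""
    allocations = np.asarray(allocations)                     # shape (S, n)
    co = allocations[:, :, None] == allocations[:, None, :]   # shape (S, n, n)
    return co.mean(axis=0)

# Toy posterior samples of cluster allocations for five observations.
samples = [
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
]

psm = posterior_similarity_matrix(samples)
# Each co-membership matrix is positive semi-definite, so their average
# is as well; the PSM can therefore be used directly as a Gram matrix.
print(psm)
print(np.linalg.eigvalsh(psm).min() >= -1e-10)  # True: no negative eigenvalues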