On spectral algorithms for community detection in stochastic blockmodel
graphs with vertex covariates
- URL: http://arxiv.org/abs/2007.02156v3
- Date: Wed, 4 Aug 2021 02:50:08 GMT
- Title: On spectral algorithms for community detection in stochastic blockmodel
graphs with vertex covariates
- Authors: Cong Mu, Angelo Mele, Lingxin Hao, Joshua Cape, Avanti Athreya, Carey
E. Priebe
- Abstract summary: We present a comparative analysis of two model-based algorithms for clustering vertices in blockmodel graphs.
We show that the second algorithm has the advantage of revealing the underlying block structure.
Our findings emphasize the importance of distinguishing between observed and unobserved factors that can affect block structure in graphs.
- Score: 6.029067308095145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In network inference applications, it is often desirable to detect community
structure, namely to cluster vertices into groups, or blocks, according to some
measure of similarity. Beyond mere adjacency matrices, many real networks also
involve vertex covariates that carry key information about underlying block
structure in graphs. To assess the effects of such covariates on block
recovery, we present a comparative analysis of two model-based spectral
algorithms for clustering vertices in stochastic blockmodel graphs with vertex
covariates. The first algorithm uses only the adjacency matrix, and directly
estimates the block assignments. The second algorithm incorporates both the
adjacency matrix and the vertex covariates into the estimation of block
assignments, and moreover quantifies the explicit impact of the vertex
covariates on the resulting estimate of the block assignments. We employ
Chernoff information to analytically compare the algorithms' performance and
derive the information-theoretic Chernoff ratio for certain models of interest.
Analytic results and simulations suggest that the second algorithm is often
preferred: we can often better estimate the induced block assignments by first
estimating the effect of vertex covariates. In addition, real data examples
also indicate that the second algorithm has the advantages of revealing
underlying block structure and taking observed vertex heterogeneity into
account in real applications. Our findings emphasize the importance of
distinguishing between observed and unobserved factors that can affect block
structure in graphs.
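The contrast between the two algorithms can be made concrete with a small sketch. The snippet below is illustrative only and is not the authors' estimator: it assumes a single categorical vertex covariate $z$ with a simple additive effect $\beta$ on edge probabilities, uses a generic adjacency spectral embedding (top scaled eigenvectors) followed by $k$-means, and, for the covariate-aware variant, simply subtracts an assumed covariate effect before embedding. The Chernoff comparison mentioned above refers, roughly, to the Chernoff information $C(F_1, F_2) = \sup_{t \in (0,1)} -\log \int f_1^t(x) f_2^{1-t}(x)\,dx$ evaluated between block-conditional limiting distributions of the embedded vertices.

```python
# Illustrative sketch only (not the authors' estimator): two spectral strategies
# for a stochastic blockmodel graph with one categorical vertex covariate z,
# assuming an additive covariate effect beta on edge probabilities.
import numpy as np
from sklearn.cluster import KMeans

def ase(M, d):
    """Adjacency spectral embedding: top-d eigenvectors of a symmetric matrix,
    scaled by the square roots of the corresponding eigenvalue magnitudes."""
    vals, vecs = np.linalg.eigh(M)
    top = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))

def cluster_adjacency_only(A, n_blocks, d):
    """Algorithm-1 style: embed the adjacency matrix and cluster directly.
    This recovers the induced blocks (latent blocks mixed with covariate groups)."""
    return KMeans(n_clusters=n_blocks, n_init=10).fit_predict(ase(A, d))

def cluster_with_covariate(A, z, beta, n_blocks, d):
    """Algorithm-2 style (hypothetical simplification): remove an assumed additive
    covariate effect beta * 1{z_i == z_j} from A, then embed and cluster the residual."""
    same = (z[:, None] == z[None, :]).astype(float)
    np.fill_diagonal(same, 0.0)  # no self-loops in the model
    return KMeans(n_clusters=n_blocks, n_init=10).fit_predict(ase(A - beta * same, d))
```

In practice $\beta$ is unknown and must itself be estimated (quantifying this effect is precisely what the second algorithm adds); the sketch only shows where the covariate enters the pipeline.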
Related papers
- Quantization of Large Language Models with an Overdetermined Basis [73.79368761182998]
We introduce an algorithm for data quantization based on the principles of Kashin representation.
Our findings demonstrate that Kashin Quantization achieves competitive or superior quality in model performance.
arXiv Detail & Related papers (2024-04-15T12:38:46Z) - Interpretable Multi-View Clustering Based on Anchor Graph Tensor Factorization [64.00146569922028]
Multi-view clustering methods based on anchor graph factorization lack adequate cluster interpretability for the decomposed matrix.
We address this limitation by using non-negative tensor factorization to decompose an anchor graph tensor that combines anchor graphs from multiple views.
arXiv Detail & Related papers (2024-04-01T03:23:55Z) - Maximum Likelihood Estimation on Stochastic Blockmodels for Directed Graph Clustering [22.421702511126373]
We formulate clustering as estimating underlying communities in the directed block model.
We introduce two efficient and interpretable directed clustering algorithms, a spectral clustering algorithm and a semidefinite programming based clustering algorithm.
arXiv Detail & Related papers (2024-03-28T15:47:13Z) - Semi-Supervised Clustering of Sparse Graphs: Crossing the
Information-Theoretic Threshold [3.6052935394000234]
The stochastic block model is a canonical random graph model for clustering and community detection on network-structured data.
No estimator based on the network topology can perform substantially better than chance on sparse graphs if the model parameter is below a certain threshold.
We prove that, with an arbitrary fraction of the labels revealed, detection is feasible throughout the parameter domain.
arXiv Detail & Related papers (2022-05-24T00:03:25Z) - Perfect Spectral Clustering with Discrete Covariates [68.8204255655161]
We propose a spectral algorithm that achieves perfect clustering with high probability on a class of large, sparse networks.
Our method is the first to offer a guarantee of consistent latent structure recovery using spectral clustering.
arXiv Detail & Related papers (2022-05-17T01:41:06Z) - Deep Equilibrium Assisted Block Sparse Coding of Inter-dependent
Signals: Application to Hyperspectral Imaging [71.57324258813675]
A dataset of inter-dependent signals is defined as a matrix whose columns demonstrate strong dependencies.
A neural network is employed to act as structure prior and reveal the underlying signal interdependencies.
Deep unrolling and deep equilibrium based algorithms are developed, forming highly interpretable and concise deep-learning-based architectures.
arXiv Detail & Related papers (2022-03-29T21:00:39Z) - Skew-Symmetric Adjacency Matrices for Clustering Directed Graphs [5.301300942803395]
Cut-based directed graph (digraph) clustering often focuses on finding dense within-cluster or sparse between-cluster connections.
In flow-based clusterings, the edges between clusters tend to be oriented in one direction; such flow patterns have been found in migration data, food webs, and trade data (a generic spectral sketch of this idea appears after this list).
arXiv Detail & Related papers (2022-03-02T20:07:04Z) - Riemannian classification of EEG signals with missing values [67.90148548467762]
This paper proposes two strategies to handle missing data for the classification of electroencephalograms.
The first approach estimates the covariance from imputed data with the $k$-nearest neighbors algorithm; the second relies on the observed data by leveraging the observed-data likelihood within an expectation-maximization algorithm.
As the results show, the proposed strategies perform better than classification based on observed data and maintain high accuracy even when the missing-data ratio increases.
arXiv Detail & Related papers (2021-10-19T14:24:50Z) - Robust Regularized Locality Preserving Indexing for Fiedler Vector
Estimation [32.26669925809068]
In real-world applications, the data may be subject to heavy-tailed noise and outliers, which degrade the structure of the Fiedler vector estimate.
We design a Robust Regularized Locality Preserving Indexing (RRLPI) method for Fiedler vector estimation that aims to approximate the nonlinear manifold structure of the Laplace-Beltrami operator.
arXiv Detail & Related papers (2021-07-26T09:49:23Z) - Selective Inference for Latent Block Models [50.83356836818667]
This study provides a selective inference method for latent block models.
We construct a statistical test on a set of row and column cluster memberships of a latent block model.
The proposed exact and approximate tests work effectively compared to the naive test, which does not take the selective bias into account.
arXiv Detail & Related papers (2020-05-27T10:44:19Z) - Robust spectral clustering using LASSO regularization [0.0]
This paper presents a variant of spectral clustering, called $\ell_1$-spectral clustering, performed on a new random model closely related to the stochastic block model.
Its goal is to promote a sparse eigenbasis solution of an $\ell_1$ minimization problem, revealing the natural structure of the graph.
arXiv Detail & Related papers (2020-04-08T07:12:56Z)
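For the flow-based directed clustering idea in the Skew-Symmetric Adjacency Matrices entry above, a generic sketch is given below. It is not that paper's algorithm: it uses one common related construction, the Hermitian matrix $i(A - A^\top)$ built from the skew-symmetric part of the adjacency matrix, whose leading eigenvectors capture net edge orientation between groups.

```python
# Generic sketch (not the cited paper's algorithm): flow-based spectral clustering
# of a directed graph via the Hermitian matrix i * (A - A^T).
import numpy as np
from sklearn.cluster import KMeans

def flow_spectral_clustering(A, n_clusters, d):
    """A: dense n x n adjacency matrix of a digraph, A[i, j] = weight of edge i -> j."""
    H = 1j * (A - A.T)              # Hermitian, since (A - A^T) is skew-symmetric
    vals, vecs = np.linalg.eigh(H)  # real eigenvalues, complex eigenvectors
    top = np.argsort(np.abs(vals))[::-1][:d]
    X = np.hstack([vecs[:, top].real, vecs[:, top].imag])  # real-valued features
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
```

The complex phases of the eigenvectors encode edge orientation, which a symmetrized adjacency matrix $A + A^\top$ would discard; this is what makes directional (flow) structure recoverable.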
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.