Matrix Factorization Framework for Community Detection under the Degree-Corrected Block Model
- URL: http://arxiv.org/abs/2601.06262v1
- Date: Fri, 09 Jan 2026 19:16:29 GMT
- Title: Matrix Factorization Framework for Community Detection under the Degree-Corrected Block Model
- Authors: Alexandra Dache, Arnaud Vandaele, Nicolas Gillis
- Abstract summary: We show that DCBM inference can be reformulated as a constrained nonnegative matrix factorization problem. Our approach is agnostic to any specific network structure and applies to graphs with any structure representable by a DCBM. Experiments on synthetic and real benchmark networks show that our method detects communities comparable to those found by DCBM inference.
- Score: 48.989531198582704
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Community detection is a fundamental task in data analysis. Block models form a standard approach to partition nodes according to a graph model, facilitating the analysis and interpretation of the network structure. By grouping nodes with similar connection patterns, they enable the identification of a wide variety of underlying structures. The degree-corrected block model (DCBM) is an established model that accounts for the heterogeneity of node degrees. However, existing inference methods for the DCBM are heuristics that are highly sensitive to initialization, typically done randomly. In this work, we show that DCBM inference can be reformulated as a constrained nonnegative matrix factorization problem. Leveraging this insight, we propose a novel method for community detection and a theoretically well-grounded initialization strategy that provides an initial estimate of communities for inference algorithms. Our approach is agnostic to any specific network structure and applies to graphs with any structure representable by a DCBM, not only assortative ones. Experiments on synthetic and real benchmark networks show that our method detects communities comparable to those found by DCBM inference, while scaling linearly with the number of edges and communities; for instance, it processes a graph with 100,000 nodes and 2,000,000 edges in approximately 4 minutes. Moreover, the proposed initialization strategy significantly improves solution quality and reduces the number of iterations required by all tested inference algorithms. Overall, this work provides a scalable and robust framework for community detection and highlights the benefits of a matrix-factorization perspective for the DCBM.
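As a rough illustration of why a matrix-factorization view is natural here, the standard DCBM gives the expected adjacency matrix a low-rank nonnegative structure. The notation below ($\theta$, $g$, $Z$, $\Omega$, $W$) is assumed for this sketch and is not taken from the paper, whose exact constraints and objective may differ.

```latex
% Assumed notation (not from the paper):
%   \theta_i \ge 0        degree parameter of node i, \Theta = \mathrm{diag}(\theta)
%   g_i \in \{1,\dots,K\}  community of node i
%   Z \in \{0,1\}^{n \times K}, \; Z_{ik} = 1 \iff g_i = k  (one nonzero per row)
%   \Omega \in \mathbb{R}_{\ge 0}^{K \times K}  block connectivity matrix
\mathbb{E}[A_{ij}] = \theta_i \, \theta_j \, \Omega_{g_i g_j}
\quad\Longleftrightarrow\quad
\mathbb{E}[A] = \Theta Z \, \Omega \, Z^{\top} \Theta = W \, \Omega \, W^{\top},
\qquad W = \Theta Z \ge 0 .
```

Under these assumptions, each row of $W$ has a single nonzero entry, so recovering the communities amounts to fitting a nonnegative factorization $A \approx W \Omega W^{\top}$ subject to that row-sparsity constraint.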
Related papers
- Variational Bayesian Flow Network for Graph Generation [54.94088904387278]
We propose Variational Bayesian Flow Network (VBFN) for graph generation. VBFN performs variational lifting to a tractable joint Gaussian variational belief family governed by structured precisions. On synthetic and molecular graph datasets, VBFN improves fidelity and diversity, and surpasses baseline methods.
arXiv Detail & Related papers (2026-01-30T03:59:38Z) - Model-free algorithms for fast node clustering in SBM type graphs and application to social role inference in animals [26.41190755089919]
We propose a novel family of model-free algorithms for node clustering and parameter inference in graphs generated from the Stochastic Block Model (SBM). We benchmark our methods against state-of-the-art techniques, demonstrating significantly faster computation times with a lower order of estimation error. We validate the practical relevance of our algorithms by applying them to empirical network data from behavioral ecology.
arXiv Detail & Related papers (2025-09-19T13:57:17Z) - Learning Discrete Bayesian Networks with Hierarchical Dirichlet Shrinkage [52.914168158222765]
We detail a comprehensive Bayesian framework for learning DBNs. We give a novel Markov chain Monte Carlo (MCMC) algorithm utilizing parallel Langevin proposals to generate exact posterior samples. We apply our methodology to uncover prognostic network structure from primary breast cancer samples.
arXiv Detail & Related papers (2025-09-16T17:24:35Z) - Simultaneous estimation of connectivity and dimensionality in samples of networks [1.8874301050354771]
This paper proposes a method to simultaneously estimate a latent matrix of connectivity probabilities and its embedding dimensionality or rank. Numerical studies empirically demonstrate the accuracy of our method across various scenarios.
arXiv Detail & Related papers (2025-08-17T19:52:08Z) - Ensemble Quadratic Assignment Network for Graph Matching [52.20001802006391]
Graph matching is a commonly used technique in computer vision and pattern recognition.
Recent data-driven approaches have improved the graph matching accuracy remarkably.
We propose a graph neural network (GNN) based approach to combine the advantages of data-driven and traditional methods.
arXiv Detail & Related papers (2024-03-11T06:34:05Z) - Implicit models, latent compression, intrinsic biases, and cheap lunches in community detection [0.0]
Community detection aims to partition a network into clusters of nodes to summarize its large-scale structure.
Some community detection methods are inferential, explicitly deriving the clustering objective through a probabilistic generative model.
Other methods are descriptive, dividing a network according to an objective motivated by a particular application.
We present a solution that associates any community detection objective, inferential or descriptive, with its corresponding implicit network generative model.
arXiv Detail & Related papers (2022-10-17T15:38:41Z) - Improving Metric Dimensionality Reduction with Distributed Topology [68.8204255655161]
DIPOLE is a dimensionality-reduction post-processing step that corrects an initial embedding by minimizing a loss functional with both a local, metric term and a global, topological term.
We observe that DIPOLE outperforms popular methods like UMAP, t-SNE, and Isomap on a number of popular datasets.
arXiv Detail & Related papers (2021-06-14T17:19:44Z) - Joint Network Topology Inference via Structured Fusion Regularization [70.30364652829164]
Joint network topology inference represents a canonical problem of learning multiple graph Laplacian matrices from heterogeneous graph signals.
We propose a general graph estimator based on a novel structured fusion regularization.
We show that the proposed graph estimator enjoys both high computational efficiency and rigorous theoretical guarantee.
arXiv Detail & Related papers (2021-03-05T04:42:32Z) - Amortized Probabilistic Detection of Communities in Graphs [39.56798207634738]
We propose a simple framework for amortized community detection.
We combine the expressive power of GNNs with recent methods for amortized clustering.
We evaluate several models from our framework on synthetic and real datasets.
arXiv Detail & Related papers (2020-10-29T16:18:48Z) - Extended Stochastic Block Models with Application to Criminal Networks [3.2211782521637393]
We study covert networks that encode relationships among criminals.
The coexistence of noisy block patterns limits the reliability of routinely-used community detection algorithms.
We develop a new class of extended block models (ESBM) that infer groups of nodes having common connectivity patterns.
arXiv Detail & Related papers (2020-07-16T19:06:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.