Optimal Graph Clustering without Edge Density Signals
- URL: http://arxiv.org/abs/2510.21669v1
- Date: Fri, 24 Oct 2025 17:24:26 GMT
- Title: Optimal Graph Clustering without Edge Density Signals
- Authors: Maximilien Dreveton, Elaine Siyu Liu, Matthias Grossglauser, Patrick Thiran
- Abstract summary: This paper establishes the theoretical limits of graph clustering under the Popularity-Adjusted Block Model (PABM). Our main contribution is the characterization of the optimal error rate for clustering under PABM. Our numerical experiments on both synthetic and real datasets confirm that spectral clustering algorithms incorporating $k^2$ eigenvectors outperform traditional spectral approaches.
- Score: 11.198418635553976
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper establishes the theoretical limits of graph clustering under the Popularity-Adjusted Block Model (PABM), addressing limitations of existing models. In contrast to the Stochastic Block Model (SBM), which assumes uniform vertex degrees, and to the Degree-Corrected Block Model (DCBM), which applies uniform degree corrections across clusters, PABM introduces separate popularity parameters for intra- and inter-cluster connections. Our main contribution is the characterization of the optimal error rate for clustering under PABM, which provides novel insights on clustering hardness: we demonstrate that unlike SBM and DCBM, cluster recovery remains possible in PABM even when traditional edge-density signals vanish, provided intra- and inter-cluster popularity coefficients differ. This highlights a dimension of degree heterogeneity captured by PABM but overlooked by DCBM: local differences in connectivity patterns can enhance cluster separability independently of global edge densities. Finally, because PABM exhibits a richer structure, its expected adjacency matrix has rank between $k$ and $k^2$, where $k$ is the number of clusters. As a result, spectral embeddings based on the top $k$ eigenvectors may fail to capture important structural information. Our numerical experiments on both synthetic and real datasets confirm that spectral clustering algorithms incorporating $k^2$ eigenvectors outperform traditional spectral approaches.
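As a rough illustration of the abstract's closing point, here is a minimal sketch of spectral clustering that embeds with up to $k^2$ eigenvectors instead of the usual $k$. The function name, the magnitude-based eigenvector selection, and the final $k$-means rounding step are our assumptions, not necessarily the paper's exact algorithm.

```python
import numpy as np
from scipy.sparse.linalg import eigsh
from sklearn.cluster import KMeans

def spectral_clustering_k2(A, k, seed=0):
    # Under PABM, E[A] can have rank anywhere between k and k^2, so an
    # embedding with only k eigenvectors may discard cluster structure.
    n_vec = min(k * k, A.shape[0] - 1)
    # which="LM": eigenvalues of largest magnitude, since informative
    # eigenvalues of E[A] may be negative.
    _, vecs = eigsh(A.astype(float), k=n_vec, which="LM")
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(vecs)
```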
Related papers
- Universal NP-Hardness of Clustering under General Utilities [11.62669179647184]
We formalise the common optimisation core of clustering as maximising a computable partition utility over a finite metric space (the UCP framework). By mapping ten major paradigms -- including k-means, GMMs, DBSCAN, spectral clustering, and affinity propagation -- to the UCP framework, we demonstrate that each inherits this fundamental NP-hardness.
arXiv Detail & Related papers (2026-02-27T13:08:15Z)
- Graph-based Clustering Revisited: A Relaxation of Kernel $k$-Means Perspective [73.18641268511318]
We propose a graph-based clustering algorithm that relaxes only the orthonormal constraint to derive clustering results. To make the remaining constraints amenable to gradient-based optimisation, we transform the non-negative constraint into a class-probability parameterisation.
arXiv Detail & Related papers (2025-09-23T09:14:39Z)
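For background on what is being relaxed, the classical spectral relaxation of kernel $k$-means replaces the discrete indicator matrix by any matrix with orthonormal columns, whose optimum is spanned by the top-$k$ eigenvectors of the kernel matrix. A minimal sketch of that baseline relaxation follows; the paper's method additionally handles non-negativity through a class-probability parameterisation, which this sketch omits.

```python
import numpy as np
from sklearn.cluster import KMeans

def relaxed_kernel_kmeans(K, k, seed=0):
    # Keep only the orthonormality constraint: the relaxed optimum is the
    # span of the top-k eigenvectors of the (PSD) kernel matrix K.
    _, vecs = np.linalg.eigh(K)      # eigenvalues in ascending order
    H = vecs[:, -k:]                 # top-k eigenvectors
    # Round the continuous solution back to discrete labels with k-means.
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(H)
```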
- MADCluster: Model-agnostic Anomaly Detection with Self-supervised Clustering Network [0.7373617024876725]
We propose MADCluster, a model-agnostic anomaly detection framework utilizing self-supervised clustering. The core idea is to cluster normal-pattern data into a 'single cluster' while simultaneously learning the cluster center and mapping data close to this center. Experiments on four time series benchmark datasets demonstrate that applying MADCluster improves the overall performance of comparative models.
arXiv Detail & Related papers (2025-05-22T04:50:44Z)
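The 'single cluster' objective is reminiscent of one-class losses such as Deep SVDD; a minimal PyTorch sketch of a center-pulling loss of that kind is below. The names and exact form are our assumptions; MADCluster's actual loss may differ.

```python
import torch

def single_cluster_loss(z: torch.Tensor, center: torch.Tensor) -> torch.Tensor:
    # Pull embeddings of normal data toward a single learnable center;
    # at test time, distance to the center can serve as the anomaly score.
    return ((z - center) ** 2).sum(dim=1).mean()

# Usage sketch: make the center learnable alongside the encoder, e.g.
# center = torch.nn.Parameter(torch.zeros(embed_dim))
```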
- Sharper Error Bounds in Late Fusion Multi-view Clustering Using Eigenvalue Proportion [19.46433323866636]
Multi-view clustering (MVC) aims to integrate complementary information from multiple views to enhance clustering performance. We present a novel theoretical framework for analyzing the generalization error bounds of multiple kernel $k$-means. We propose a low-pass graph filtering strategy within a multiple linear $k$-means framework to mitigate noise and redundancy.
arXiv Detail & Related papers (2024-12-24T06:24:08Z)
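As context, a typical low-pass graph filter smooths node features with powers of the normalized adjacency matrix, attenuating high-frequency (noisy) components before clustering. The sketch below shows one standard variant, not necessarily the paper's exact filter.

```python
import numpy as np

def low_pass_filter(A, X, t=2):
    # Symmetric normalization: A_norm = D^{-1/2} A D^{-1/2}.
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    A_norm = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # H = I - L/2 = (I + A_norm) / 2 acts as a low-pass graph filter;
    # applying it t times smooths the feature matrix X over the graph.
    H = (np.eye(A.shape[0]) + A_norm) / 2.0
    Xf = X.copy()
    for _ in range(t):
        Xf = H @ Xf
    return Xf
```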
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
The one-step K-means dimensionality-reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in a self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
- Spectral Clustering for Directed Graphs via Likelihood Estimation on Stochastic Block Models [22.421702511126373]
We leverage statistical inference on block models to guide the development of a spectral clustering algorithm for directed graphs. We establish a theoretical upper bound on the misclustering error of its spectral relaxation, and based on this relaxation, introduce a novel, self-adaptive spectral clustering method for directed graphs.
arXiv Detail & Related papers (2024-03-28T15:47:13Z)
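One generic way to make spectral clustering respect edge direction is to embed each node by both its left (sending) and right (receiving) singular vectors of the asymmetric adjacency matrix. The co-clustering-style sketch below is illustrative background, not the paper's likelihood-guided, self-adaptive method.

```python
import numpy as np
from scipy.sparse.linalg import svds
from sklearn.cluster import KMeans

def directed_spectral_clustering(A, k, seed=0):
    # Top-k singular triplets of the (possibly asymmetric) adjacency matrix.
    U, _, Vt = svds(A.astype(float), k=k)
    # Concatenate sending (left) and receiving (right) embeddings per node.
    emb = np.hstack([U, Vt.T])
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(emb)
```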
- Revisiting Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [85.51611950757643]
We propose IAC (Instance-Adaptive Clustering), the first algorithm whose performance matches the instance-specific lower bounds both in expectation and with high probability. IAC maintains an overall computational complexity of $\mathcal{O}(n\,\mathrm{polylog}(n))$, making it scalable and practical for large-scale problems.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- Multi-View Clustering via Semi-non-negative Tensor Factorization [120.87318230985653]
We develop a novel multi-view clustering method based on semi-non-negative tensor factorization (Semi-NTF).
Our model directly considers the between-view relationship and exploits the between-view complementary information.
In addition, we provide an optimization algorithm for the proposed method and prove mathematically that the algorithm always converges to the stationary KKT point.
arXiv Detail & Related papers (2023-03-29T14:54:19Z)
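Semi-NTF extends matrix semi-non-negative factorization, where $X \approx FG^\top$ with $G \ge 0$ and $F$ unconstrained, to tensors. As background, here is a sketch of the classical multiplicative updates of Ding, Li & Jordan (2010) for the matrix case; the paper's tensor algorithm and its KKT convergence analysis go beyond this.

```python
import numpy as np

def semi_nmf(X, k, n_iter=200, eps=1e-9, seed=0):
    # Matrix semi-NMF: X (d x n) ~ F (d x k) @ G.T (k x n), with G >= 0.
    rng = np.random.default_rng(seed)
    G = np.abs(rng.standard_normal((X.shape[1], k)))
    pos = lambda M: (np.abs(M) + M) / 2.0   # positive part of a matrix
    neg = lambda M: (np.abs(M) - M) / 2.0   # negative part of a matrix
    for _ in range(n_iter):
        F = X @ G @ np.linalg.pinv(G.T @ G)  # unconstrained least-squares factor
        XtF, FtF = X.T @ F, F.T @ F
        # Multiplicative update keeps G entrywise non-negative.
        G *= np.sqrt((pos(XtF) + G @ neg(FtF)) / (neg(XtF) + G @ pos(FtF) + eps))
    return F, G
```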
- Likelihood Adjusted Semidefinite Programs for Clustering Heterogeneous Data [16.153709556346417]
Clustering is a widely deployed learning tool.
iLA-SDP is less sensitive than EM to initialization and more stable on high-dimensional data.
arXiv Detail & Related papers (2022-09-29T21:03:13Z)
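For reference, the standard SDP relaxation of $k$-means (the Peng-Wei program) maximizes $\langle A, X\rangle$ over relaxed cluster-membership Gram matrices; a minimal cvxpy sketch is below. iLA-SDP adjusts this program with a likelihood term, which the plain sketch does not include.

```python
import cvxpy as cp
import numpy as np

def sdp_kmeans(A: np.ndarray, k: int) -> np.ndarray:
    # Peng-Wei SDP relaxation over an n x n affinity matrix A.
    n = A.shape[0]
    X = cp.Variable((n, n), PSD=True)
    constraints = [
        X >= 0,                  # entrywise non-negative
        cp.sum(X, axis=1) == 1,  # each row sums to one
        cp.trace(X) == k,        # trace equals the number of clusters
    ]
    cp.Problem(cp.Maximize(cp.trace(A @ X)), constraints).solve()
    # Round X.value to labels, e.g. k-means on its top-k eigenvectors.
    return X.value
```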
- Local Sample-weighted Multiple Kernel Clustering with Consensus Discriminative Graph [73.68184322526338]
Multiple kernel clustering (MKC) aims to achieve optimal information fusion from a set of base kernels.
This paper proposes a novel local sample-weighted multiple kernel clustering model.
Experimental results demonstrate that our LSWMKC possesses better local manifold representation and outperforms existing kernel- or graph-based clustering algorithms.
arXiv Detail & Related papers (2022-07-05T05:00:38Z)
- Deep Attention-guided Graph Clustering with Dual Self-supervision [49.040136530379094]
We propose a novel method, namely deep attention-guided graph clustering with dual self-supervision (DAGC).
We develop a dual self-supervision solution consisting of a soft self-supervision strategy with a triplet Kullback-Leibler divergence loss and a hard self-supervision strategy with a pseudo supervision loss.
Our method consistently outperforms state-of-the-art methods on six benchmark datasets.
arXiv Detail & Related papers (2021-11-10T06:53:03Z)
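The soft strategy is in the spirit of DEC-style self-supervision: Student's-t soft assignments are pulled toward a sharpened target distribution with a KL loss, while the hard strategy trains against argmax pseudo-labels. The PyTorch sketch below shows these two generic building blocks, not DAGC's triplet KL variant or its attention mechanism.

```python
import torch
import torch.nn.functional as F

def soft_assign(z, centers, alpha=1.0):
    # Student's t-kernel similarity between embeddings and cluster centers.
    d2 = torch.cdist(z, centers) ** 2
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(dim=1, keepdim=True)

def target_distribution(q):
    # Sharpen q: square it and renormalize by per-cluster frequency.
    w = q ** 2 / q.sum(dim=0)
    return w / w.sum(dim=1, keepdim=True)

def soft_loss(q, p):
    # Soft self-supervision: KL(P || Q) pulls assignments toward targets.
    return F.kl_div(q.log(), p, reduction="batchmean")

def hard_loss(q):
    # Hard self-supervision: NLL against the argmax pseudo-labels.
    return F.nll_loss(q.log(), q.argmax(dim=1))
```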
- A class of network models recoverable by spectral clustering [11.635152494912003]
We show that essentially the same algorithm used for the Stochastic Block Model (SBM) works on a wider class of block models.
We clearly introduce the free parameters needed to specify this class of models, and derive bounds that expose more clearly which parameters control the recovery error in this model class.
arXiv Detail & Related papers (2021-04-21T04:22:18Z)