Phase Transition for Stochastic Block Model with more than $\sqrt{n}$ Communities
- URL: http://arxiv.org/abs/2509.15822v2
- Date: Thu, 06 Nov 2025 20:04:05 GMT
- Title: Phase Transition for Stochastic Block Model with more than $\sqrt{n}$ Communities
- Authors: Alexandra Carpentier, Christophe Giraud, Nicolas Verzelen,
- Abstract summary: Predictions from statistical physics postulate that recovery of the communities in Block Model (SBM) is possible in time above, and only above, the KestenStigum (KS) threshold.<n>Chin et al.(2025) recently prove that, in a sparse regime, community recovery in time is possible below the KS threshold by counting nonbacktracking paths.
- Score: 51.320599504997745
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Predictions from statistical physics postulate that recovery of the communities in Stochastic Block Model (SBM) is possible in polynomial time above, and only above, the Kesten-Stigum (KS) threshold. This conjecture has given rise to a rich literature, proving that non-trivial community recovery is indeed possible in SBM above the KS threshold, as long as the number $K$ of communities remains smaller than $\sqrt{n}$, where $n$ is the number of nodes in the observed graph. Failure of low-degree polynomials below the KS threshold was also proven when $K=o(\sqrt{n})$. When $K\geq \sqrt{n}$, Chin et al.(2025) recently prove that, in a sparse regime, community recovery in polynomial time is possible below the KS threshold by counting non-backtracking paths. This breakthrough result lead them to postulate a new threshold for the many communities regime $K\geq \sqrt{n}$. In this work, we provide evidences that confirm their conjecture for $K\geq \sqrt{n}$: 1- We prove that, for any density of the graph, low-degree polynomials fail to recover communities below the threshold postulated by Chin et al.(2025); 2- We prove that community recovery is possible in polynomial time above the postulated threshold, not only in the sparse regime of~Chin et al., but also in some (but not all) moderately sparse regimes by essentially by counting occurrences of cliques or self-avoiding paths of suitable size in the observed graph. In addition, we propose a detailed conjecture regarding the structure of motifs that are optimal in sparsity regimes not covered by cliques or self-avoiding paths counting. In particular, counting self-avoiding paths of length $\log(n)$--which is closely related to spectral algorithms based on the Non-Backtracking operator--is optimal only in the sparse regime. Other motif counts--unrelated to spectral properties--should be considered in denser regimes.
Related papers
- Fundamental Limits of Community Detection in Contextual Multi-Layer Stochastic Block Models [4.312207944236305]
We consider the problem of community detection from the joint observation of a high-dimensional covariate matrix and $L$ sparse networks.<n>In the sparse regime where the networks have constant average degree and the number of features $p$ grows proportionally with $n$, we derive a sharp threshold under which detecting and estimating the subject labels is possible.<n>Our results show that there is no statistical-computational gap in this setting.
arXiv Detail & Related papers (2026-02-09T00:27:54Z) - Almost Asymptotically Optimal Active Clustering Through Pairwise Observations [59.20614082241528]
We propose a new analysis framework for clustering $M$ items into an unknown number of $K$ distinct groups using noisy and actively collected responses.<n>We establish a fundamental lower bound on the expected number of queries needed to achieve a desired confidence in the accuracy of the clustering.<n>We develop a computationally feasible variant of the Generalized Likelihood Ratio statistic and show that its performance gap to the lower bound can be accurately empirically estimated.
arXiv Detail & Related papers (2026-02-05T14:16:47Z) - Phase Transition for Stochastic Block Model with more than $\sqrt{n}$ Communities (II) [51.320599504997745]
We show that when the number $K$ of communities remains smaller than $sqrtn$, non-trivial community recovery is possible in time above the Kesten--Stigum threshold.<n>We also show that, in moderately sparse settings, the optimal algorithms appear to be fundamentally different from spectral methods.
arXiv Detail & Related papers (2025-11-26T15:54:17Z) - Near-Optimal Clustering in Mixture of Markov Chains [74.3828414695655]
We study the problem of clustering $T$ trajectories of length $H$, each generated by one of $K$ unknown ergodic Markov chains over a finite state space of size $S$.<n>We derive an instance-dependent, high-probability lower bound on the clustering error rate, governed by the weighted KL divergence between the transition kernels of the chains.<n>We then present a novel two-stage clustering algorithm.
arXiv Detail & Related papers (2025-06-02T05:10:40Z) - Low degree conjecture implies sharp computational thresholds in stochastic block model [3.7873597471903935]
We investigate implications of the (extended) low-degree conjecture (recently formalized in [MW23] in the context of the symmetric block model)<n>We establish that no-time algorithm can weakly recover community labels below the Kesten-Stigum (KS) threshold.<n>Our results provide evidence of a computational-to-statistical gap in learning the parameters of block models.
arXiv Detail & Related papers (2025-02-20T20:21:03Z) - Harnessing Multiple Correlated Networks for Exact Community Recovery [6.092631511453874]
We study the problem of learning latent community structure from multiple correlated networks.<n>Recent work of Gaudio, R'acz, and Sridhar (COLT 2022) determined the precise information-theoretic threshold for exact community recovery.
arXiv Detail & Related papers (2024-12-03T19:56:52Z) - Community Detection in the Multi-View Stochastic Block Model [47.43048484980115]
This paper considers the problem of community detection on multiple potentially correlated graphs from antheoretical perspective.
We first put forth a random graph model, called the multi-view information block model (MVSBM), designed to generate correlated graphs on the same set of nodes.
arXiv Detail & Related papers (2024-01-17T13:39:38Z) - Lower Bounds for Learning in Revealing POMDPs [88.23337313766355]
This paper studies the fundamental limits of reinforcement learning (RL) in the challenging emphpartially observable setting.
For emphmulti-step revealing POMDPs, we show that the latent state-space dependence is at least $Omega(S1.5)$ in the sample complexity.
arXiv Detail & Related papers (2023-02-02T18:59:30Z) - Exact Recovery in the General Hypergraph Stochastic Block Model [92.28929858529679]
This paper investigates fundamental limits of exact recovery in the general d-uniform hypergraph block model (d-HSBM)
We show that there exists a sharp threshold such that exact recovery is achievable above the threshold and impossible below it.
arXiv Detail & Related papers (2021-05-11T03:39:08Z) - Free Energy Wells and Overlap Gap Property in Sparse PCA [81.64027805404483]
We study a variant of the sparse PCA (principal component analysis) problem in the "hard" regime.
We show bounds on the depth of free energy wells for various Gibbs measures naturally associated to the problem.
We prove that the Overlap Gap Property (OGP) holds in a significant part of the hard regime.
arXiv Detail & Related papers (2020-06-18T17:18:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.