Partial recovery and weak consistency in the non-uniform hypergraph Stochastic Block Model
- URL: http://arxiv.org/abs/2112.11671v3
- Date: Wed, 24 Apr 2024 21:55:34 GMT
- Title: Partial recovery and weak consistency in the non-uniform hypergraph Stochastic Block Model
- Authors: Ioana Dumitriu, Haixiao Wang, Yizhe Zhu,
- Abstract summary: We consider the community detection problem in random hypergraphs under the nonuniform hypergraph block model (HSBM)
We provide a spectral algorithm that outputs a partition with at least a $gamma$ fraction of the vertices classified correctly, where $gammain depends on the signal-to-noise ratio (SNR) of the model.
The theoretical analysis of our algorithm relies on the concentration and regularization of the adjacency matrix for non-uniform random hypergraphs, which can be of independent interest.
- Score: 6.681901523019242
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the community detection problem in sparse random hypergraphs under the non-uniform hypergraph stochastic block model (HSBM), a general model of random networks with community structure and higher-order interactions. When the random hypergraph has bounded expected degrees, we provide a spectral algorithm that outputs a partition with at least a $\gamma$ fraction of the vertices classified correctly, where $\gamma\in (0.5,1)$ depends on the signal-to-noise ratio (SNR) of the model. When the SNR grows slowly as the number of vertices goes to infinity, our algorithm achieves weak consistency, which improves the previous results in Ghoshdastidar and Dukkipati (2017) for non-uniform HSBMs. Our spectral algorithm consists of three major steps: (1) Hyperedge selection: select hyperedges of certain sizes to provide the maximal signal-to-noise ratio for the induced sub-hypergraph; (2) Spectral partition: construct a regularized adjacency matrix and obtain an approximate partition based on singular vectors; (3) Correction and merging: incorporate the hyperedge information from adjacency tensors to upgrade the error rate guarantee. The theoretical analysis of our algorithm relies on the concentration and regularization of the adjacency matrix for sparse non-uniform random hypergraphs, which can be of independent interest.
Related papers
- Strong consistency and optimality of spectral clustering in symmetric
binary non-uniform Hypergraph Stochastic Block Model [0.0]
We study the unsupervised classification problem in random hypergraphs under the non-uniform emphHypergraph Block Model (HSBM)
We find that strong consistency is achievable by aggregating information from all uniform layers, even if it is impossible when each layer is considered alone.
arXiv Detail & Related papers (2023-06-12T03:38:25Z) - Optimal and exact recovery on general non-uniform Hypergraph Stochastic Block Model [0.0]
We consider the community detection problem in random hypergraphs under the non-uniform hypergraphinger block model (HSBM)
We establish, for the first time in the literature, a sharp threshold for exact recovery under this non-uniform case, subject to minor constraints.
We provide two efficient algorithms which successfully achieve exact recovery when above the threshold, and attain the lowest possible ratio when the exact recovery is impossible.
arXiv Detail & Related papers (2023-04-25T20:30:33Z) - Semi-Supervised Subspace Clustering via Tensor Low-Rank Representation [64.49871502193477]
We propose a novel semi-supervised subspace clustering method, which is able to simultaneously augment the initial supervisory information and construct a discriminative affinity matrix.
Comprehensive experimental results on six commonly-used benchmark datasets demonstrate the superiority of our method over state-of-the-art methods.
arXiv Detail & Related papers (2022-05-21T01:47:17Z) - Sparse random hypergraphs: Non-backtracking spectra and community
detection [10.503525445174464]
We prove that a spectral method based on the non-backtracking operator for hypergraphs works with high probability down to the generalized Kesten-Stigum detection threshold conjectured by Angelini et al.
This is the first provable and efficient algorithm that achieves the conjectured threshold for HSBMs with $r$ blocks generated according to a general symmetric probability tensor.
arXiv Detail & Related papers (2022-03-14T17:45:03Z) - Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector
Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the emphStochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
arXiv Detail & Related papers (2021-12-29T18:46:52Z) - Balanced Coarsening for Multilevel Hypergraph Partitioning via
Wasserstein Discrepancy [40.93626211612559]
We propose a balanced coarsening scheme for multilevel hypergraph.
An initial partitioning algorithm is designed to improve the quality of k-way hypergraph.
arXiv Detail & Related papers (2021-06-14T15:30:34Z) - High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with dependence on confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z) - Exact Recovery in the General Hypergraph Stochastic Block Model [92.28929858529679]
This paper investigates fundamental limits of exact recovery in the general d-uniform hypergraph block model (d-HSBM)
We show that there exists a sharp threshold such that exact recovery is achievable above the threshold and impossible below it.
arXiv Detail & Related papers (2021-05-11T03:39:08Z) - Spectral clustering under degree heterogeneity: a case for the random
walk Laplacian [83.79286663107845]
This paper shows that graph spectral embedding using the random walk Laplacian produces vector representations which are completely corrected for node degree.
In the special case of a degree-corrected block model, the embedding concentrates about K distinct points, representing communities.
arXiv Detail & Related papers (2021-05-03T16:36:27Z) - ROOT-SGD: Sharp Nonasymptotics and Near-Optimal Asymptotics in a Single Algorithm [71.13558000599839]
We study the problem of solving strongly convex and smooth unconstrained optimization problems using first-order algorithms.
We devise a novel, referred to as Recursive One-Over-T SGD, based on an easily implementable, averaging of past gradients.
We prove that it simultaneously achieves state-of-the-art performance in both a finite-sample, nonasymptotic sense and an sense.
arXiv Detail & Related papers (2020-08-28T14:46:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.