Strong consistency and optimality of spectral clustering in symmetric
binary non-uniform Hypergraph Stochastic Block Model
- URL: http://arxiv.org/abs/2306.06845v1
- Date: Mon, 12 Jun 2023 03:38:25 GMT
- Title: Strong consistency and optimality of spectral clustering in symmetric
binary non-uniform Hypergraph Stochastic Block Model
- Authors: Haixiao Wang
- Abstract summary: We study the unsupervised classification problem in random hypergraphs under the non-uniform emphHypergraph Block Model (HSBM)
We find that strong consistency is achievable by aggregating information from all uniform layers, even if it is impossible when each layer is considered alone.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Consider the unsupervised classification problem in random hypergraphs under
the non-uniform \emph{Hypergraph Stochastic Block Model} (HSBM) with two
equal-sized communities ($n/2$), where each edge appears independently with
some probability depending only on the labels of its vertices. In this paper,
an \emph{information-theoretical} threshold for strong consistency is
established. Below the threshold, every algorithm would misclassify at least
two vertices with high probability, and the expected \emph{mismatch ratio} of
the eigenvector estimator is upper bounded by $n$ to the power of minus the
threshold. On the other hand, when above the threshold, despite the information
loss induced by tensor contraction, one-stage spectral algorithms assign every
vertex correctly with high probability when only given the contracted adjacency
matrix, even if \emph{semidefinite programming} (SDP) fails in some scenarios.
Moreover, strong consistency is achievable by aggregating information from all
uniform layers, even if it is impossible when each layer is considered alone.
Our conclusions are supported by both theoretical analysis and numerical
experiments.
Related papers
- One-step Bipartite Graph Cut: A Normalized Formulation and Its
Application to Scalable Subspace Clustering [56.81492360414741]
We show how to enforce a one-step normalized cut for bipartite graphs, especially with linear-time complexity.
In this paper, we first characterize a novel one-step bipartite graph cut criterion with normalized constraints, and theoretically prove its equivalence to a trace problem.
We extend this cut criterion to a scalable subspace clustering approach, where adaptive anchor learning, bipartite graph learning, and one-step normalized bipartite graph partitioning are simultaneously modeled.
arXiv Detail & Related papers (2023-05-12T11:27:20Z) - Optimal and exact recovery on general non-uniform Hypergraph Stochastic Block Model [0.0]
We consider the community detection problem in random hypergraphs under the non-uniform hypergraphinger block model (HSBM)
We establish, for the first time in the literature, a sharp threshold for exact recovery under this non-uniform case, subject to minor constraints.
We provide two efficient algorithms which successfully achieve exact recovery when above the threshold, and attain the lowest possible ratio when the exact recovery is impossible.
arXiv Detail & Related papers (2023-04-25T20:30:33Z) - Semi-Supervised Clustering of Sparse Graphs: Crossing the
Information-Theoretic Threshold [3.6052935394000234]
Block model is a canonical random graph model for clustering and community detection on network-structured data.
No estimator based on the network topology can perform substantially better than chance on sparse graphs if the model parameter is below a certain threshold.
We prove that with an arbitrary fraction of the labels feasible throughout the parameter domain.
arXiv Detail & Related papers (2022-05-24T00:03:25Z) - Semi-Supervised Subspace Clustering via Tensor Low-Rank Representation [64.49871502193477]
We propose a novel semi-supervised subspace clustering method, which is able to simultaneously augment the initial supervisory information and construct a discriminative affinity matrix.
Comprehensive experimental results on six commonly-used benchmark datasets demonstrate the superiority of our method over state-of-the-art methods.
arXiv Detail & Related papers (2022-05-21T01:47:17Z) - Geometric Graph Representation Learning via Maximizing Rate Reduction [73.6044873825311]
Learning node representations benefits various downstream tasks in graph analysis such as community detection and node classification.
We propose Geometric Graph Representation Learning (G2R) to learn node representations in an unsupervised manner.
G2R maps nodes in distinct groups into different subspaces, while each subspace is compact and different subspaces are dispersed.
arXiv Detail & Related papers (2022-02-13T07:46:24Z) - Partial recovery and weak consistency in the non-uniform hypergraph Stochastic Block Model [6.681901523019242]
We consider the community detection problem in random hypergraphs under the nonuniform hypergraph block model (HSBM)
We provide a spectral algorithm that outputs a partition with at least a $gamma$ fraction of the vertices classified correctly, where $gammain depends on the signal-to-noise ratio (SNR) of the model.
The theoretical analysis of our algorithm relies on the concentration and regularization of the adjacency matrix for non-uniform random hypergraphs, which can be of independent interest.
arXiv Detail & Related papers (2021-12-22T05:38:33Z) - Exact Recovery in the General Hypergraph Stochastic Block Model [92.28929858529679]
This paper investigates fundamental limits of exact recovery in the general d-uniform hypergraph block model (d-HSBM)
We show that there exists a sharp threshold such that exact recovery is achievable above the threshold and impossible below it.
arXiv Detail & Related papers (2021-05-11T03:39:08Z) - Spectral clustering under degree heterogeneity: a case for the random
walk Laplacian [83.79286663107845]
This paper shows that graph spectral embedding using the random walk Laplacian produces vector representations which are completely corrected for node degree.
In the special case of a degree-corrected block model, the embedding concentrates about K distinct points, representing communities.
arXiv Detail & Related papers (2021-05-03T16:36:27Z) - Agnostic Learning of Halfspaces with Gradient Descent via Soft Margins [92.7662890047311]
We show that gradient descent finds halfspaces with classification error $tilde O(mathsfOPT1/2) + varepsilon$ in $mathrmpoly(d,1/varepsilon)$ time and sample complexity.
arXiv Detail & Related papers (2020-10-01T16:48:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.