Exact Recovery in the General Hypergraph Stochastic Block Model
- URL: http://arxiv.org/abs/2105.04770v1
- Date: Tue, 11 May 2021 03:39:08 GMT
- Title: Exact Recovery in the General Hypergraph Stochastic Block Model
- Authors: Qiaosheng Zhang, Vincent Y. F. Tan
- Abstract summary: This paper investigates fundamental limits of exact recovery in the general d-uniform hypergraph block model (d-HSBM)
We show that there exists a sharp threshold such that exact recovery is achievable above the threshold and impossible below it.
- Score: 92.28929858529679
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: This paper investigates fundamental limits of exact recovery in the general
d-uniform hypergraph stochastic block model (d-HSBM), wherein n nodes are
partitioned into k disjoint communities with relative sizes (p1,..., pk). Each
subset of nodes with cardinality d is generated independently as an order-d
hyperedge with a certain probability that depends on the ground-truth
communities that the d nodes belong to. The goal is to exactly recover the k
hidden communities based on the observed hypergraph. We show that there exists
a sharp threshold such that exact recovery is achievable above the threshold
and impossible below the threshold (apart from a small regime of parameters
that will be specified precisely). This threshold is represented in terms of a
quantity which we term as the generalized Chernoff-Hellinger divergence between
communities. Our result for this general model recovers prior results for the
standard SBM and d-HSBM with two symmetric communities as special cases. En
route to proving our achievability results, we develop a polynomial-time
two-stage algorithm that meets the threshold. The first stage adopts a certain
hypergraph spectral clustering method to obtain a coarse estimate of
communities, and the second stage refines each node individually via local
refinement steps to ensure exact recovery.
Related papers
- Community Detection in the Multi-View Stochastic Block Model [47.43048484980115]
This paper considers the problem of community detection on multiple potentially correlated graphs from antheoretical perspective.
We first put forth a random graph model, called the multi-view information block model (MVSBM), designed to generate correlated graphs on the same set of nodes.
arXiv Detail & Related papers (2024-01-17T13:39:38Z) - Optimal and exact recovery on general non-uniform Hypergraph Stochastic Block Model [0.0]
We consider the community detection problem in random hypergraphs under the non-uniform hypergraphinger block model (HSBM)
We establish, for the first time in the literature, a sharp threshold for exact recovery under this non-uniform case, subject to minor constraints.
We provide two efficient algorithms which successfully achieve exact recovery when above the threshold, and attain the lowest possible ratio when the exact recovery is impossible.
arXiv Detail & Related papers (2023-04-25T20:30:33Z) - Multilayer hypergraph clustering using the aggregate similarity matrix [0.7373617024876725]
We consider the community recovery problem on a multilayer variant of the hypergraph block model (HSBM)
In this work, we investigate a semidefinite programming (SDP) approach and obtain information-theoretic conditions on the model parameters that guarantee exact recovery.
arXiv Detail & Related papers (2023-01-27T11:15:46Z) - Exact Recovery of Community Detection in dependent Gaussian Mixture
Models [5.312472550578279]
We study the community detection problem on a Gaussian mixture model.
We prove necessary and sufficient conditions for the exact recovery of the maximum likelihood estimation.
arXiv Detail & Related papers (2022-09-23T14:26:19Z) - Robust Estimation for Nonparametric Families via Generative Adversarial
Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems.
Our work extend these to robust mean estimation, second moment estimation, and robust linear regression.
In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
arXiv Detail & Related papers (2022-02-02T20:11:33Z) - Density Ratio Estimation via Infinitesimal Classification [85.08255198145304]
We propose DRE-infty, a divide-and-conquer approach to reduce Density ratio estimation (DRE) to a series of easier subproblems.
Inspired by Monte Carlo methods, we smoothly interpolate between the two distributions via an infinite continuum of intermediate bridge distributions.
We show that our approach performs well on downstream tasks such as mutual information estimation and energy-based modeling on complex, high-dimensional datasets.
arXiv Detail & Related papers (2021-11-22T06:26:29Z) - Correlation Clustering Reconstruction in Semi-Adversarial Models [70.11015369368272]
Correlation Clustering is an important clustering problem with many applications.
We study the reconstruction version of this problem in which one is seeking to reconstruct a latent clustering corrupted by random noise and adversarial modifications.
arXiv Detail & Related papers (2021-08-10T14:46:17Z) - Spectral clustering under degree heterogeneity: a case for the random
walk Laplacian [83.79286663107845]
This paper shows that graph spectral embedding using the random walk Laplacian produces vector representations which are completely corrected for node degree.
In the special case of a degree-corrected block model, the embedding concentrates about K distinct points, representing communities.
arXiv Detail & Related papers (2021-05-03T16:36:27Z) - Non-Convex Exact Community Recovery in Stochastic Block Model [31.221745716673546]
Community detection in graphs that are generated according to symmetric block models (SBMs) has received much attention lately.
We show that in the logarithmic sparsity regime of the problem, with high probability the proposed two-stage method can exactly recover the two communities down to the information-theoretic limit in $mathcalO(nlog2n/loglog n)$ time.
We also conduct numerical experiments on both synthetic and real data sets to demonstrate the efficacy of our proposed method and complement our theoretical development.
arXiv Detail & Related papers (2020-06-29T07:03:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.