Non-Convex Exact Community Recovery in Stochastic Block Model
- URL: http://arxiv.org/abs/2006.15843v4
- Date: Mon, 27 Sep 2021 12:03:04 GMT
- Title: Non-Convex Exact Community Recovery in Stochastic Block Model
- Authors: Peng Wang, Zirui Zhou, Anthony Man-Cho So
- Abstract summary: Community detection in graphs that are generated according to symmetric block models (SBMs) has received much attention lately.
We show that in the logarithmic sparsity regime of the problem, with high probability the proposed two-stage method can exactly recover the two communities down to the information-theoretic limit in $mathcalO(nlog2n/loglog n)$ time.
We also conduct numerical experiments on both synthetic and real data sets to demonstrate the efficacy of our proposed method and complement our theoretical development.
- Score: 31.221745716673546
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Community detection in graphs that are generated according to stochastic
block models (SBMs) has received much attention lately. In this paper, we focus
on the binary symmetric SBM -- in which a graph of $n$ vertices is randomly
generated by first partitioning the vertices into two equal-sized communities
and then connecting each pair of vertices with probability that depends on
their community memberships -- and study the associated exact community
recovery problem. Although the maximum-likelihood formulation of the problem is
non-convex and discrete, we propose to tackle it using a popular iterative
method called projected power iterations. To ensure fast convergence of the
method, we initialize it using a point that is generated by another iterative
method called orthogonal iterations, which is a classic method for computing
invariant subspaces of a symmetric matrix. We show that in the logarithmic
sparsity regime of the problem, with high probability the proposed two-stage
method can exactly recover the two communities down to the
information-theoretic limit in $\mathcal{O}(n\log^2n/\log\log n)$ time, which
is competitive with a host of existing state-of-the-art methods that have the
same recovery performance. We also conduct numerical experiments on both
synthetic and real data sets to demonstrate the efficacy of our proposed method
and complement our theoretical development.
Related papers
- Representation and De-interleaving of Mixtures of Hidden Markov Processes [3.7348616912887445]
De-interleaving of mixtures of Hidden Markov Processes (HMPs) generally depends on its representation model.
This paper proposes a novel representation model and corresponding de-interleaving methods for the mixtures of HMPs.
arXiv Detail & Related papers (2024-06-01T12:24:23Z) - An Optimization-based Deep Equilibrium Model for Hyperspectral Image
Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem.
A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network.
The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
arXiv Detail & Related papers (2023-06-10T08:25:16Z) - Semi-Supervised Subspace Clustering via Tensor Low-Rank Representation [64.49871502193477]
We propose a novel semi-supervised subspace clustering method, which is able to simultaneously augment the initial supervisory information and construct a discriminative affinity matrix.
Comprehensive experimental results on six commonly-used benchmark datasets demonstrate the superiority of our method over state-of-the-art methods.
arXiv Detail & Related papers (2022-05-21T01:47:17Z) - Exact Community Recovery over Signed Graphs [27.2776492470422]
We study the problem of community recovery over signed graphs with two equal-sized communities.
Our approach is based on the maximum likelihood estimation (MLE) of the signed block model.
It is shown that in the logarithmic degree regime, the proposed algorithm can exactly recover the underlying communities in nearly-linear time.
arXiv Detail & Related papers (2022-02-22T05:03:25Z) - A Stochastic Newton Algorithm for Distributed Convex Optimization [62.20732134991661]
We analyze a Newton algorithm for homogeneous distributed convex optimization, where each machine can calculate gradients of the same population objective.
We show that our method can reduce the number, and frequency, of required communication rounds compared to existing methods without hurting performance.
arXiv Detail & Related papers (2021-10-07T17:51:10Z) - Lifting the Convex Conjugate in Lagrangian Relaxations: A Tractable
Approach for Continuous Markov Random Fields [53.31927549039624]
We show that a piecewise discretization preserves better contrast from existing discretization problems.
We apply this theory to the problem of matching two images.
arXiv Detail & Related papers (2021-07-13T12:31:06Z) - Improving Metric Dimensionality Reduction with Distributed Topology [68.8204255655161]
DIPOLE is a dimensionality-reduction post-processing step that corrects an initial embedding by minimizing a loss functional with both a local, metric term and a global, topological term.
We observe that DIPOLE outperforms popular methods like UMAP, t-SNE, and Isomap on a number of popular datasets.
arXiv Detail & Related papers (2021-06-14T17:19:44Z) - Free Energy Node Embedding via Generalized Skip-gram with Negative
Sampling [34.12821919995092]
We propose improvements in two steps of the unsupervised node embedding framework.
On the one hand, we propose to encode node similarities based on the free energy distance, which interpolates between the shortest path and the commute time distances.
On the other hand, we propose a matrix factorization method based on a loss function that generalizes to arbitrary similarity matrices.
arXiv Detail & Related papers (2021-05-19T14:58:13Z) - Exact Recovery in the General Hypergraph Stochastic Block Model [92.28929858529679]
This paper investigates fundamental limits of exact recovery in the general d-uniform hypergraph block model (d-HSBM)
We show that there exists a sharp threshold such that exact recovery is achievable above the threshold and impossible below it.
arXiv Detail & Related papers (2021-05-11T03:39:08Z) - Jointly Modeling and Clustering Tensors in High Dimensions [6.072664839782975]
We consider the problem of jointly benchmarking and clustering of tensors.
We propose an efficient high-maximization algorithm that converges geometrically to a neighborhood that is within statistical precision.
arXiv Detail & Related papers (2021-04-15T21:06:16Z) - Community Detection in the Stochastic Block Model by Mixed Integer
Programming [3.8073142980733]
Degree-Corrected Block Model (DCSBM) is a popular model to generate random graphs with community structure given an expected degree sequence.
Standard approach of community detection based on the DCSBM is to search for the model parameters that are the most likely to have produced the observed network data through maximum likelihood estimation (MLE)
We present mathematical programming formulations and exact solution methods that can provably find the model parameters and community assignments of maximum likelihood given an observed graph.
arXiv Detail & Related papers (2021-01-26T22:04:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.