Randomized spectral co-clustering for large-scale directed networks
- URL: http://arxiv.org/abs/2004.12164v3
- Date: Sat, 9 Apr 2022 06:05:26 GMT
- Title: Randomized spectral co-clustering for large-scale directed networks
- Authors: Xiao Guo, Yixuan Qiu, Hai Zhang, Xiangyu Chang
- Abstract summary: Co-clustering aims to cluster the senders and receivers of directed networks simultaneously.
We leverage sketching techniques and derive two randomized spectral co-clustering algorithms.
We establish their approximation error rates and misclustering error rates, indicating better bounds than the state-of-the-art results of co-clustering literature.
- Score: 15.486507430729052
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Directed networks are broadly used to represent asymmetric relationships
among units. Co-clustering aims to cluster the senders and receivers of
directed networks simultaneously. In particular, the well-known spectral
clustering algorithm could be modified as the spectral co-clustering to
co-cluster directed networks. However, large-scale networks pose great
computational challenges to it. In this paper, we leverage sketching techniques
and derive two randomized spectral co-clustering algorithms, one
\emph{random-projection-based} and the other \emph{random-sampling-based}, to
accelerate the co-clustering of large-scale directed networks. We theoretically
analyze the resulting algorithms under two generative models -- the stochastic
co-block model and the degree-corrected stochastic co-block model, and
establish their approximation error rates and misclustering error rates,
indicating better bounds than the state-of-the-art results of co-clustering
literature. Numerically, we design and conduct simulations to support our
theoretical results and test the efficiency of the algorithms on real networks
with up to millions of nodes. A publicly available R package \textsf{RandClust}
is developed for better usability and reproducibility of the proposed methods.
Related papers
- Rethinking Clustered Federated Learning in NOMA Enhanced Wireless
Networks [60.09912912343705]
This study explores the benefits of integrating the novel clustered federated learning (CFL) approach with non-independent and identically distributed (non-IID) datasets.
A detailed theoretical analysis of the generalization gap that measures the degree of non-IID in the data distribution is presented.
Solutions to address the challenges posed by non-IID conditions are proposed with the analysis of the properties.
arXiv Detail & Related papers (2024-03-05T17:49:09Z) - Optimal Clustering of Discrete Mixtures: Binomial, Poisson, Block
Models, and Multi-layer Networks [9.57586103097079]
We study the fundamental limit of clustering networks when a multi-layer network is present.
Under the mixture multi-layer block model (MMSBM), we show that the minimax optimal network clustering error rate takes an exponential form.
We propose a novel two-stage network clustering method including a tensor-based algorithm involving both node and sample splitting.
arXiv Detail & Related papers (2023-11-27T07:48:50Z) - Efficient Bilateral Cross-Modality Cluster Matching for Unsupervised Visible-Infrared Person ReID [56.573905143954015]
We propose a novel bilateral cluster matching-based learning framework to reduce the modality gap by matching cross-modality clusters.
Under such a supervisory signal, a Modality-Specific and Modality-Agnostic (MSMA) contrastive learning framework is proposed to align features jointly at a cluster-level.
Experiments on the public SYSU-MM01 and RegDB datasets demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-05-22T03:27:46Z) - DeepCluE: Enhanced Image Clustering via Multi-layer Ensembles in Deep
Neural Networks [53.88811980967342]
This paper presents a Deep Clustering via Ensembles (DeepCluE) approach.
It bridges the gap between deep clustering and ensemble clustering by harnessing the power of multiple layers in deep neural networks.
Experimental results on six image datasets confirm the advantages of DeepCluE over the state-of-the-art deep clustering approaches.
arXiv Detail & Related papers (2022-06-01T09:51:38Z) - Perfect Spectral Clustering with Discrete Covariates [68.8204255655161]
We propose a spectral algorithm that achieves perfect clustering with high probability on a class of large, sparse networks.
Our method is the first to offer a guarantee of consistent latent structure recovery using spectral clustering.
arXiv Detail & Related papers (2022-05-17T01:41:06Z) - Joint Network Topology Inference via Structured Fusion Regularization [70.30364652829164]
Joint network topology inference represents a canonical problem of learning multiple graph Laplacian matrices from heterogeneous graph signals.
We propose a general graph estimator based on a novel structured fusion regularization.
We show that the proposed graph estimator enjoys both high computational efficiency and rigorous theoretical guarantee.
arXiv Detail & Related papers (2021-03-05T04:42:32Z) - KCoreMotif: An Efficient Graph Clustering Algorithm for Large Networks
by Exploiting k-core Decomposition and Motifs [6.1734015475373765]
Spectral clustering is one of the most commonly used algorithms for graph-structured data (networks)
We propose an efficient graph clustering algorithm, KCoreMotif, specifically for large networks by exploiting k-core decomposition and motifs.
Comparative results demonstrate that the proposed graph clustering algorithm is accurate yet efficient for large networks.
arXiv Detail & Related papers (2020-08-21T12:21:05Z) - Local Graph Clustering with Network Lasso [90.66817876491052]
We study the statistical and computational properties of a network Lasso method for local graph clustering.
The clusters delivered by nLasso can be characterized elegantly via network flows between cluster boundary and seed nodes.
arXiv Detail & Related papers (2020-04-25T17:52:05Z) - Motif-Based Spectral Clustering of Weighted Directed Networks [6.1448102196124195]
Clustering is an essential technique for network analysis, with applications in a diverse range of fields.
One approach is to capture and cluster higher-order structures using motif adjacency matrices.
We present new and computationally useful matrix formulae for motif adjacency matrices on weighted networks.
arXiv Detail & Related papers (2020-04-02T22:45:28Z) - Randomized Spectral Clustering in Large-Scale Stochastic Block Models [13.366036495177923]
We study the spectral clustering using randomized sketching algorithms from a statistical perspective.
Under mild conditions, the randomized spectral clustering algorithms lead to the same theoretical bounds as those of the original spectral clustering algorithm.
A new R package called Rclust is developed and made available to the public.
arXiv Detail & Related papers (2020-01-20T04:15:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.