On the Power of SVD in the Stochastic Block Model
- URL: http://arxiv.org/abs/2309.15322v1
- Date: Wed, 27 Sep 2023 00:04:27 GMT
- Title: On the Power of SVD in the Stochastic Block Model
- Authors: Xinyu Mao and Jiapeng Zhang
- Abstract summary: spectral-based dimensionality reduction tools, such as PCA or SVD, improve the performance of clustering algorithms in many applications.
We show that, in the symmetric setting, the vanilla-SVD algorithm recovers all clusters correctly.
- Score: 6.661713888517129
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A popular heuristic method for improving clustering results is to apply
dimensionality reduction before running clustering algorithms. It has been
observed that spectral-based dimensionality reduction tools, such as PCA or
SVD, improve the performance of clustering algorithms in many applications.
This phenomenon indicates that the spectral method not only serves as a
dimensionality reduction tool, but also contributes to the clustering procedure
itself. Understanding the behavior of spectral steps in clustering problems is
therefore an interesting question.
As an initial step in this direction, this paper studies the power of the
vanilla-SVD algorithm in the stochastic block model (SBM). We show that, in the
symmetric setting, the vanilla-SVD algorithm recovers all clusters correctly. This
result answers an open question posed by Van Vu (Combinatorics Probability and
Computing, 2018) in the symmetric setting.
Related papers
- An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute these covariance matrices.
arXiv Detail & Related papers (2023-09-30T15:57:14Z) - A Restarted Large-Scale Spectral Clustering with Self-Guiding and Block
Diagonal Representation [1.115905690697198]
We propose a restarted clustering framework with self-guiding and block diagonal representation.
An advantage of the strategy is that some useful clustering information obtained from previous cycles could be preserved.
Theoretical results are established to show the rationality of inexact computations in spectral clustering.
arXiv Detail & Related papers (2023-06-27T01:38:52Z) - Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z) - Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation.
Specifically, we construct a distance matrix between data points using a Butterworth filter.
To well exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z) - A Distributed Block Chebyshev-Davidson Algorithm for Parallel Spectral
Clustering [7.632758664232665]
We develop a distributed Block Chebyshev-Davidson algorithm to solve large-scale leading eigenvalue problems for spectral analysis in spectral clustering.
The efficiency of the Chebyshev-Davidson algorithm relies on the prior knowledge of the eigenvalue spectrum, which could be expensive to estimate.
A distributed and parallel version has been developed with attractive scalability.
arXiv Detail & Related papers (2022-12-08T18:06:13Z) - Lattice-Based Methods Surpass Sum-of-Squares in Clustering [98.46302040220395]
Clustering is a fundamental primitive in unsupervised learning.
Recent work has established lower bounds against the class of low-degree methods.
We show that, perhaps surprisingly, this particular clustering model does not exhibit a statistical-to-computational gap.
arXiv Detail & Related papers (2021-12-07T18:50:17Z) - Parallel Stochastic Mirror Descent for MDPs [72.75921150912556]
We consider the problem of learning the optimal policy for infinite-horizon Markov decision processes (MDPs)
A variant of Mirror Descent is proposed for convex programming problems with Lipschitz-continuous functionals.
We analyze this algorithm in a general case and obtain an estimate of the convergence rate that does not accumulate errors during the operation of the method.
arXiv Detail & Related papers (2021-02-27T19:28:39Z) - Kernel learning approaches for summarising and combining posterior
similarity matrices [68.8204255655161]
We build upon the notion of the posterior similarity matrix (PSM) in order to suggest new approaches for summarising the output of MCMC algorithms for Bayesian clustering models.
A key contribution of our work is the observation that PSMs are positive semi-definite, and hence can be used to define probabilistically-motivated kernel matrices.
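The positive semi-definiteness observation above is easy to verify numerically. The sketch below, using a small hypothetical MCMC output, builds a posterior similarity matrix as the average of per-sample co-clustering matrices; each such matrix is block-diagonal with all-ones blocks and hence PSD, and an average of PSD matrices is PSD, so the PSM is a valid kernel matrix. This is an illustration of the stated property, not the paper's method.

```python
import numpy as np

# Hypothetical MCMC output: each row is one posterior sample of
# cluster labels for 5 items.
samples = np.array([
    [0, 0, 1, 1, 2],
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 2],
    [0, 0, 0, 1, 1],
])

# Posterior similarity matrix: psm[i, j] = fraction of samples in
# which items i and j are assigned to the same cluster.
co = (samples[:, :, None] == samples[:, None, :]).astype(float)
psm = co.mean(axis=0)

# Symmetric, unit diagonal, and PSD: its smallest eigenvalue is
# non-negative (up to floating-point error), so psm can serve as a
# kernel matrix.
min_eig = np.linalg.eigvalsh(psm).min()
```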
arXiv Detail & Related papers (2020-09-27T14:16:14Z) - Spectral Clustering using Eigenspectrum Shape Based Nystrom Sampling [19.675277307158435]
This paper proposes a scalable Nystrom-based clustering algorithm with a new sampling procedure, Centroid Minimum Sum of Squared Similarities (CMS3), and a heuristic on when to use it.
The heuristic depends on the eigenspectrum shape of the dataset, and yields competitive low-rank approximations in tests compared with other state-of-the-art methods.
arXiv Detail & Related papers (2020-07-21T17:49:03Z) - An Efficient Smoothing Proximal Gradient Algorithm for Convex Clustering [2.5182813818441945]
The recently introduced convex clustering approach formulates clustering as a convex optimization problem.
State-of-the-art convex clustering algorithms require large computation and memory space.
In this paper, we develop a very efficient smoothing proximal gradient algorithm (Sproga) for convex clustering.
arXiv Detail & Related papers (2020-06-22T20:02:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.