Average Sensitivity of Spectral Clustering
- URL: http://arxiv.org/abs/2006.04094v1
- Date: Sun, 7 Jun 2020 09:14:44 GMT
- Title: Average Sensitivity of Spectral Clustering
- Authors: Pan Peng, Yuichi Yoshida
- Abstract summary: We study the stability of spectral clustering against edge perturbations in the input graph.
Our results suggest that spectral clustering is stable against edge perturbations when there is a cluster structure in the input graph.
- Score: 31.283432482502278
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spectral clustering is one of the most popular clustering methods for finding
clusters in a graph, which has found many applications in data mining. However,
the input graph in those applications may have many missing edges due to errors
in measurement, withholding for privacy reasons, or arbitrariness in data
conversion. To make reliable and efficient decisions based on spectral
clustering, we assess the stability of spectral clustering against edge
perturbations in the input graph using the notion of average sensitivity, which
is the expected size of the symmetric difference of the output clusters before
and after we randomly remove edges.
We first prove that the average sensitivity of spectral clustering is
proportional to $\lambda_2/\lambda_3^2$, where $\lambda_i$ is the $i$-th
smallest eigenvalue of the (normalized) Laplacian. We also prove an analogous
bound for $k$-way spectral clustering, which partitions the graph into $k$
clusters. Then, we empirically confirm our theoretical bounds by conducting
experiments on synthetic and real networks. Our results suggest that spectral
clustering is stable against edge perturbations when there is a cluster
structure in the input graph.
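The quantities in the bound can be checked numerically. The sketch below (not the paper's implementation; the planted two-cluster graph, the 2-way spectral bisection, and the single-edge-removal model are illustrative assumptions) computes $\lambda_2$ and $\lambda_3$ of the normalized Laplacian and estimates the average sensitivity of spectral bisection as the expected symmetric difference of the output cluster over a uniformly random edge removal:

```python
# Minimal sketch: empirical average sensitivity of 2-way spectral clustering.
# Assumes only numpy; the graph below is two dense blocks joined by one bridge edge.
import numpy as np

def normalized_laplacian(A):
    """L = I - D^{-1/2} A D^{-1/2}; isolated vertices get degree 1 to avoid division by zero."""
    d = A.sum(axis=1)
    d[d == 0] = 1.0
    inv_sqrt = 1.0 / np.sqrt(d)
    return np.eye(len(A)) - inv_sqrt[:, None] * A * inv_sqrt[None, :]

def spectral_bisect(A):
    """2-way spectral clustering: split by the sign of the Fiedler vector."""
    _, vecs = np.linalg.eigh(normalized_laplacian(A))
    fiedler = vecs[:, 1]  # eigenvector of the 2nd smallest eigenvalue
    return frozenset(np.flatnonzero(fiedler >= 0))

# Two blocks of 10 vertices, fully connected inside, one bridge edge between them.
n = 20
A = np.zeros((n, n))
A[:10, :10] = 1.0
A[10:, 10:] = 1.0
np.fill_diagonal(A, 0.0)
A[0, 10] = A[10, 0] = 1.0

base = spectral_bisect(A)
edges = [(i, j) for i in range(n) for j in range(i + 1, n) if A[i, j] > 0]

# Average sensitivity (single-edge-removal variant): expected size of the
# symmetric difference of the output cluster, averaged over the removed edge.
diffs = []
for (i, j) in edges:
    B = A.copy()
    B[i, j] = B[j, i] = 0.0
    out = spectral_bisect(B)
    comp = frozenset(range(n)) - out  # the two sides are interchangeable labels
    diffs.append(min(len(base ^ out), len(base ^ comp)))
avg_sensitivity = float(np.mean(diffs))

lam = np.sort(np.linalg.eigvalsh(normalized_laplacian(A)))
print(f"lambda_2={lam[1]:.4f}, lambda_3={lam[2]:.4f}, avg sensitivity={avg_sensitivity:.3f}")
```

With a clear cluster structure, $\lambda_2$ is small while $\lambda_3$ is bounded away from zero, and the measured sensitivity stays small, consistent with the $\lambda_2/\lambda_3^2$ bound above.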
Related papers
- HeNCler: Node Clustering in Heterophilous Graphs through Learned Asymmetric Similarity [55.27586970082595]
HeNCler is a novel approach for Heterophilous Node Clustering.
We show that HeNCler significantly enhances performance in node clustering tasks within heterophilous graph contexts.
arXiv Detail & Related papers (2024-05-27T11:04:05Z)
- MeanCut: A Greedy-Optimized Graph Clustering via Path-based Similarity and Degree Descent Criterion [0.6906005491572401]
Spectral clustering is popular and attractive due to its remarkable performance, easy implementation, and strong adaptability.
We propose MeanCut as the objective function and greedily optimize it in degree descending order for a nondestructive graph partition.
The validity of our algorithm is demonstrated by testing on real-world benchmarks and an application to face recognition.
arXiv Detail & Related papers (2023-12-07T06:19:39Z)
- Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [79.46465138631592]
We devise an efficient algorithm that recovers clusters using the observed labels.
We present Instance-Adaptive Clustering (IAC), the first algorithm whose performance matches these lower bounds both in expectation and with high probability.
arXiv Detail & Related papers (2023-06-18T08:46:06Z)
- Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation.
Specifically, we construct the distance matrix between data points using a Butterworth filter.
To well exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z)
- Spectral Clustering on Large Datasets: When Does it Work? Theory from Continuous Clustering and Density Cheeger-Buser [0.17188280334580194]
In modern machine learning, many data sets are modeled as a large number of points drawn from a probability density function.
Our work suggests that Shi-Malik spectral clustering works well on data drawn from mixtures of Laplace distributions.
Our key tool is a new Cheeger-Buser inequality for all probability densities, including discontinuous ones.
arXiv Detail & Related papers (2023-05-11T03:08:49Z)
- A parameter-free graph reduction for spectral clustering and SpectralNet [1.5469452301122175]
We introduce a graph reduction method that does not require any parameters.
First, distances from every point $p$ to its neighbors are filtered using an adaptive threshold to only keep neighbors with similar surrounding density.
The edges that survive both steps form the constructed graph, which is then passed to spectral clustering and SpectralNet.
arXiv Detail & Related papers (2023-02-25T21:19:30Z)
- Refining a $k$-nearest neighbor graph for a computationally efficient spectral clustering [1.5469452301122175]
Approximate spectral clustering (ASC) uses sampling or quantization to select data representatives.
We propose a refined version of the $k$-nearest neighbor graph, in which we keep the data points and aggressively reduce the number of edges.
Compared to ASC methods, the proposed method delivered consistent performance despite the significant reduction in edges.
arXiv Detail & Related papers (2023-02-22T11:31:32Z)
- Perfect Spectral Clustering with Discrete Covariates [68.8204255655161]
We propose a spectral algorithm that achieves perfect clustering with high probability on a class of large, sparse networks.
Our method is the first to offer a guarantee of consistent latent structure recovery using spectral clustering.
arXiv Detail & Related papers (2022-05-17T01:41:06Z)
- Improving Spectral Clustering Using Spectrum-Preserving Node Reduction [1.52292571922932]
We use spectrum-preserving node reduction to accelerate eigen-decomposition and generate concise representations of data sets.
The experimental results show dramatically improved clustering performance when compared with state-of-the-art methods.
arXiv Detail & Related papers (2021-10-24T01:43:12Z)
- Computationally efficient sparse clustering [67.95910835079825]
We provide a finite sample analysis of a new clustering algorithm based on PCA.
We show that it achieves the minimax optimal misclustering rate in the regime $\|\theta\| \to \infty$.
arXiv Detail & Related papers (2020-05-21T17:51:30Z)
- Local Graph Clustering with Network Lasso [90.66817876491052]
We study the statistical and computational properties of a network Lasso method for local graph clustering.
The clusters delivered by nLasso can be characterized elegantly via network flows between cluster boundary and seed nodes.
arXiv Detail & Related papers (2020-04-25T17:52:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.