Structure-Aware Spectral Sparsification via Uniform Edge Sampling
- URL: http://arxiv.org/abs/2510.12669v1
- Date: Tue, 14 Oct 2025 16:05:02 GMT
- Title: Structure-Aware Spectral Sparsification via Uniform Edge Sampling
- Authors: Kaiwen He, Petros Drineas, Rajiv Khanna,
- Abstract summary: We study whether uniform edge sampling-a simple, structure-agnostic strategy-can suffice for spectral clustering.<n>We prove that uniformly sampling $O(gamma2 n log n / epsilon2)$ edges, where $gamma$ is the Laplacian condition number, yields a sparsifier whose top $(n-k)$-dimensional eigenspace is approximately to the cluster indicators.
- Score: 12.266977096773472
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spectral clustering is a fundamental method for graph partitioning, but its reliance on eigenvector computation limits scalability to massive graphs. Classical sparsification methods preserve spectral properties by sampling edges proportionally to their effective resistances, but require expensive preprocessing to estimate these resistances. We study whether uniform edge sampling-a simple, structure-agnostic strategy-can suffice for spectral clustering. Our main result shows that for graphs admitting a well-separated $k$-clustering, characterized by a large structure ratio $\Upsilon(k) = \lambda_{k+1} / \rho_G(k)$, uniform sampling preserves the spectral subspace used for clustering. Specifically, we prove that uniformly sampling $O(\gamma^2 n \log n / \epsilon^2)$ edges, where $\gamma$ is the Laplacian condition number, yields a sparsifier whose top $(n-k)$-dimensional eigenspace is approximately orthogonal to the cluster indicators. This ensures that the spectral embedding remains faithful, and clustering quality is preserved. Our analysis introduces new resistance bounds for intra-cluster edges, a rank-$(n-k)$ effective resistance formulation, and a matrix Chernoff bound adapted to the dominant eigenspace. These tools allow us to bypass importance sampling entirely. Conceptually, our result connects recent coreset-based clustering theory to spectral sparsification, showing that under strong clusterability, even uniform sampling is structure-aware. This provides the first provable guarantee that uniform edge sampling suffices for structure-preserving spectral clustering.
Related papers
- Graph-based Clustering Revisited: A Relaxation of Kernel $k$-Means Perspective [73.18641268511318]
We propose a graph-based clustering algorithm that only relaxes the orthonormal constraint to derive clustering results.<n>To ensure a doubly constraint into a gradient, we transform the non-negative constraint into a class probability parameter.
arXiv Detail & Related papers (2025-09-23T09:14:39Z) - Scalable Context-Preserving Model-Aware Deep Clustering for Hyperspectral Images [51.95768218975529]
Subspace clustering has become widely adopted for the unsupervised analysis of hyperspectral images (HSIs)<n>Recent model-aware deep subspace clustering methods often use a two-stage framework, involving the calculation of a self-representation matrix with complexity of O(n2), followed by spectral clustering.<n>We propose a scalable, context-preserving deep clustering method based on basis representation, which jointly captures local and non-local structures for efficient HSI clustering.
arXiv Detail & Related papers (2025-06-12T16:43:09Z) - Near-Optimal Clustering in Mixture of Markov Chains [74.3828414695655]
We study the problem of clustering $T$ trajectories of length $H$, each generated by one of $K$ unknown ergodic Markov chains over a finite state space of size $S$.<n>We derive an instance-dependent, high-probability lower bound on the clustering error rate, governed by the weighted KL divergence between the transition kernels of the chains.<n>We then present a novel two-stage clustering algorithm.
arXiv Detail & Related papers (2025-06-02T05:10:40Z) - On the Robustness of Spectral Algorithms for Semirandom Stochastic Block Models [11.137244714693779]
We study the robustness of spectral algorithms against semirandom adversaries.<n>We identify classes of semirandom adversaries under which spectral bisection using the _unnormalized_ Laplacian is strongly consistent.
arXiv Detail & Related papers (2024-12-18T20:35:02Z) - Revisiting Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model [85.51611950757643]
We propose IAC (Instance-Adaptive Clustering), the first algorithm whose performance matches the instance-specific lower bounds both in expectation and with high probability.<n>IAC maintains an overall computational complexity of $ mathcalO(n, textpolylog(n) $, making it scalable and practical for large-scale problems.
arXiv Detail & Related papers (2023-06-18T08:46:06Z) - Perfect Spectral Clustering with Discrete Covariates [68.8204255655161]
We propose a spectral algorithm that achieves perfect clustering with high probability on a class of large, sparse networks.
Our method is the first to offer a guarantee of consistent latent structure recovery using spectral clustering.
arXiv Detail & Related papers (2022-05-17T01:41:06Z) - On consistency of constrained spectral clustering under
representation-aware stochastic block model [20.6072287343024]
We study constrained spectral clustering with the aim of finding balanced clusters in a given textitsimilarity graph $mathcalG$.
We develop unnormalized and normalized variants of spectral clustering in this setting.
These algorithms use $mathcalR$ to find clusters in $mathcalG$ that approximately satisfy the proposed constraint.
arXiv Detail & Related papers (2022-03-03T20:41:14Z) - An $\ell_p$ theory of PCA and spectral clustering [23.90245234027558]
Principal Component Analysis is a powerful tool in statistics and machine learning.
In this paper, we develop an $ell_p$ theory for a hollowed version of PCA in Hilbert spaces.
For contextual community detection, the $ell_p$ theory leads to a simple spectral algorithm that achieves the information threshold for exact recovery.
arXiv Detail & Related papers (2020-06-24T21:30:28Z) - Average Sensitivity of Spectral Clustering [31.283432482502278]
We study the stability of spectral clustering against edge perturbations in the input graph.
Our results suggest that spectral clustering is stable against edge perturbations when there is a cluster structure in the input graph.
arXiv Detail & Related papers (2020-06-07T09:14:44Z) - Local Graph Clustering with Network Lasso [90.66817876491052]
We study the statistical and computational properties of a network Lasso method for local graph clustering.
The clusters delivered by nLasso can be characterized elegantly via network flows between cluster boundary and seed nodes.
arXiv Detail & Related papers (2020-04-25T17:52:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.