Consistent Spectral Clustering in Hyperbolic Spaces
- URL: http://arxiv.org/abs/2409.09304v1
- Date: Sat, 14 Sep 2024 04:54:31 GMT
- Title: Consistent Spectral Clustering in Hyperbolic Spaces
- Authors: Sagar Ghosh, Swagatam Das
- Abstract summary: We propose a spectral clustering algorithm on Hyperbolic Spaces to represent complex data structures.
We show that our algorithm converges at least as fast as Spectral Clustering on Euclidean Spaces.
This work opens up avenues for utilizing non-Euclidean Spaces in clustering algorithms.
- Score: 16.75089998678061
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Clustering, as an unsupervised technique, plays a pivotal role in various data analysis applications. Among clustering algorithms, Spectral Clustering on Euclidean Spaces has been extensively studied. However, with the rapid evolution of data complexity, Euclidean Space is proving to be inefficient for representing and learning algorithms. Although Deep Neural Networks on hyperbolic spaces have gained recent traction, clustering algorithms or non-deep machine learning models on non-Euclidean Spaces remain underexplored. In this paper, we propose a spectral clustering algorithm on Hyperbolic Spaces to address this gap. Hyperbolic Spaces offer advantages in representing complex data structures like hierarchical and tree-like structures, which cannot be embedded efficiently in Euclidean Spaces. Our proposed algorithm replaces the Euclidean Similarity Matrix with an appropriate Hyperbolic Similarity Matrix, demonstrating improved efficiency compared to clustering in Euclidean Spaces. Our contributions include the development of the spectral clustering algorithm on Hyperbolic Spaces and the proof of its weak consistency. We show that our algorithm converges at least as fast as Spectral Clustering on Euclidean Spaces. To illustrate the efficacy of our approach, we present experimental results on the Wisconsin Breast Cancer Dataset, highlighting the superior performance of Hyperbolic Spectral Clustering over its Euclidean counterpart. This work opens up avenues for utilizing non-Euclidean Spaces in clustering algorithms, offering new perspectives for handling complex data structures and improving clustering efficiency.
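The core idea in the abstract, replacing the Euclidean similarity matrix with a hyperbolic one before the usual spectral step, can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the similarity function, so the sketch assumes points in the Poincaré ball model, a Gaussian kernel applied to the geodesic distance, and a two-cluster split by the sign of the second eigenvector of the normalized Laplacian. The function names and the `sigma` parameter are hypothetical.

```python
import numpy as np


def poincare_distance(X):
    """Pairwise geodesic distances between rows of X in the Poincare ball.

    Assumes every row of X has Euclidean norm strictly less than 1.
    """
    sq_norms = np.sum(X ** 2, axis=1)
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    denom = (1.0 - sq_norms)[:, None] * (1.0 - sq_norms)[None, :]
    arg = 1.0 + 2.0 * sq_dist / denom
    # Guard against rounding pushing the argument slightly below 1.
    return np.arccosh(np.maximum(arg, 1.0))


def hyperbolic_similarity(X, sigma=1.0):
    """Gaussian kernel on the hyperbolic distance (an assumed choice)."""
    D = poincare_distance(X)
    return np.exp(-(D ** 2) / (2.0 * sigma ** 2))


def spectral_bipartition(W):
    """Split into two clusters using the symmetric normalized Laplacian.

    Simplification: the sign of the second-smallest eigenvector is used
    instead of running k-means on the first k eigenvectors.
    """
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    fiedler = eigvecs[:, 1] * d_inv_sqrt
    return (fiedler > 0).astype(int)


# Toy data: two small groups near opposite sides of the unit disk.
X = np.array([
    [0.80, 0.00], [0.78, 0.05], [0.82, -0.03],
    [-0.80, 0.00], [-0.78, -0.05], [-0.82, 0.03],
])
labels = spectral_bipartition(hyperbolic_similarity(X))
# The first three points share one label and the last three share the other;
# which group gets 0 or 1 depends on the eigenvector's arbitrary sign.
```

Near the boundary of the ball, hyperbolic distances grow much faster than Euclidean ones, so the two groups above are far apart in the similarity matrix even though their Euclidean separation is bounded by the disk's diameter.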
Related papers
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
- Spectral Clustering in Convex and Constrained Settings [0.0]
We introduce a novel framework for seamlessly integrating pairwise constraints into semidefinite spectral clustering.
Our methodology systematically extends the capabilities of semidefinite spectral clustering to capture complex data structures.
arXiv Detail & Related papers (2024-04-03T18:50:14Z)
- Toward Efficient and Incremental Spectral Clustering via Parametric Spectral Clustering [2.44755919161855]
Spectral clustering is a popular method for effectively clustering nonlinearly separable data.
This paper introduces a novel approach called parametric spectral clustering (PSC).
PSC addresses the challenges associated with big data and real-time scenarios.
arXiv Detail & Related papers (2023-11-14T01:26:20Z)
- Adaptively-weighted Integral Space for Fast Multiview Clustering [54.177846260063966]
We propose an Adaptively-weighted Integral Space for Fast Multiview Clustering (AIMC) with nearly linear complexity.
Specifically, view generation models are designed to reconstruct the view observations from the latent integral space.
Experiments conducted on several real-world datasets confirm the superiority of the proposed AIMC method.
arXiv Detail & Related papers (2022-08-25T05:47:39Z)
- Enhancing cluster analysis via topological manifold learning [0.3823356975862006]
We show that inferring the topological structure of a dataset before clustering can considerably enhance cluster detection.
We combine the manifold learning method UMAP, used to infer the topological structure, with the density-based clustering method DBSCAN.
arXiv Detail & Related papers (2022-07-01T15:53:39Z)
- Perfect Spectral Clustering with Discrete Covariates [68.8204255655161]
We propose a spectral algorithm that achieves perfect clustering with high probability on a class of large, sparse networks.
Our method is the first to offer a guarantee of consistent latent structure recovery using spectral clustering.
arXiv Detail & Related papers (2022-05-17T01:41:06Z)
- Provably Accurate and Scalable Linear Classifiers in Hyperbolic Spaces [39.71927912296049]
We propose a unified framework for learning scalable and simple hyperbolic linear classifiers.
The gist of our approach is to focus on Poincaré ball models and formulate the classification problems using tangent space formalisms.
The excellent performance of the Poincaré second-order and strategic perceptrons shows that the proposed framework can be extended to general machine learning problems in hyperbolic spaces.
arXiv Detail & Related papers (2022-03-07T21:36:21Z)
- Overcomplete Deep Subspace Clustering Networks [80.16644725886968]
Experimental results on four benchmark datasets show the effectiveness of the proposed method over DSC and other clustering methods in terms of clustering error.
Our method is also not as dependent as DSC is on where pre-training should be stopped to get the best performance and is also more robust to noise.
arXiv Detail & Related papers (2020-11-16T22:07:18Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Non-Exhaustive, Overlapping Co-Clustering: An Extended Analysis [32.15852903039789]
The goal of co-clustering is to simultaneously identify a clustering of rows as well as columns of a two dimensional data matrix.
We develop an efficient iterative algorithm which we call the NEO-CC algorithm.
Experimental results show that the NEO-CC algorithm is able to effectively capture the underlying co-clustering structure of real-world data.
arXiv Detail & Related papers (2020-04-24T04:39:14Z)
- Learnable Subspace Clustering [76.2352740039615]
We develop a learnable subspace clustering paradigm to efficiently solve the large-scale subspace clustering problem.
The key idea is to learn a parametric function to partition the high-dimensional subspaces into their underlying low-dimensional subspaces.
To the best of our knowledge, this paper is the first work to efficiently cluster millions of data points among the subspace clustering methods.
arXiv Detail & Related papers (2020-04-09T12:53:28Z)