Toward Efficient and Incremental Spectral Clustering via Parametric
Spectral Clustering
- URL: http://arxiv.org/abs/2311.07833v1
- Date: Tue, 14 Nov 2023 01:26:20 GMT
- Title: Toward Efficient and Incremental Spectral Clustering via Parametric
Spectral Clustering
- Authors: Jo-Chun Chen, Hung-Hsuan Chen
- Abstract summary: Spectral clustering is a popular method for effectively clustering nonlinearly separable data.
This paper introduces a novel approach called parametric spectral clustering (PSC).
PSC addresses the challenges associated with big data and real-time scenarios.
- Score: 2.44755919161855
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spectral clustering is a popular method for effectively clustering
nonlinearly separable data. However, computational limitations, memory
requirements, and the inability to perform incremental learning challenge its
widespread application. To overcome these limitations, this paper introduces a
novel approach called parametric spectral clustering (PSC). By extending the
capabilities of spectral clustering, PSC addresses the challenges associated
with big data and real-time scenarios and enables efficient incremental
clustering with new data points. Experimental evaluations conducted on various
open datasets demonstrate the superiority of PSC in terms of computational
efficiency while achieving clustering quality mostly comparable to standard
spectral clustering. The proposed approach has significant potential for
incremental and real-time data analysis applications, facilitating timely and
accurate clustering in dynamic and evolving datasets. The findings of this
research contribute to the advancement of clustering techniques and open new
avenues for efficient and effective data analysis. We publish the experimental
code at https://github.com/109502518/PSC_BigData.
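The abstract does not say how PSC is parameterized, so the following is only an illustrative sketch of the general idea of making a spectral embedding parametric, not the authors' implementation. It computes a standard spectral embedding on a training set, fits an explicit map from input features to that embedding, and then assigns new points through the map without recomputing any eigenvectors. The RBF affinity, the linear least-squares map, the toy two-blob data, and all function names are assumptions made for illustration.

```python
import numpy as np

def spectral_embedding(X, k, sigma=1.0):
    """Row-normalised top-k eigenvectors of the normalised RBF affinity."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    d = W.sum(axis=1)
    M = W / np.sqrt(np.outer(d, d))      # D^{-1/2} W D^{-1/2}
    _, vecs = np.linalg.eigh(M)          # eigenvalues come back in ascending order
    E = vecs[:, -k:]                     # keep the k leading eigenvectors
    return E / np.linalg.norm(E, axis=1, keepdims=True)

def kmeans(E, k, iters=20):
    """Plain Lloyd iterations; assumes no cluster becomes empty."""
    centers = [E[0]]
    for _ in range(1, k):                # farthest-point init, deterministic
        dist = np.min([np.sum((E - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(E[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((E[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        centers = np.array([E[labels == j].mean(axis=0) for j in range(k)])
    return labels, centers

def fit_linear_map(X, E):
    """Hypothetical parametric map: least-squares fit from features to embedding."""
    A = np.hstack([X, np.ones((len(X), 1))])   # append a bias column
    coef, *_ = np.linalg.lstsq(A, E, rcond=None)
    return coef

def assign(X_new, coef, centers):
    """Embed new points through the fitted map and pick the nearest centroid."""
    A = np.hstack([X_new, np.ones((len(X_new), 1))])
    Z = A @ coef
    return np.argmin(((Z[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)

# Toy data (assumption): two well-separated Gaussian blobs of 30 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (30, 2)), rng.normal(5.0, 0.5, (30, 2))])

E = spectral_embedding(X, k=2)
labels, centers = kmeans(E, k=2)
coef = fit_linear_map(X, E)

# Incremental step: new points are clustered without another eigendecomposition.
new_labels = assign(np.array([[0.1, -0.2], [5.2, 4.9]]), coef, centers)
```

A linear map is enough only because the toy clusters are linearly separable. For nonlinearly separable data a more expressive model would be needed in its place.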
Related papers
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
- A3S: A General Active Clustering Method with Pairwise Constraints [66.74627463101837]
A3S features strategic active clustering adjustment on the initial cluster result, which is obtained by an adaptive clustering algorithm.
In extensive experiments across diverse real-world datasets, A3S achieves desired results with significantly fewer human queries.
arXiv Detail & Related papers (2024-07-14T13:37:03Z)
- GCC: Generative Calibration Clustering [55.44944397168619]
We propose a novel Generative Calibration Clustering (GCC) method to incorporate feature learning and augmentation into the clustering procedure.
First, we develop a discriminative feature alignment mechanism to discover the intrinsic relationship across real and generated samples.
Second, we design self-supervised metric learning to generate more reliable cluster assignments.
arXiv Detail & Related papers (2024-04-14T01:51:11Z)
- Spectral Clustering in Convex and Constrained Settings [0.0]
We introduce a novel framework for seamlessly integrating pairwise constraints into semidefinite spectral clustering.
Our methodology systematically extends the capabilities of semidefinite spectral clustering to capture complex data structures.
arXiv Detail & Related papers (2024-04-03T18:50:14Z)
- Detection and Evaluation of Clusters within Sequential Data [58.720142291102135]
Clustering algorithms for Block Markov Chains possess theoretical optimality guarantees.
In particular, our sequential data is derived from human DNA, written text, animal movement data and financial markets.
It is found that the Block Markov Chain model assumption can indeed produce meaningful insights in exploratory data analyses.
arXiv Detail & Related papers (2022-10-04T15:22:39Z)
- Augmented Data as an Auxiliary Plug-in Towards Categorization of
Crowdsourced Heritage Data [2.609784101826762]
We propose a strategy to mitigate the problem of inefficient clustering performance by introducing data augmentation as an auxiliary plug-in.
We train a variant of Convolutional Autoencoder (CAE) with augmented data to construct the initial feature space as a novel model for deep clustering.
arXiv Detail & Related papers (2021-07-08T14:09:39Z)
- Very Compact Clusters with Structural Regularization via Similarity and
Connectivity [3.779514860341336]
We propose an end-to-end deep clustering algorithm, i.e., Very Compact Clusters (VCC), for general datasets.
Our proposed approach achieves better clustering performance over most of the state-of-the-art clustering methods.
arXiv Detail & Related papers (2021-06-09T23:22:03Z)
- Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z)
- Scalable Spectral Clustering with Nystrom Approximation: Practical and
Theoretical Aspects [1.6752182911522515]
This work presents a principled spectral clustering algorithm that exploits spectral properties of the similarity matrix associated with sampled points to regulate accuracy-efficiency trade-offs.
The overarching goal of this work is to provide an improved baseline for future research directions to accelerate spectral clustering.
arXiv Detail & Related papers (2020-06-25T15:10:56Z)
- New advances in enumerative biclustering algorithms with online
partitioning [80.22629846165306]
This paper further extends RIn-Close_CVC, a biclustering algorithm capable of performing an efficient, complete, correct and non-redundant enumeration of maximal biclusters with constant values on columns in numerical datasets.
The improved algorithm, called RIn-Close_CVC3, keeps those attractive properties of RIn-Close_CVC and is characterized by a drastic reduction in memory usage and a consistent gain in runtime.
arXiv Detail & Related papers (2020-03-07T14:54:26Z)
- Autoencoder-based time series clustering with energy applications [0.0]
Time series clustering is a challenging task due to the specific nature of the data.
In this paper we investigate the combination of a convolutional autoencoder and a k-medoids algorithm to perform time series clustering.
arXiv Detail & Related papers (2020-02-10T10:04:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.