k-Factorization Subspace Clustering
- URL: http://arxiv.org/abs/2012.04345v1
- Date: Tue, 8 Dec 2020 10:34:21 GMT
- Title: k-Factorization Subspace Clustering
- Authors: Jicong Fan
- Abstract summary: Subspace clustering aims to cluster data lying in a union of low-dimensional subspaces.
This paper presents a method called k-Factorization Subspace Clustering (k-FSC) for large-scale subspace clustering.
- Score: 12.18340575383456
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Subspace clustering (SC) aims to cluster data lying in a union of
low-dimensional subspaces. Usually, SC learns an affinity matrix and then
performs spectral clustering. Both steps suffer from high time and space
complexity, which leads to difficulty in clustering large datasets. This paper
presents a method called k-Factorization Subspace Clustering (k-FSC) for
large-scale subspace clustering. k-FSC directly factorizes the data into k
groups by pursuing structured sparsity in a matrix factorization model. Thus,
k-FSC avoids learning an affinity matrix and performing eigenvalue
decomposition, and hence has low time and space complexity on large datasets.
An efficient algorithm is proposed to solve the optimization of k-FSC. In
addition, k-FSC is able to handle noise, outliers, and missing data, and it is
applicable to arbitrarily large datasets and streaming data. Extensive
experiments show that k-FSC outperforms state-of-the-art subspace clustering
methods.
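For intuition, the sketch below illustrates the factorization view described above with a toy NumPy routine: each of the k groups is modelled by a small basis, and samples are assigned to the group whose subspace reconstructs them best, so no affinity matrix or spectral step is needed. The routine, its name, and its parameters are illustrative assumptions, not the paper's actual solver, which enforces structured sparsity and also handles noise, outliers, and missing data.

```python
import numpy as np

def k_factorization_sketch(X, k, dim, n_iter=30, seed=0):
    """Toy alternating scheme in the spirit of k-FSC (illustrative assumption,
    not the paper's algorithm): factorize the data into k groups, each modelled
    by a low-dimensional basis, without forming an n-by-n affinity matrix.
    X: (features, samples) matrix; dim: assumed subspace dimension."""
    rng = np.random.default_rng(seed)
    n_samples = X.shape[1]
    labels = rng.integers(k, size=n_samples)          # random initial grouping
    for _ in range(n_iter):
        bases = []
        for j in range(k):
            Xj = X[:, labels == j]
            if Xj.shape[1] < dim:                     # re-seed a starved group
                Xj = X[:, rng.choice(n_samples, size=dim, replace=False)]
            U, _, _ = np.linalg.svd(Xj, full_matrices=False)
            bases.append(U[:, :dim])                  # basis of group j
        # projection residual of every sample onto every group's subspace
        residuals = np.stack(
            [np.linalg.norm(X - U @ (U.T @ X), axis=0) for U in bases])
        labels = residuals.argmin(axis=0)             # reassign to best group
    return labels

# Toy usage: 3 random 2-D subspaces of R^10 with 300 points each.
rng = np.random.default_rng(1)
X = np.hstack([rng.standard_normal((10, 2)) @ rng.standard_normal((2, 300))
               for _ in range(3)])
labels = k_factorization_sketch(X, k=3, dim=2)
```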
Related papers
- Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks.
We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z) - Datacube segmentation via Deep Spectral Clustering [76.48544221010424]
Extended Vision techniques are often challenging to interpret.
The huge dimensionality of datacube spectra makes their statistical interpretation a complex task.
In this paper, we explore the possibility of applying unsupervised clustering methods in encoded space.
A statistical dimensional reduction is performed by an ad hoc trained (Variational) AutoEncoder, while the clustering process is performed by a (learnable) iterative K-Means clustering algorithm.
arXiv Detail & Related papers (2024-01-31T09:31:28Z) - Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation.
Specifically, we construct a distance matrix between data points using a Butterworth filter.
To exploit the complementary information embedded in different views, we leverage tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z) - Enhancing cluster analysis via topological manifold learning [0.3823356975862006]
We show that inferring the topological structure of a dataset before clustering can considerably enhance cluster detection.
We combine the manifold learning method UMAP, used to infer the topological structure, with the density-based clustering method DBSCAN (a minimal version of this pipeline is sketched after the list below).
arXiv Detail & Related papers (2022-07-01T15:53:39Z) - Fast and explainable clustering based on sorting [0.0]
We introduce a fast and explainable clustering method called CLASSIX.
The algorithm is controlled by two scalar parameters, namely a distance parameter for the aggregation and another parameter controlling the minimal cluster size.
Our experiments demonstrate that CLASSIX competes with state-of-the-art clustering algorithms.
arXiv Detail & Related papers (2022-02-03T08:24:21Z) - Very Compact Clusters with Structural Regularization via Similarity and
Connectivity [3.779514860341336]
We propose an end-to-end deep clustering algorithm, Very Compact Clusters (VCC), for general datasets.
Our proposed approach achieves better clustering performance than most state-of-the-art clustering methods.
arXiv Detail & Related papers (2021-06-09T23:22:03Z) - Overcomplete Deep Subspace Clustering Networks [80.16644725886968]
Experimental results on four benchmark datasets show the effectiveness of the proposed method over DSC and other clustering methods in terms of clustering error.
Our method is also less dependent than DSC on where pre-training should be stopped to obtain the best performance, and it is more robust to noise.
arXiv Detail & Related papers (2020-11-16T22:07:18Z) - Scalable Hierarchical Agglomerative Clustering [65.66407726145619]
Existing scalable hierarchical clustering methods sacrifice quality for speed.
We present a scalable, agglomerative method for hierarchical clustering that does not sacrifice quality and scales to billions of data points.
arXiv Detail & Related papers (2020-10-22T15:58:35Z) - Graph Convolutional Subspace Clustering: A Robust Subspace Clustering
Framework for Hyperspectral Image [6.332208511335129]
We present a novel subspace clustering framework called Graph Convolutional Subspace Clustering (GCSC) for robust HSI clustering.
Specifically, the framework recasts the self-expressiveness property of the data into the non-Euclidean domain.
We show that traditional subspace clustering models are special forms of our framework for Euclidean data.
arXiv Detail & Related papers (2020-04-22T10:09:19Z) - Learnable Subspace Clustering [76.2352740039615]
We develop a learnable subspace clustering paradigm to efficiently solve the large-scale subspace clustering problem.
The key idea is to learn a parametric function to partition the high-dimensional subspaces into their underlying low-dimensional subspaces.
To the best of our knowledge, this paper is the first among subspace clustering methods to efficiently cluster millions of data points.
arXiv Detail & Related papers (2020-04-09T12:53:28Z) - Fast Kernel k-means Clustering Using Incomplete Cholesky Factorization [11.631064399465089]
Kernel-based clustering algorithms can identify and capture the non-linear structure in datasets,
and can therefore achieve better performance than linear clustering.
However, computing and storing the entire kernel matrix requires so much memory that it is difficult for kernel-based clustering to deal with large-scale datasets (a low-rank approximation that sidesteps this is sketched after the list below).
arXiv Detail & Related papers (2020-02-07T15:32:14Z)
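As referenced in the topological manifold learning entry above, the UMAP-plus-DBSCAN pipeline can be sketched in a few lines. The dataset, parameter values, and use of the umap-learn package are illustrative assumptions rather than that paper's exact setup.

```python
import umap                                # pip install umap-learn (assumed available)
from sklearn.cluster import DBSCAN
from sklearn.datasets import load_digits

X = load_digits().data                     # toy stand-in for a real dataset

# Step 1: infer the manifold / topological structure with UMAP.
embedding = umap.UMAP(n_neighbors=15, min_dist=0.0, n_components=5,
                      random_state=42).fit_transform(X)

# Step 2: run density-based clustering on the low-dimensional embedding.
labels = DBSCAN(eps=0.5, min_samples=10).fit_predict(embedding)
```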
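The memory bottleneck noted in the fast kernel k-means entry (storing the full kernel matrix) is commonly sidestepped with a low-rank approximation. The sketch below uses scikit-learn's Nyström approximation rather than the incomplete Cholesky factorization used in that paper, so it only illustrates the general idea; the data and parameters are made up.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.kernel_approximation import Nystroem

rng = np.random.default_rng(0)
X = rng.standard_normal((10000, 20))       # hypothetical large dataset

# Rank-200 feature map approximating the RBF kernel: the full 10000 x 10000
# kernel matrix is never formed or stored.
Z = Nystroem(kernel="rbf", gamma=0.1, n_components=200,
             random_state=0).fit_transform(X)

# Plain k-means on the approximate feature map behaves like kernel k-means.
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(Z)
```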
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences.